Re: [HACKERS] hardware donation

2013-07-19 Thread Benedikt Grundmann
The server is already turned off and in our NYC office (I'm based in the
London one), but I'm pretty sure it's an LSI MegaRAID SAS 9285.


On Thu, Jul 18, 2013 at 11:58 PM, Greg Smith g...@2ndquadrant.com wrote:

 On 7/10/13 12:53 PM, Benedikt Grundmann wrote:

 The server will probably be most interesting for the disks in it.  That
 is where we spend the largest amount of time optimizing (for sequential
 scan speed in particular):
 22x600GB disks in a Raid6+0 (Raid0 of 2x 10disk raid 6 arrays) + 2 spare
 disks. Overall size 8.7 TB in that configuration.


 What is the RAID controller used in the server?  That doesn't impact the
 donation, I'm just trying to fit this one into my goals for finding useful
 community performance testing equipment.

 There are a good number of systems floating around the community with HP
 controllers--I even have one myself now--but we could use more LSI Logic
 and Adaptec based systems.

 --
 Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
 PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com



Re: [HACKERS] AGG_PLAIN thinks sorts are free

2013-07-19 Thread Ashutosh Bapat
On Fri, Jul 19, 2013 at 8:34 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Jeff Janes jeff.ja...@gmail.com writes:
  AGG_PLAIN sometimes does sorts, but it thinks they are free.  Also, under
  explain analyze it does not explicitly report whether the sort was
 external
  or not, nor report the disk or memory usage, the way other sorts do.  I
  don't know if those two things are related or not.

 DISTINCT (and also ORDER BY) properties of aggregates are implemented
 at runtime; the planner doesn't really do anything about them, except
 suppress the choice it might otherwise make of using hashed aggregation.
 Since the behavior is entirely local to the Agg plan node, it's also
 not visible to the EXPLAIN ANALYZE machinery.

 Arguably we should have the planner add on some cost factor for such
 aggregates, but that would have no effect whatever on the current level
 of plan, and could only be useful if this was a subquery whose cost
 would affect choices in an outer query level.  Which is a case that's
 pretty few and far between AFAIK (do you have a real-world example where
 it matters?).  So it's something that hasn't gotten to the top of
 anybody's to-do list.

 An arguably more useful thing to do would be to integrate this behavior
 into the planner more completely, so that (for instance) if only one
 aggregate had ORDER BY then we would make the underlying query produce
 that order instead of implementing a sort locally in the Agg node.


A slight modification would be to do this when *all* aggregates have the same
ORDER BY clause. I think that would be easy, given that we already influence
the sortedness of the underlying query result in the planner.


 That hasn't risen to the top of the to-do list either, as yet.

 regards, tom lane


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers




-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses

2013-07-19 Thread KONDO Mitsumasa

(2013/07/19 0:41), Greg Smith wrote:

On 7/18/13 11:04 AM, Robert Haas wrote:

On a system where fsync is sometimes very very slow, that
might result in the checkpoint overrunning its time budget - but SO
WHAT?


Checkpoints provide a boundary on recovery time.  That is their only purpose.
You can always do better by postponing them, but you've now changed the
agreement with the user about how long recovery might take.
These days, a user who considers system availability important runs a
synchronous replication cluster. And, as Robert says, a user who cannot build
a cluster will simply not enable this behavior via the GUC.


When the system becomes IO-busy during fsync(), my patch does not pile
further IO load onto fsync(). This is actually the same approach as the OS
writeback machinery. I read the kernel source in fs/fs-writeback.c from
linux-2.6.32-358.0.1.el6, the latest RHEL 6.4 kernel; wb_writeback() throttles
disk IO in the OS writeback path. Please see the source below: if the OS
thinks IO is busy, it stops issuing more writes and bails out.


fs/fs-writeback.c @wb_writeback()
 623 /*
 624  * For background writeout, stop when we are below the
 625  * background dirty threshold
 626  */
 627 if (work->for_background && !over_bground_thresh())
 628 break;
 629
 630 wbc.more_io = 0;
 631 wbc.nr_to_write = MAX_WRITEBACK_PAGES;
 632 wbc.pages_skipped = 0;
 633
 634 trace_wbc_writeback_start(&wbc, wb->bdi);
 635 if (work->sb)
 636 __writeback_inodes_sb(work->sb, wb, &wbc);
 637 else
 638 writeback_inodes_wb(wb, &wbc);
 639 trace_wbc_writeback_written(&wbc, wb->bdi);
 640 work->nr_pages -= MAX_WRITEBACK_PAGES - wbc.nr_to_write;
 641 wrote += MAX_WRITEBACK_PAGES - wbc.nr_to_write;
 642
 643 /*
 644  * If we consumed everything, see if we have more
 645  */
 646 if (wbc.nr_to_write <= 0)
 647 continue;
 648 /*
 649  * Didn't write everything and we don't have more IO, bail
 650  */
 651 if (!wbc.more_io)
 652 break;
 653 /*
 654  * Did we write something? Try for more
 655  */
 656 if (wbc.nr_to_write < MAX_WRITEBACK_PAGES)
 657 continue;
 658 /*
 659  * Nothing written. Wait for some inode to
 660  * become available for writeback. Otherwise
 661  * we'll just busyloop.
 662  */
 663 spin_lock(&inode_lock);
 664 if (!list_empty(&wb->b_more_io)) {
 665 inode = list_entry(wb->b_more_io.prev,
 666 struct inode, i_list);
 667 trace_wbc_writeback_wait(&wbc, wb->bdi);
 668 inode_wait_for_writeback(inode);
 669 }
 670 spin_unlock(&inode_lock);
 671 }
 672
 673 return wrote;

Please look especially at lines 631, 651, and 656. MAX_WRITEBACK_PAGES is
1024 pages (1024 * 4096 bytes). The OS writeback scheduler does not write more
than MAX_WRITEBACK_PAGES per pass, because writing more than that at once
would make the system IO-busy; and if it cannot write anything at all, the OS
decides IO performance needs to recover and bails out. This is the same logic
as my patch.


In addition, you have said that setting a large checkpoint_timeout or
checkpoint_completion_target improves performance, but is that true in all
cases? Since dirty buffers that have waited 30 seconds or more are written out
at 5-second intervals, the many pauses between writes risk turning into
inefficient random writes. As a worsening case, when a 200 ms sleep is
inserted each time, only 25 pages (200 KB) can be written in 5 seconds, which
I think is very inefficient. When the checkpoint completion target is
enlarged, performance may actually fall in some cases; I think this is because
the final fsync becomes heavy as a result of writing out too slowly.


I would like you to give me an itemized list of tests that would count as
proof of my patch, because the DBT-2 benchmark takes a lot of time, about
3 - 4 hours per setting. Of course, I think it is important to obtain your
consent.


Best regards,
--
Mitsumasa KONDO
NTT Open Source Software Center


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Using ini file to setup replication

2013-07-19 Thread Samrat Revagade
Hi,

I was going through the archives and found a discussion about using an ini
file to set up
replication.(http://www.postgresql.org/message-id/4c9876b4.9020...@enterprisedb.com).
I think if we work on this proposal and separate the replication setup from
the postgresql.conf file, we can provide more granularity when setting up the
replication parameters. For example, we could set a different value of
wal_sender_timeout for each standby server.

So I think it is a good idea to separate the replication settings out of
postgresql.conf and put them into an ini file. Once that is agreed, we can
extend the ini file to support future replication developments,
*for example: failback-safe standby.*

Below I have explained how to use an ini file for failback-safe standby setup.
While discussing the fail-back safe standby feature
(CAF8Q-Gy7xa60HwXc0MKajjkWFEbFDWTG=ggyu1kmt+s2xcq...@mail.gmail.com)
we decided to use an *ini* file to configure the fail-back safe standby;
here is the link for that:
cad21aocy2_bqvpzjey7s77amnccbxfj+1gphggdbulklav0...@mail.gmail.com
But there has been no strong positive or negative feedback on the concept of
introducing the ini file, so please have a look at it and give feedback.


In today's scenario, replication gives us only two ways of configuring a
standby server:
1. Asynchronous standby
2. Synchronous standby

With the failback-safe standby patch we have more granularity in setting up
standby servers:
1. Default synchronous standby.(AAA)
2. Default asynchronous standby. (BBB)
3. Synchronous standby and also make same standby as a failback safe
standby.(CCC)
4. Asynchronous standby and also make same standby as a failback safe
standby.(DDD)

For failback-safe standby we are thinking of adding more granular replication
settings, for example:
1. The user can set a separate value of wal_sender_timeout for each server.
2. The user can set a separate value of synchronous_transfer for each server.

Consider the scenario where the user wants to set up the four standby servers
described above; the settings for them would be:
 - ini file -
[Server]
standby_name = 'AAA'
synchronous_transfer = commit
wal_sender_timeout = 60

[Server]
standby_name = 'BBB'
synchronous_transfer = none
wal_sender_timeout = 40

[Server]
standby_name = 'CCC'
synchronous_transfer = all
wal_sender_timeout = 50

[Server]
standby_name = 'DDD'
synchronous_transfer = data_flush
wal_sender_timeout = 50

Setting up such a scenario through postgresql.conf is impossible, and trying
to do so would add a lot of complexity to the code, so an ini file would be a
very good choice in this case.
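As a sketch of what reading such a file could look like (hypothetical; note that a stock ini parser such as Python's configparser rejects the repeated [Server] sections, so this proposal implies a small custom reader along these lines):

```python
# Hypothetical reader for the proposed replication ini format.
# Repeated [Server] sections are collected into a list, which is
# exactly what generic ini parsers typically refuse to do.
def parse_replication_ini(text):
    servers, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith('#'):
            continue
        if line.startswith('[') and line.endswith(']'):
            current = {}
            servers.append(current)
        else:
            key, _, value = line.partition('=')
            current[key.strip()] = value.strip().strip("'")
    return servers

conf = """
[Server]
standby_name = 'AAA'
synchronous_transfer = commit
wal_sender_timeout = 60

[Server]
standby_name = 'BBB'
synchronous_transfer = none
wal_sender_timeout = 40
"""
servers = parse_replication_ini(conf)
print([s['standby_name'] for s in servers])   # → ['AAA', 'BBB']
```

The per-server dictionaries make the "different wal_sender_timeout per standby" case trivial to express, which is the granularity argument above.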

Thank you ,
Samrat


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Using ini file to setup replication

2013-07-19 Thread Samrat Revagade
Please find updated hyperlinks:

 Below I have explained how to use an ini file for failback-safe standby setup:
 While discussing feature of fail-back safe standby
 (CAF8Q-Gy7xa60HwXc0MKajjkWFEbFDWTG=ggyu1kmt+s2xcq...@mail.gmail.com)

http://www.postgresql.org/message-id/CAF8Q-Gy7xa60HwXc0MKajjkWFEbFDWTG=ggyu1kmt+s2xcq...@mail.gmail.com

 We have decided to use the *ini* file  to configure the fail-back safe standby
 here is the link for that:
 cad21aocy2_bqvpzjey7s77amnccbxfj+1gphggdbulklav0...@mail.gmail.com

http://www.postgresql.org/message-id/cad21aocy2_bqvpzjey7s77amnccbxfj+1gphggdbulklav0...@mail.gmail.com

Regards,
Samrat Revgade


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Using ini file to setup replication

2013-07-19 Thread Andres Freund
On 2013-07-19 14:54:16 +0530, Samrat Revagade wrote:
 I was going through the archives and found a discussion about using an ini
 file to set up
 replication.(http://www.postgresql.org/message-id/4c9876b4.9020...@enterprisedb.com).
 I think if we work on this proposal and separate the replication setup from
 the postgresql.conf file, we can provide more granularity when setting up
 the replication parameters. For example, we could set a different value of
 wal_sender_timeout for each standby server.
 
 So I think it is a good idea to separate the replication settings out of
 postgresql.conf and put them into an ini file. Once that is agreed, we can
 extend the ini file to support future replication developments,
 *for example: failback-safe standby.*

I think that introducing another configuration format is a pretty bad
idea. While something new might not turn out to be as bad, we've seen
how annoying a separate configuration format turned out to be for
recovery.conf.

I'd much rather go ahead and remove the nesting limit of GUCs. That should
give us just about all that can be achieved with an ini file, with a one-line
change. Someday we might want to extend our format to add ini-like sections,
but I think that *definitely* should be a separate proposal.

I've even proposed that in the past in
20130225211533.gd3...@awork2.anarazel.de . I plan to propose an updated
version of that patch (not allowing numeric 2nd level ids) for the next
CF.

So you can just do stuff like:

server.foo.grand_unified_config = value.
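The flat dotted-GUC style maps naturally onto per-server groups. A hypothetical sketch of that grouping (the setting names come from the earlier ini example in this thread, not from any real GUC):

```python
# Hypothetical sketch: group dotted GUC names such as
# "server.AAA.wal_sender_timeout = 60" by their middle component,
# giving the same per-server structure an ini file would.
def group_dotted_gucs(lines):
    servers = {}
    for line in lines:
        name, _, value = (p.strip() for p in line.partition('='))
        prefix, server, setting = name.split('.', 2)
        if prefix == 'server':
            servers.setdefault(server, {})[setting] = value
    return servers

settings = [
    "server.AAA.synchronous_transfer = commit",
    "server.AAA.wal_sender_timeout = 60",
    "server.BBB.synchronous_transfer = none",
]
print(group_dotted_gucs(settings)['AAA']['wal_sender_timeout'])  # → 60
```

This is why lifting the nesting limit covers the ini-file use case: the grouping lives in the name, not in the file format.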

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH CHECK OPTION for auto-updatable views

2013-07-19 Thread Dean Rasheed
On 18 July 2013 22:27, Stephen Frost sfr...@snowman.net wrote:
 Dean,


 * Stephen Frost (sfr...@snowman.net) wrote:
 Thanks!  This is really looking quite good, but it's a bit late and I'm
 going on vacation tomorrow, so I didn't quite want to commit it yet. :)

 Apologies on this taking a bit longer than I expected, but it's been
 committed and pushed now.  Please take a look and let me know of any
 issues you see with the changes that I made.


Excellent. Thank you! The changes look good to me.

Cheers,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LATERAL quals revisited

2013-07-19 Thread Ashutosh Bapat
I have couple of questions.

On Wed, Jun 26, 2013 at 1:30 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 I've been studying the bug reported at

 http://www.postgresql.org/message-id/20130617235236.GA1636@jeremyevans.local
 that the planner can do the wrong thing with queries like

 SELECT * FROM
   i LEFT JOIN LATERAL (SELECT * FROM j WHERE i.n = j.n) j ON true;

 I think the fundamental problem is that, because the i.n = j.n clause
 appears syntactically in WHERE, the planner is treating it as if it were
 an inner-join clause; but really it ought to be considered a clause of
 the upper LEFT JOIN.  That is, semantically this query ought to be
 equivalent to

 SELECT * FROM
   i LEFT JOIN LATERAL (SELECT * FROM j) j ON i.n = j.n;

 However, because distribute_qual_to_rels doesn't see the clause as being
 attached to the outer join, it's not marked with the correct properties
 and ends up getting evaluated in the wrong place (as a filter clause
 not a join filter clause).  The bug is masked in the test cases we've
 used so far because those cases are designed to let the clause get
 pushed down into the scan of the inner relation --- but if it doesn't
 get pushed down, it's evaluated the wrong way.
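The distinction between evaluating i.n = j.n as the outer join's join filter versus as a plain filter above the join can be shown with a toy evaluation (a hypothetical Python sketch, not planner code):

```python
# Toy illustration (hypothetical) of join filter vs. filter placement
# for: i LEFT JOIN LATERAL (SELECT * FROM j WHERE i.n = j.n) j ON true
i = [{'n': 1}, {'n': 2}]
j = [{'n': 1}]

def left_join_lateral(outer, inner, qual):
    out = []
    for o in outer:
        matches = [dict(o, jn=r['n']) for r in inner if qual(o, r)]
        out.extend(matches or [dict(o, jn=None)])  # null-extend if no match
    return out

qual = lambda o, r: o['n'] == r['n']

# Correct semantics: qual acts as the LEFT JOIN's join filter, so the
# i.n = 2 row survives, null-extended.
print(left_join_lateral(i, j, qual))
# → [{'n': 1, 'jn': 1}, {'n': 2, 'jn': None}]

# Wrong placement: applying the qual as a filter *above* the join
# discards the null-extended row, silently turning the LEFT JOIN
# into an inner join.
joined = left_join_lateral(i, j, lambda o, r: True)
print([r for r in joined if r['jn'] == r['n']])
# → [{'n': 1, 'jn': 1}]
```

That dropped null-extended row is exactly the wrong answer described in the bug report.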

 After some contemplation, I think that the most practical way to fix
 this is for deconstruct_recurse and distribute_qual_to_rels to
 effectively move such a qual to the place where it logically belongs;
 that is, rather than processing it when we look at the lower WHERE
 clause, set it aside for a moment and then add it back when looking at
 the ON clause of the appropriate outer join.  This should be reasonably
 easy to do by keeping a list of postponed lateral clauses while we're
 scanning the join tree.

 For there to *be* a unique appropriate outer join, we need to require
 that a LATERAL-using qual clause that's under an outer join contain
 lateral references only to the outer side of the nearest enclosing outer
 join.  There's no such restriction in the spec of course, but we can
 make it so by refusing to flatten a sub-select if pulling it up would
 result in having a clause in the outer query that violates this rule.
 There's already some code in prepjointree.c (around line 1300) that
 attempts to enforce this, though now that I look at it again I'm not
 sure it's covering all the bases.  We may need to extend that check.


Why do we need this restriction? Wouldn't a place in the join tree where all
the participating relations are present (specifically, a join qual at such a
place) serve as a place where the clause can be applied? E.g. in the query

select * from tab1 left join tab2 t2 using (val) left join lateral (select
val from tab2 where val2 = tab1.val * t2.val) t3 using (val);

can't we apply the qual val2 = tab1.val * t2.val (as a join qual) at the
place where we compute the join between tab1, t2 and t3?


 I'm inclined to process all LATERAL-using qual clauses this way, ie
 postpone them till we recurse back up to a place where they can
 logically be evaluated.  That won't make any real difference when no
 outer joins are present, but it will eliminate the ugliness that right
 now distribute_qual_to_rels is prevented from sanity-checking the scope
 of the references in a qual when LATERAL is present.  If we do it like
 this, we can resurrect full enforcement of that sanity check, and then
 throw an error if any postponed quals are left over when we're done
 recursing.


A parameterized nested loop join would always be able to evaluate a LATERAL
query. Instead of throwing an error, why can't we choose that as the default
strategy whenever we fail to flatten the subquery?

Could we put the clause with lateral references at its appropriate place
while flattening the subquery? IMO, that would be cleaner and less work than
first pulling the clause out and then putting it back again. Right now we do
not have that capability in pull_up_subqueries(), but given its recursive
structure, it might be easier to do it there.


 Thoughts, better ideas?

 regards, tom lane


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers




-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] [v9.4] row level security

2013-07-19 Thread Dean Rasheed
On 19 July 2013 04:09, Greg Smith g...@2ndquadrant.com wrote:
 On 7/18/13 11:03 PM, Stephen Frost wrote:

 Wasn't there a wiki page about this
 feature which might also help?  Bigger question- is it correct for the
 latest version of the patch?


 https://wiki.postgresql.org/wiki/RLS has accumulated a lot of older debris
 that may or may not be useful here.

 This useful piece was just presented at PGCon:
 http://www.pgcon.org/2013/schedule/attachments/273_PGcon2013-kaigai-row-level-security.pdf
 That is very up to date intro to the big picture issues.


Hi,

I took a look at this patch too. I didn't read all the code in detail,
but there was one area I wondered about --- is it still necessary to
add the new field rowsec_relid to RangeTblEntry?

As far as I can see, it is only used in the new function getrelid()
which replaces the old macro of the same name, so that it can work if
it is passed the index of the sourceRelation subquery RTE instead of
the resultRelation. This seems to be encoding a specific assumption
that a subquery RTE is always going to come from row-level security
expansion.

Is it the case that this can only happen from expand_targetlist()? If so, why
not pass both source_relation and result_relation to expand_targetlist(),
which I think would make that code neater. As it stands, what
expand_targetlist() sees as result_relation is actually source_relation,
which getrelid() then handles specially. Logically I think
expand_targetlist() should pass the actual result_relation to getrelid(),
opening the target table, and then pass source_relation to makeVar() when
building new TLEs.

With that change to expand_targetlist(), the change to getrelid() may
be unnecessary, and then also the new rowsec_relid field in
RangeTblEntry may not be needed.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Allow background workers to be started dynamically.

2013-07-19 Thread Robert Haas
On Fri, Jul 19, 2013 at 7:24 AM, Andres Freund and...@2ndquadrant.com wrote:
 The changes here make it impossible to write a bgworker which properly
 works in 9.3 and 9.4. Was that intended? If so, the commit message
 should mention the compatibility break...

Yeah, sorry, I probably should have mentioned that.  The structure
needs to be fixed size for us to store it in shared memory.

 If it was intended I propose changing the signature for 9.3 as
 well. There's just no point in releasing 9.3 when we already know which
 trivial but breaking change will be required for 9.4

I think that would be a good idea.  And I'd also propose getting rid
of bgw_sighup and bgw_sigterm in both branches, while we're at it.
AFAICT, they don't add any functionality, and they're basically
unusable for dynamically started background workers.  Probably better
not to get people used to using them.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Eliminating PD_ALL_VISIBLE, take 2

2013-07-19 Thread Robert Haas
On Mon, Jul 15, 2013 at 1:41 PM, Jeff Davis pg...@j-davis.com wrote:
 I'm of the opinion that we ought to extract the parts of the patch
 that hold the VM pin for longer, review those separately, and if
 they're good and desirable, apply them.

 I'm confused. My patch holds a VM page pinned for those cases where
 PD_ALL_VISIBLE is currently used -- scans or insert/update/delete. If we
 have PD_ALL_VISIBLE, there's no point in the cache, right?

Hmm.  You might be right.  I thought there might be some benefit
there, but I guess we're not going to clear the bit more than once, so
maybe not.

 To check visibility from an index scan, you either need to pin a heap
 page or a VM page. Why would checking the heap page be cheaper? Is it
 just other code in the VM-testing path that's slower? Or is there
 concurrency involved in your measurements which may indicate contention
 over VM pages?

Well, I have seen some data that hints at contention problems with VM
pages, but it's not conclusive.  But the real issue is that, if the
index-only scan finds the VM page not set, it still has to check the
heap page.  Thus, it ends up accessing two pages rather than one, and
it turns out that's more expensive.  It's unfortunately hard to
predict whether the cost of checking VM first will pay off.  It's a
big win if we learn that we don't need to look at the heap page
(because we don't need to read, lock, pin, or examine it, in that
case) but it's a loss if we do (because checking the VM isn't free).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Allow background workers to be started dynamically.

2013-07-19 Thread Magnus Hagander
On Fri, Jul 19, 2013 at 1:23 PM, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Jul 19, 2013 at 7:24 AM, Andres Freund and...@2ndquadrant.com wrote:
 The changes here make it impossible to write a bgworker which properly
 works in 9.3 and 9.4. Was that intended? If so, the commit message
 should mention the compatibility break...

 Yeah, sorry, I probably should have mentioned that.  The structure
 needs to be fixed size for us to store it in shared memory.

 If it was intended I propose changing the signature for 9.3 as
 well. There's just no point in releasing 9.3 when we already know which
 trivial but breaking change will be required for 9.4

 I think that would be a good idea.  And I'd also propose getting rid
 of bgw_sighup and bgw_sigterm in both branches, while we're at it.
 AFAICT, they don't add any functionality, and they're basically
 unusable for dynamically started background workers.  Probably better
  not to get people used to using them.

+1. Much better to take that pain now, before we have made a production release.


--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [COMMITTERS] pgsql: Allow background workers to be started dynamically.

2013-07-19 Thread Andres Freund
Hi,

On 2013-07-19 08:23:25 -0400, Robert Haas wrote:
 On Fri, Jul 19, 2013 at 7:24 AM, Andres Freund and...@2ndquadrant.com wrote:
  The changes here make it impossible to write a bgworker which properly
  works in 9.3 and 9.4. Was that intended? If so, the commit message
  should mention the compatibility break...

 Yeah, sorry, I probably should have mentioned that.  The structure
 needs to be fixed size for us to store it in shared memory.

  If it was intended I propose changing the signature for 9.3 as
  well. There's just no point in releasing 9.3 when we already know which
  trivial but breaking change will be required for 9.4

 I think that would be a good idea.

 And I'd also propose getting rid
 of bgw_sighup and bgw_sigterm in both branches, while we're at it.
 AFAICT, they don't add any functionality, and they're basically
 unusable for dynamically started background workers.  Probably better
 not to get people used to using them.

I don't have a problem with getting rid of those, it's easy enough to
register them inside the worker and it's safe since we start with
blocked signals. I seem to remember some discussion about why they were
added but I can't find a reference anymore. Alvaro, do you remember?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] getting rid of SnapshotNow

2013-07-19 Thread Robert Haas
On Fri, Jul 19, 2013 at 1:27 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Noah Misch n...@leadboat.com writes:
 To me, the major advantage of removing SnapshotNow is to force all
 third-party code to reevaluate.  But that could be just as well
 achieved by renaming it to, say, SnapshotImmediate.  If there are
 borderline-legitimate SnapshotNow uses in our code base, I'd lean
 toward a rename instead.  Even if we decide to remove every core use,
 third-party code might legitimately reach a different conclusion on
 similar borderline cases.

 Meh.  If there is third-party code with a legitimate need for
 SnapshotNow, all we'll have done is to create an annoying version
 dependency for them.  So if we think that's actually a likely scenario,
 we shouldn't rename it.  But the entire point of this change IMO is that
 we *don't* think there is a legitimate use-case for SnapshotNow.

 Indeed, I'm thinking I don't believe in SnapshotSelf anymore either.
 It's got all the same consistency issues as SnapshotNow.  In fact, it
 has *more* issues, because it's also vulnerable to weirdnesses caused by
 inconsistent ordering of tuple updates among multiple tuples updated by
 the same command.

 Why not tell people to use SnapshotDirty if they need a
 not-guaranteed-consistent result?  At least then it's pretty obvious
 that you're getting some randomness in with your news.

You know, I didn't really consider that before, but I kind of like it.
 I think that would be entirely suitable (and perhaps better) for
pgstattuple and get_actual_variable_range().

On further reflection, I think perhaps pgrowlocks should just register
a fresh MVCC snapshot and use that.  Using SnapshotDirty would return
TIDs of unseen tuples, which does not seem to be what is wanted there.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ODBC] getting rid of SnapshotNow

2013-07-19 Thread Andres Freund
On 2013-07-19 08:57:01 +0900, Inoue, Hiroshi wrote:
 I had the idea they were used for a client-side implementation of WHERE
 CURRENT OF.  Perhaps that's dead code and could be removed entirely?
 
 It's been reported that ODBC still uses them.
 
 Though PostgreSQL's TID is similar to Oracle's ROWID, it is transient
 and unfortunately changes after update operations. I implemented
 the currtid_xx functions to compensate for the difference. For example
 
   currtid(relname, original tid)
 
 (hopefully) returns the current tid of the original row when it is
 updated.

That is only guaranteed to work though when you're in a transaction old
enough to prevent removal of the old or intermediate row versions. E.g.
BEGIN;
INSERT INTO foo...; -- last tid (0, 1)
COMMIT;
BEGIN;
SELECT currtid(foo, '(0,1)');
COMMIT;

can basically return no or even an arbitrarily different row. Same with
an update...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Using ini file to setup replication

2013-07-19 Thread Samrat Revagade
 *for example: for failback safe standby.*

I think that introducing another configuration format is a pretty bad
idea. While something new might not turn out to be as bad, we've seen
how annoying a separate configuration format turned out for
recovery.conf.

It's not a totally different way of doing configuration: the ini file would
be parsed in the same way as postgresql.conf. I just want to separate out the
replication parameters, to make configuration simpler for future developments
in the field of replication such as failback-safe standby.

 So you can just do stuff like:

 server.foo.grand_unified_config = value.

But with your approach, considering the failback-safe standby use case, the
parameters in postgresql.conf would vary dynamically, and I don't think doing
this in postgresql.conf is a good idea, because it already contains a whole
bunch of parameters.

For example, if I want to configure 2 servers it will add 6 lines, for 3
servers 9 lines, for 4 servers 12 lines. The settings for a particular
server, written in the guc.value style, would be:

standby_name.'AAA'
synchronous_transfer.commit
wal_sender_timeout.60


Regards,
Samrat
Samrat Revgade


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] getting rid of SnapshotNow

2013-07-19 Thread Andres Freund
On 2013-07-19 01:27:41 -0400, Tom Lane wrote:
 Noah Misch n...@leadboat.com writes:
  To me, the major advantage of removing SnapshotNow is to force all
  third-party code to reevaluate.  But that could be just as well
  achieved by renaming it to, say, SnapshotImmediate.  If there are
  borderline-legitimate SnapshotNow uses in our code base, I'd lean
  toward a rename instead.  Even if we decide to remove every core use,
  third-party code might legitimately reach a different conclusion on
  similar borderline cases.

I don't think there are many people who aren't active on -hackers who can
actually understand the implications of using SnapshotNow. Given that
-hackers hasn't fully grasped them in several cases... And even if those
borderline cases are safe, that's really only valid for a specific Postgres
version. Catering to that doesn't seem like a good idea to me.

 Indeed, I'm thinking I don't believe in SnapshotSelf anymore either.
 It's got all the same consistency issues as SnapshotNow.  In fact, it
 has *more* issues, because it's also vulnerable to weirdnesses caused by
 inconsistent ordering of tuple updates among multiple tuples updated by
 the same command.

Hm. I kind of can see the point of it in constraint code where it
probably would be rather hard to remove usage of it, but e.g. the
sepgsql usage looks pretty dubious to me.
At least in the cases where the constraint code uses them I don't think
the SnapshotNow dangers apply since those specific rows should be locked
et al.

The selinux usage looks like a design flaw to me, but I don't really
understand that code, so I very well may be wrong.

 Why not tell people to use SnapshotDirty if they need a
 not-guaranteed-consistent result?  At least then it's pretty obvious
 that you're getting some randomness in with your news.

Especially if we're going to lower the lock level of some commands, but
even now, that opens us to more issues due to nonmatching table
definitions et al. That doesn't sound like a good idea to me.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] Adding new joining alghoritm to postgresql

2013-07-19 Thread tubadzin
Hi. I'm a little confused.

1. I have the 9.2.4 source code from http://www.postgresql.org/ftp/source/
2. I want to add a new algorithm alongside index nested loop join, merge
join and hash join. The executor directory in src contains nodeHash.c,
nodeHashjoin.c, nodeMergejoin.c and nodeNestloop.c.
3. After my changes, I want to compile PostgreSQL and use it.
4. The problem is:
a) I do not know which library is responsible for this functionality. I
understand that I have to compile src and replace a library (I don't know
which one) in the path where PostgreSQL is installed: C:\Program
Files (x86)\PostgreSQL\9.2
b) I don't know how to compile those files (into which library?) with
Visual Studio 2010.

Thanks for all your answers. Tom.
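For what it's worth, the executor join code is compiled into the main
postgres backend binary, not into a separate library, so the whole server
has to be rebuilt and reinstalled. A minimal sketch of a Windows build
using the MSVC scripts shipped in the source tree (the install path here
is illustrative; don't install over a production tree):

```bat
rem From a Visual Studio 2010 command prompt, in the unpacked source tree.
rem The MSVC build scripts live in src\tools\msvc (see "Installation from
rem Source Code on Windows" in the documentation).
cd src\tools\msvc
build.bat
rem Install the freshly built binaries into a scratch directory:
install.bat "C:\pg-9.2-custom"
```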

Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses

2013-07-19 Thread Greg Smith

On 7/19/13 3:53 AM, KONDO Mitsumasa wrote:

Recently, users who think system availability is important use a
synchronous replication cluster.


If your argument for why it's OK to ignore bounding crash recovery on 
the master is that it's possible to failover to a standby, I don't think 
that is acceptable.  PostgreSQL users certainly won't like it.



I especially want you to read lines 631, 651, and 656.
MAX_WRITEBACK_PAGES is 1024 (1024 * 4096 bytes).


You should read http://www.westnet.com/~gsmith/content/linux-pdflush.htm 
to realize everything you're telling me about the writeback code and its 
congestion logic I knew back in 2007.  The situation is even worse than 
you describe, because this section of Linux has gone through multiple, 
major revisions since then.  You can't just say here is the writeback 
source code; you have to reference each of the commonly deployed 
versions of the writeback feature to tell how this is going to play out 
if released.  There are four major ones I pay attention to.  The old 
kernel style as see in RHEL5/2.6.18--that's what my 2007 paper 
discussed--the similar code but with very different defaults in 2.6.22, 
the writeback method/tuning in RHEL6/Debian Squeeze/2.6.32, and then 
there are newer kernels.  (The newer ones separate out into a few 
branches too, I haven't mapped those as carefully yet)


If you tried to model your feature on Linux's approach here, what that 
means is that the odds of an ugly feedback loop here are even higher. 
You're increasing the feedback on what's already a bad situation that 
triggers trouble for people in the field.  When Linux's congestion logic 
causes checkpoint I/O spikes to get worse than they otherwise might be, 
people panic because it seems like they stopped altogether.  There are 
some examples of what really bad checkpoints look like in 
http://www.2ndquadrant.com/static/2quad/media/pdfs/talks/WriteStuff-PGCon2011.pdf 
if you want to see some of them.  That's the talk I did around the same 
time I was trying out spreading the database fsync calls out over a 
longer period.


When I did that, checkpoints became even less predictable, and that was 
a major reason behind why I rejected the approach.  I think your 
suggestion will have the same problem.  You just aren't generating test 
cases with really large write workloads yet to see it.  You also don't 
seem afraid of how exceeding the checkpoint timeout is a very bad thing yet.



In addition, if you set a large value of checkpoint_timeout or
checkpoint_completion_target, you have said that performance is improved,
but is it true in all cases?


The timeout, yes.  Throughput is always improved by increasing 
checkpoint_timeout.  Fewer checkpoints per unit of time increases 
efficiency.  Fewer writes of the most heavily accessed buffers happen per 
transaction.  It is faster because you are doing less work, which on 
average is always faster than doing more work.  And doing less work 
usually beats doing more work, even when the extra work is done smarter.


If you want to see how much work per transaction a test is doing, track 
the numbers of buffers written at the beginning/end of your test via 
pg_stat_bgwriter.  Tests that delay checkpoints will show a lower total 
number of writes per transaction.  That seems more efficient, but it's 
efficiency mainly gained by ignoring checkpoint_timeout.
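To make that bookkeeping concrete, something along these lines brackets a
benchmark run (column names as they exist in pg_stat_bgwriter around
9.2/9.3; the scratch table name is made up):

```sql
-- Snapshot the write counters before the benchmark run.
SELECT checkpoints_timed, checkpoints_req,
       buffers_checkpoint, buffers_clean, buffers_backend
  INTO before_stats
  FROM pg_stat_bgwriter;

-- ... run the benchmark ...

-- Afterward, the deltas divided by transactions processed give writes
-- per transaction; runs that delay checkpoints show a lower ratio.
SELECT b.buffers_checkpoint - a.buffers_checkpoint AS ckpt_writes,
       b.buffers_clean      - a.buffers_clean      AS bgwriter_writes,
       b.buffers_backend    - a.buffers_backend    AS backend_writes
  FROM pg_stat_bgwriter b, before_stats a;
```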



When checkpoint_completion_target is actually enlarged, performance may
fall in some cases. I think this is because the last fsync becomes
heavy, owing to the writes having been spread out slowly.


I think you're confusing throughput and latency here.  Increasing the 
checkpoint timeout, or to a lesser extent the completion target, on 
average that increases throughput.  It results in less work, and the 
more/less work amount is much more important than worrying about 
scheduler details.  No matter how efficient a given write is, whether 
you've sorted it across elevator horizon boundary A or boundary B, it's 
better not to do it at all.


But having less checkpoints makes latency worse sometimes too.  Whether 
latency or throughput is considered the more important thing is very 
complicated.  Having checkpoint_completion_target as the knob to control 
the latency/throughput trade-off hasn't worked out very well.  No one 
has done a really comprehensive look at this trade-off since the 8.3 
development.  I got halfway through it for 9.1, we figured out that the 
fsync queue filling was actually responsible for most of my result 
variation, and then Robert fixed that.  It was a big enough change that 
all my data from before that I had to throw out as no longer relevant.


By the way:  if you have a theory like the last fsync having become 
heavy for why something is happening, measure it.  Set log_min_messages 
to debug2 and you'll get details about every single fsync in your logs. 
 I did that for all my tests that led me to conclude fsync delaying on 
its own didn't help that problem.  I was 

Re: [HACKERS] [ODBC] getting rid of SnapshotNow

2013-07-19 Thread Hiroshi Inoue

(2013/07/19 22:03), Andres Freund wrote:

On 2013-07-19 08:57:01 +0900, Inoue, Hiroshi wrote:

I had the idea they were used for a client-side implementation of WHERE
CURRENT OF.  Perhaps that's dead code and could be removed entirely?


It's been reported that ODBC still uses them.


Though PostgreSQL's TID is similar to Oracle's ROWID, it is transient
and changed after update operations unfortunately. I implemented
the currtid_xx functions to supplement the difference. For example

currtid(relname, original tid)

(hopefully) returns the current tid of the original row when it is
updated.


That is only guaranteed to work though when you're in a transaction old
enough to prevent removal of the old or intermediate row versions. E.g.


Yes it's what I meant by (hopefully).
At the time when I implemented currtid(), I was able to use TIDs in
combination with OIDs.

regards,
Hiroshi Inoue




Re: [HACKERS] [v9.4] row level security

2013-07-19 Thread Tom Lane
Dean Rasheed dean.a.rash...@gmail.com writes:
 I took a look at this patch too. I didn't read all the code in detail,
 but there was one area I wondered about --- is it still necessary to
 add the new field rowsec_relid to RangeTblEntry?

 As far as I can see, it is only used in the new function getrelid()
 which replaces the old macro of the same name, so that it can work if
 it is passed the index of the sourceRelation subquery RTE instead of
 the resultRelation. This seems to be encoding a specific assumption
 that a subquery RTE is always going to come from row-level security
 expansion.

I haven't read the patch at all, but I would opine that anything that's
changing the behavior of getrelid() is broken by definition.  Something
is being done at the wrong level of abstraction if you think that you
need to do that.

regards, tom lane




Re: [HACKERS] Simple documentation typo patch

2013-07-19 Thread Robert Haas
On Thu, Jul 18, 2013 at 2:53 PM, David Christensen da...@endpoint.com wrote:
 Hey folks, this corrects typos going back to 
 75c6519ff68dbb97f73b13e9976fb8075bbde7b8 where EUC_JIS_2004 and 
 SHIFT_JIS_2004 were first added.

 These typos are present in all supported major versions of PostgreSQL, back 
 through 8.3; I didn't look any further than that. :-)

Committed and back-patched to all currently-supported versions.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] confusing typedefs in jsonfuncs.c

2013-07-19 Thread Andrew Dunstan


On 07/18/2013 09:20 PM, Peter Eisentraut wrote:

The new jsonfuncs.c has some confusing typedef scheme.  For example, it
has a bunch of definitions like this:

typedef struct getState
{
 ...
} getState, *GetState;

So GetState is a pointer to getState.  I have never seen that kind of
convention before.

This then leads to code like

GetState    state;

state = palloc0(sizeof(getState));

which has useless mental overhead.

But the fact that GetState is really a pointer isn't hidden at all,
because state is then dereferenced with -> or cast from or to void *.  So
whatever abstraction might have been intended isn't there at all.  And
all of this is an intra-file interface anyway.

And to make this even more confusing, other types such as ColumnIOData
and JsonLexContext are not pointers but structs directly.

I think a more typical PostgreSQL code convention is to use capitalized
camelcase for structs, and use explicit pointers for pointers.  I have
attached a patch that cleans this up, in my opinion.



I don't have a problem with this. Sometimes when you've stared at 
something for long enough you fail to notice such things, so I welcome 
more eyeballs on the code.


Note that this is an externally visible API, so anyone who has written 
an extension against it will possibly find it broken. But that's all the 
more reason to clean it now, before it makes it to a released version.


cheers

andrew




Re: [HACKERS] LATERAL quals revisited

2013-07-19 Thread Tom Lane
Ashutosh Bapat ashutosh.ba...@enterprisedb.com writes:
 On Wed, Jun 26, 2013 at 1:30 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 For there to *be* a unique appropriate outer join, we need to require
 that a LATERAL-using qual clause that's under an outer join contain
 lateral references only to the outer side of the nearest enclosing outer
 join.  There's no such restriction in the spec of course, but we can
 make it so by refusing to flatten a sub-select if pulling it up would
 result in having a clause in the outer query that violates this rule.
 There's already some code in prepjointree.c (around line 1300) that
 attempts to enforce this, though now that I look at it again I'm not
 sure it's covering all the bases.  We may need to extend that check.

 Why do we need this restriction? Wouldn't a place (specifically join qual
 at such a place) in join tree where all the participating relations are
 present, serve as a place where the clause can be applied.

No.  If you hoist a qual that appears below an outer join to above the
outer join, you get wrong results in general: you might eliminate rows
from the outer side of the join, which a qual from within the inner side
should never be able to do.

 select * from tab1 left join tab2 t2 using (val) left join lateral (select
 val from tab2 where val2 = tab1.val * t2.val) t3 using (val);
 Can't we apply (as a join qual) the qual val2 = tab1.val * t2.val at a
 place where we are computing join between tab1, t2 and t3?

This particular example doesn't violate the rule I gave above, since
both tab1 and t2 are on the left side of the join to the lateral
subquery, and the qual doesn't have to get hoisted *past* an outer join,
only to the outer join of {tab1,t2} with {t3}.

 I'm inclined to process all LATERAL-using qual clauses this way, ie
 postpone them till we recurse back up to a place where they can
 logically be evaluated.  That won't make any real difference when no
 outer joins are present, but it will eliminate the ugliness that right
 now distribute_qual_to_rels is prevented from sanity-checking the scope
 of the references in a qual when LATERAL is present.  If we do it like
 this, we can resurrect full enforcement of that sanity check, and then
 throw an error if any postponed quals are left over when we're done
 recursing.

 Parameterized nested loop join would always be able to evaluate a LATERAL
 query. Instead of throwing error, why can't we choose that as the default
 strategy whenever we fail to flatten subquery?

I think you misunderstood.  That error would only be a sanity check that
we'd accounted for all qual clauses, it's not something a user should
ever see.

regards, tom lane




Re: [HACKERS] pgindent behavior we could do without

2013-07-19 Thread Andrew Dunstan


On 07/18/2013 11:30 PM, Tom Lane wrote:

Noah Misch n...@leadboat.com writes:

On Thu, Jul 18, 2013 at 12:27:21AM -0400, Tom Lane wrote:

It's always annoyed me that pgindent insists on adjusting vertical
whitespace around #else and related commands.  This has, for example,
rendered src/include/storage/barrier.h nigh illegible: you get things
like

Agreed.  I've similarly disliked how pgindent adds a blank line between an #if
and a multi-line comment, like at the top of get_restricted_token().

AFAICT it forces a blank line before a multiline comment in most
contexts; #if isn't special (though it does seem to avoid doing that
just after a left brace).  I too don't much care for that behavior,
although it's not as detrimental to readability as removing blank lines
can be.

In general, pgindent isn't nearly as good about managing vertical
whitespace as it is about horizontal spacing ...



I agree.

I should perhaps note that when I rewrote pgindent, I deliberately 
didn't editorialize about its behaviour - but I did intend to make it 
more maintainable and simpler to change stuff like this, and so it is :-)


cheers

andrew





[HACKERS] LOCK TABLE Permissions

2013-07-19 Thread Stephen Frost
Greetings,

  We've run into a curious case and I'd like to solicit feedback
  regarding a possible change to the access rights required to acquire
  locks on a relation.  Specifically, we have a process which normally
  INSERTs into a table and another process which Exclusive locks that
  same table in order to syncronize other processing.  We then ran into
  a case where we didn't actually want to INSERT but still wanted to
  have the syncronization happen.  Unfortunately, we don't allow
  LOCK TABLE to acquire RowExclusive unless you have UPDATE, DELETE, or
  TRUNCATE privileges.

  My first impression is that the current code was just overly
  simplistic regarding what level of permissions are required for a
  given lock type and that it wasn't intentional to deny processes which
  have INSERT privileges from acquiring RowExclusive (as they can do so
  anyway using an actual INSERT).  Therefore, I'd like to propose the
  below simple 3-line patch to correct this.

  Thoughts?  Objections to back-patching?

Thanks,

Stephen

diff --git a/src/backend/commands/lockcmds.c b/src/backend/commands/lockcmds.c
new file mode 100644
index 49950d7..60f54c5 100644
*** a/src/backend/commands/lockcmds.c
--- b/src/backend/commands/lockcmds.c
*** LockTableAclCheck(Oid reloid, LOCKMODE l
*** 174,179 
--- 174,182 
if (lockmode == AccessShareLock)
aclresult = pg_class_aclcheck(reloid, GetUserId(),
  ACL_SELECT);
+   else if (lockmode == RowExclusiveLock)
+   aclresult = pg_class_aclcheck(reloid, GetUserId(),
+ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE);
else
aclresult = pg_class_aclcheck(reloid, GetUserId(),
  ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE);





Re: [HACKERS] Foreign Tables as Partitions

2013-07-19 Thread Robert Haas
On Thu, Jul 18, 2013 at 8:56 PM, David Fetter da...@fetter.org wrote:
 Please find attached a PoC patch to implement $subject.

 So far, with a lot of help from Andrew Gierth, I've roughed out an
 implementation which allows you to ALTER FOREIGN TABLE so it inherits
 a local table.

 TBD: CREATE FOREIGN TABLE ... INHERITS ..., docs, regression testing,
 etc., etc.

I think this could be useful, but it's going to take more than a
three-line patch.  Removing code that prohibits things is easy; what's
hard is verifying that nothing else breaks as a result.  And I bet it
does.

This functionality was actually present in the original submission for
foreign tables.  I ripped it out before commit because I didn't think
all of the interactions with other commands had been adequately
considered.  But I think they did consider it to some degree, which
this patch does not.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [ODBC] [HACKERS] getting rid of SnapshotNow

2013-07-19 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2013-07-20 00:49:11 +0900, Hiroshi Inoue wrote:
 Using SnapshotSelf instead of SnapshotNow for currtid_ () wouldn't
 matter.

 I think it actually might. You could get into dicey situations if you
 use currtid_ in a query performing updates or inserts because it would
 see the to-be-inserted tuple...

I'm pretty sure Hiroshi-san was only opining about whether it would
matter for ODBC's usage.  IIUC, ODBC is using this function to re-fetch
rows that it inserted, updated, or at least selected-for-update in a
previous command of the current transaction, so actually any snapshot
would do fine.

In any case, since I moved the goalposts by suggesting that SnapshotSelf
is just as dangerous as SnapshotNow, what we need to know is whether
it'd be all right to change this code to use a fresh MVCC snapshot;
and if not, why not.  It's pretty hard to see a reason why client-side
code would want to make use of the results of a non-MVCC snapshot.

regards, tom lane




Re: [HACKERS] Using ini file to setup replication

2013-07-19 Thread Josh Berkus

 I've even proposed that in the past in
 20130225211533.gd3...@awork2.anarazel.de . I plan to propose an updated
 version of that patch (not allowing numeric 2nd level ids) for the next
 CF.
 
 So you can just do stuff like:
 
 server.foo.grand_unified_config = value.

Sounds good to me.  I can see lots of uses for that.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] getting rid of SnapshotNow

2013-07-19 Thread Hiroshi Inoue

(2013/07/18 21:46), Robert Haas wrote:

There seems to be a consensus that we should try to get rid of
SnapshotNow entirely now that we have MVCC catalog scans, so I'm
attaching two patches that together come close to achieving that goal:


...


With that done, the only remaining uses of SnapshotNow in our code
base will be in currtid_byreloid() and currtid_byrelname().  So far no
one on this list has been able to understand clearly what the purpose
of those functions is, so I'm copying this email to pgsql-odbc in case
someone there can provide more insight.  If I were a betting man, I'd
bet that they are used in contexts where the difference between
SnapshotNow and SnapshotSelf wouldn't matter there, either.


Using SnapshotSelf instead of SnapshotNow for currtid_ () wouldn't
 matter.

regards,
Hiroshi Inoue






Re: [HACKERS] getting rid of SnapshotNow

2013-07-19 Thread Andres Freund
On 2013-07-20 00:49:11 +0900, Hiroshi Inoue wrote:
 (2013/07/18 21:46), Robert Haas wrote:
 There seems to be a consensus that we should try to get rid of
 SnapshotNow entirely now that we have MVCC catalog scans, so I'm
 attaching two patches that together come close to achieving that goal:
 
 ...
 
 With that done, the only remaining uses of SnapshotNow in our code
 base will be in currtid_byreloid() and currtid_byrelname().  So far no
 one on this list has been able to understand clearly what the purpose
 of those functions is, so I'm copying this email to pgsql-odbc in case
 someone there can provide more insight.  If I were a betting man, I'd
 bet that they are used in contexts where the difference between
 SnapshotNow and SnapshotSelf wouldn't matter there, either.
 
 Using SnapshotSelf instead of SnapshotNow for currtid_ () wouldn't
  matter.

I think it actually might. You could get into dicey situations if you
use currtid_ in a query performing updates or inserts because it would
see the to-be-inserted tuple...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] getting rid of SnapshotNow

2013-07-19 Thread Alvaro Herrera
Robert Haas escribió:
 On Fri, Jul 19, 2013 at 1:27 AM, Tom Lane t...@sss.pgh.pa.us wrote:

  Why not tell people to use SnapshotDirty if they need a
  not-guaranteed-consistent result?  At least then it's pretty obvious
  that you're getting some randomness in with your news.

 On further reflection, I think perhaps pgrowlocks should just register
 a fresh MVCC snapshot and use that.  Using SnapshotDirty would return
 TIDs of unseen tuples, which does not seem to be what is wanted there.

I think seeing otherwise invisible rows is useful in pgrowlocks.  It
helps observe the effects on tuples written by concurrent transactions
during experimentation.  But then, maybe this functionality belongs in
pageinspect instead.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] LOCK TABLE Permissions

2013-07-19 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:

 if (lockmode == AccessShareLock)
 aclresult = pg_class_aclcheck(reloid, GetUserId(),
   ACL_SELECT);
 +   else if (lockmode == RowExclusiveLock)
 +   aclresult = pg_class_aclcheck(reloid, GetUserId(),
 +ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE);
 else
 aclresult = pg_class_aclcheck(reloid, GetUserId(),
   ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE);

Perhaps it would be better to refactor with a local variable for the
aclmask and just one instance of the pg_class_aclcheck call.  Also, I'm
pretty sure that the documentation work needed is more extensive
than the actual patch ;-).  Otherwise, I don't see a problem with this.

regards, tom lane




Re: [HACKERS] Re: Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-07-19 Thread Josh Berkus
On 07/15/2013 10:19 AM, Jeff Davis wrote:
 On Thu, 2013-07-11 at 10:51 -0400, Nicholas White wrote:
 I've attached a revised version that fixes the issues above:
 
 I'll get to this soon, sorry for the delay.
 
 Regards,
   Jeff Davis

So ... are you doing a final review of this for the CF, Jeff?  We need
to either commit it or bounce it to the next CF.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] FKey not enforced resulting in broken Dump/Reload

2013-07-19 Thread Robert Haas
On Fri, Jul 19, 2013 at 12:58 PM, Rod Taylor rod.tay...@gmail.com wrote:
 A poorly coded trigger on the referencing table has the ability to break
 foreign keys, and as a result create a database which cannot be dumped and
 reloaded.

 The BEFORE DELETE trigger accidentally does RETURN NEW, which suppresses the
 DELETE action by the foreign key trigger. This allows the record from the
 referenced table to be deleted and the record in the referencing table to
 remain in place.

 While I don't expect Pg to do what the coder meant, it should throw an
 error and not leave foreign key'd data in an invalid state.

This is a known limitation of our foreign key machinery.  It might
well be susceptible to improvement, but I wouldn't count on anyone
rewriting it in the near future.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [RFC] Minmax indexes

2013-07-19 Thread Josh Berkus
On 07/18/2013 10:39 PM, Alvaro Herrera wrote:
 To scan the index, we first obtain the TID of index tuple for page 0.  If
 this returns a valid TID, we read that tuple to determine the min/max bounds
 for this page range.  If it returns invalid, then the range is unsummarized,
 and the scan must return the whole range as needing scan.  After this
 index entry has been processed, we obtain the TID of the index tuple for
 page 0+pagesPerRange (currently this is a compile-time constant, but
 there's no reason this cannot be a per-index property).  Continue adding
 pagesPerRange until we reach the end of the heap.

Conceptually, this sounds like a good initial solution to the update
problem.

I still think we could do incremental updates to the minmax indexes per
the idea I discussed, but that could be a later version.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] getting rid of SnapshotNow

2013-07-19 Thread Alvaro Herrera
Robert Haas escribió:
 On Fri, Jul 19, 2013 at 12:29 PM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:

  I think seeing otherwise invisible rows is useful in pgrowlocks.  It
  helps observe the effects on tuples written by concurrent transactions
  during experimentation.  But then, maybe this functionality belongs in
  pageinspect instead.
 
 It does seem like it should be useful, at least as an option.  But I
 feel like changing that ought to be separate from getting rid of
 SnapshotNow.  It seems like too big of a behavior change to pass off
 as a harmless tweak.

Agreed.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] getting rid of SnapshotNow

2013-07-19 Thread Robert Haas
On Fri, Jul 19, 2013 at 12:29 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Robert Haas escribió:
 On Fri, Jul 19, 2013 at 1:27 AM, Tom Lane t...@sss.pgh.pa.us wrote:

  Why not tell people to use SnapshotDirty if they need a
  not-guaranteed-consistent result?  At least then it's pretty obvious
  that you're getting some randomness in with your news.

 On further reflection, I think perhaps pgrowlocks should just register
 a fresh MVCC snapshot and use that.  Using SnapshotDirty would return
 TIDs of unseen tuples, which does not seem to be what is wanted there.

 I think seeing otherwise invisible rows is useful in pgrowlocks.  It
 helps observe the effects on tuples written by concurrent transactions
 during experimentation.  But then, maybe this functionality belongs in
 pageinspect instead.

It does seem like it should be useful, at least as an option.  But I
feel like changing that ought to be separate from getting rid of
SnapshotNow.  It seems like too big of a behavior change to pass off
as a harmless tweak.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] getting rid of SnapshotNow

2013-07-19 Thread Robert Haas
On Fri, Jul 19, 2013 at 12:51 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Robert Haas escribió:
 On Fri, Jul 19, 2013 at 12:29 PM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:

  I think seeing otherwise invisible rows is useful in pgrowlocks.  It
  helps observe the effects on tuples written by concurrent transactions
  during experimentation.  But then, maybe this functionality belongs in
  pageinspect instead.

 It does seem like it should be useful, at least as an option.  But I
 feel like changing that ought to be separate from getting rid of
 SnapshotNow.  It seems like too big of a behavior change to pass off
 as a harmless tweak.

 Agreed.

So any change we make to pgrowlocks is going to have some behavior consequences.

1. If we use SnapshotSelf, then nobody will notice the difference
unless this is used as part of a query that locks or modifies tuples
in the table being examined.  But in that case you might see the
results of the current query.  Thus, I think this is the smallest
possible behavior change, but Tom doesn't like SnapshotSelf any more
than he likes SnapshotNow.

2. If we use SnapshotDirty, then the difference is probably
noticeable, because you'll see the results of concurrent, uncommitted
transactions.  Maybe useful, but probably shouldn't be the new
default.

3. If we use a fresh MVCC snapshot, then when you scan a table you'll
see the state of play as of the beginning of your scan rather than the
state of play as of when your scan reaches the target page.  This
might be noticeable on a large table.  However, it might also be
thought an improvement.

4. If we use GetActiveSnapshot, all the comments above about a fresh
MVCC snapshot still apply.  However, the snapshot in question could be
even more stale, especially in repeatable read or serializable mode.
However, this might be thought a more consistent behavior than what we
have now.  And I'm guessing that this function is typically run as its
own transaction, so in practice this doesn't seem much different from
an MVCC snapshot, only cheaper.

At the moment, I dislike #2 and slightly prefer #4 to #3.
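Sketched against 9.3-era internals (abbreviated pseudocode; relation open/close and tuple examination omitted), options #3 and #4 would differ only in where the snapshot comes from:

```c
Snapshot    snap;

/* Option #3: take a fresh MVCC snapshot for this scan... */
snap = RegisterSnapshot(GetLatestSnapshot());

/* Option #4: ...or reuse the already-active (possibly older) one:
 *     snap = GetActiveSnapshot();
 * in which case no register/unregister is needed. */

scan = heap_beginscan(rel, snap, 0, NULL);
/* ... walk the pages and report row-lock state ... */
heap_endscan(scan);
UnregisterSnapshot(snap);
```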

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] FKey not enforced resulting in broken Dump/Reload

2013-07-19 Thread Rod Taylor
A poorly coded trigger on the referencing table has the ability to break
foreign keys, and as a result create a database which cannot be dumped and
reloaded.

The BEFORE DELETE trigger accidentally does RETURN NEW, which suppresses
the DELETE action by the foreign key trigger. This allows the record from
the referenced table to be deleted and the record in the referencing table
to remain in place.

While I don't expect Pg to do what the coder meant, it should throw an
error and not leave foreign key'd data in an invalid state.

This applies to both 9.1 and 9.2.


Please see attached bug.sql.
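Since the attachment doesn't render in the archive, here is a minimal reconstruction of the scenario (table, function, and trigger names are invented):

```sql
CREATE TABLE parent (id int PRIMARY KEY);
CREATE TABLE child  (id int REFERENCES parent (id) ON DELETE CASCADE);

CREATE FUNCTION bad_trigger() RETURNS trigger AS $$
BEGIN
    RETURN NEW;  -- in a BEFORE DELETE trigger NEW is NULL, so this
                 -- silently suppresses the DELETE issued by the FK
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER suppress_delete BEFORE DELETE ON child
    FOR EACH ROW EXECUTE PROCEDURE bad_trigger();

INSERT INTO parent VALUES (1);
INSERT INTO child  VALUES (1);

DELETE FROM parent WHERE id = 1;  -- succeeds
SELECT * FROM child;              -- orphaned row remains; a dump of
                                  -- this database will not reload
```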


bug.sql
Description: Binary data



Re: [HACKERS] LOCK TABLE Permissions

2013-07-19 Thread Robert Haas
On Fri, Jul 19, 2013 at 12:33 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Stephen Frost sfr...@snowman.net writes:
 if (lockmode == AccessShareLock)
 aclresult = pg_class_aclcheck(reloid, GetUserId(),
   ACL_SELECT);
 +   else if (lockmode == RowExclusiveLock)
 +   aclresult = pg_class_aclcheck(reloid, GetUserId(),
 +ACL_INSERT | ACL_UPDATE | ACL_DELETE | 
 ACL_TRUNCATE);
 else
 aclresult = pg_class_aclcheck(reloid, GetUserId(),
   ACL_UPDATE | ACL_DELETE | 
 ACL_TRUNCATE);

 Perhaps it would be better to refactor with a local variable for the
 aclmask and just one instance of the pg_class_aclcheck call.  Also, I'm
 pretty sure that the documentation work needed is more extensive
 than the actual patch ;-).  Otherwise, I don't see a problem with this.

I don't really care one way or the other whether we change this in
master, but I think back-patching changes that loosen security
restrictions is a poor idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] getting rid of SnapshotNow

2013-07-19 Thread Alvaro Herrera
Robert Haas escribió:

 4. If we use GetActiveSnapshot, all the comments above about a fresh
 MVCC snapshot still apply.  However, the snapshot in question could be
 even more stale, especially in repeatable read or serializable mode.
 However, this might be thought a more consistent behavior than what we
 have now.  And I'm guessing that this function is typically run as its
 own transaction, so in practice this doesn't seem much different from
 an MVCC snapshot, only cheaper.
 
 At the moment, I dislike #2 and slightly prefer #4 to #3.

+1 for #4, and if we ever need more then we can provide a non-default
way to get at #2.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] AGG_PLAIN thinks sorts are free

2013-07-19 Thread Jeff Janes
On Thu, Jul 18, 2013 at 8:04 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Jeff Janes jeff.ja...@gmail.com writes:
 AGG_PLAIN sometimes does sorts, but it thinks they are free.  Also, under
 explain analyze it does not explicitly report whether the sort was external
 or not, nor report the disk or memory usage, the way other sorts do.  I
 don't know if those two things are related or not.

 DISTINCT (and also ORDER BY) properties of aggregates are implemented
 at runtime; the planner doesn't really do anything about them, except
 suppress the choice it might otherwise make of using hashed aggregation.
 Since the behavior is entirely local to the Agg plan node, it's also
 not visible to the EXPLAIN ANALYZE machinery.

Couldn't a hash aggregate be superior to a sort one (for the distinct,
not the order by)?

 Arguably we should have the planner add on some cost factor for such
 aggregates, but that would have no effect whatever on the current level
 of plan, and could only be useful if this was a subquery whose cost
 would affect choices in an outer query level.  Which is a case that's
 pretty few and far between AFAIK (do you have a real-world example where
 it matters?).

Not that I know of.  It is mainly an analytical headache.  I'm trying
to figure out why the planner makes the choices it does on more
complex queries, but one of the component queries I'm trying to build
it up from suddenly falls into this plan, where I can't see the
estimated costs and can't use set enable_*  to shift it away from
that into a more transparent one.
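For concreteness, the plan shape under discussion is an AGG_PLAIN whose internal sort is neither costed nor reported, e.g. (assuming a table t with column x):

```sql
-- The DISTINCT forces a sort inside the Agg node; the planner costs
-- it at zero, and EXPLAIN ANALYZE prints no Sort Method / memory /
-- disk details for it, unlike a standalone Sort node.
EXPLAIN ANALYZE SELECT count(DISTINCT x) FROM t;
```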


Thanks for the explanation.

Jeff




Re: [HACKERS] AGG_PLAIN thinks sorts are free

2013-07-19 Thread Tom Lane
Jeff Janes jeff.ja...@gmail.com writes:
 On Thu, Jul 18, 2013 at 8:04 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 DISTINCT (and also ORDER BY) properties of aggregates are implemented
 at runtime; the planner doesn't really do anything about them, except
 suppress the choice it might otherwise make of using hashed aggregation.

 Couldn't a hash aggregate be superior to a sort one (for the distinct,
 not the order by)?

If it worked at all, it might be superior, but since it doesn't, it ain't.

This isn't really a matter of lack of round tuits, but a deliberate
judgment that it's probably not worth the trouble.  Per the comment in
choose_hashed_grouping:

/*
 * Executor doesn't support hashed aggregation with DISTINCT or ORDER BY
 * aggregates.  (Doing so would imply storing *all* the input values in
 * the hash table, and/or running many sorts in parallel, either of which
 * seems like a certain loser.)
 */

regards, tom lane




Re: [HACKERS] Foreign Tables as Partitions

2013-07-19 Thread David Fetter
On Fri, Jul 19, 2013 at 11:41:14AM -0400, Robert Haas wrote:
 On Thu, Jul 18, 2013 at 8:56 PM, David Fetter da...@fetter.org wrote:
  Please find attached a PoC patch to implement $subject.
 
  So far, with a lot of help from Andrew Gierth, I've roughed out an
  implementation which allows you to ALTER FOREIGN TABLE so it inherits
  a local table.
 
  TBD: CREATE FOREIGN TABLE ... INHERITS ..., docs, regression testing,
  etc., etc.
 
 I think this could be useful, but it's going to take more than a
 three-line patch.

Of course!

 Removing code that prohibits things is easy; what's hard is
 verifying that nothing else breaks as a result.  And I bet it does.

It may well.  I see our relations eventually being as equivalent as
their definitions permit (views, materialized or not, foreign tables,
etc.), and this as work toward that.  Yes, it's normative, and we may
never fully get there, but I'm pretty sure it's worth working towards.

 This functionality was actually present in the original submission
 for foreign tables.  I ripped it out before commit because I didn't
 think all of the interactions with other commands had been
 adequately considered.  But I think they did consider it to some
 degree, which this patch does not.

A ref to the patch as submitted & the patch as applied would help a lot :)

Were there any particular things you managed to break with the patch
as submitted?  Places you thought would be soft but didn't poke at?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] FKey not enforced resulting in broken Dump/Reload

2013-07-19 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Jul 19, 2013 at 12:58 PM, Rod Taylor rod.tay...@gmail.com wrote:
 A poorly coded trigger on the referencing table has the ability to break
 foreign keys, and as a result create a database which cannot be dumped and
 reloaded.

 This is a known limitation of our foreign key machinery.  It might
 well be susceptible to improvement, but I wouldn't count on anyone
 rewriting it in the near future.

If we failed to fire triggers on foreign-key actions, that would not be
an improvement.  And trying to circumscribe the trigger's behavior so
that it couldn't break the FK would be (a) quite expensive, and
(b) subject to the halting problem, unless perhaps you circumscribed
it so narrowly as to break a lot of useful trigger behaviors.  Thus,
there's basically no alternative that's better than "don't do that".

regards, tom lane




Re: [HACKERS] Fatal error after starting postgres : sys identifiers must be different

2013-07-19 Thread Indrajit Roychoudhury
One more change was required: adding both users in each node's db as
superusers, and then replication started!

Thanks.


On Thu, Jul 18, 2013 at 5:35 PM, Andres Freund and...@anarazel.de wrote:

 Hi!

 On 2013-07-19 07:31:07 +0900, Michael Paquier wrote:
  If this behavior is confirmed, I think that this bug should be
  reported directly to Andres and not this mailing list, just because
  logical replication is not integrated into Postgres core as of now.

 I think I agree, although I don't know where to report it other than
 to me personally, so far. Hackers seems to be the wrong crowd for it,
 given most of the people on it haven't even heard of bdr, much less
 read its code.
 We're definitely planning to propose it for community inclusion in some
 form, but there are several rather large preliminary steps (like getting
 changeset extraction in) that need to be done first.

 Not sure what's best.

 On 2013-07-19 07:31:07 +0900, Michael Paquier wrote:
  On Fri, Jul 19, 2013 at 5:02 AM, Indrajit Roychoudhury
  indrajit.roychoudh...@gmail.com wrote:
   Could you please let me know what the error "system identifiers are
   same" means? Below is the snapshot from one of the masters.
   I am setting up BDR as per the wiki
 http://wiki.postgresql.org/wiki/BDR_User_Guide#Initial_setup
   and source @
git://git.postgresql.org/git/users/andresfreund/postgres.git
  
   irc1@ubuntuirc:~/bdr2/postgres-d6ed89e/postgres-bdr-bin/bin$
 ./postgres -D
   ~/bdr2/postgres-d6ed89e/postgres-bdr-bin/bin/data2/
   LOG:  bgworkers, connection: dbname=testdb2
   LOG:  option: dbname, val: testdb2
   LOG:  registering background worker: bdr apply: ubuntuirc2
   LOG:  loaded library bdr
   LOG:  database system was shut down at 2013-03-17 10:56:52 PDT
   LOG:  doing logical startup from 0/17B6410
   LOG:  starting up replication identifier with ckpt at 0/17B6410
   LOG:  autovacuum launcher started
   LOG:  starting background worker process bdr apply: ubuntuirc2
   LOG:  database system is ready to accept connections
   LOG:  bdr apply: ubuntuirc2 initialized on testdb2, remote
 dbname=testdb2
   replication=true fallback_application_name=bdr
   FATAL:  system identifiers must differ between the nodes
   DETAIL:  Both system identifiers are 5856368744762683487.

  I am not the best specialist about logical replication, but as it
  looks to be a requirement to have 2 nodes with different system
  identifiers, you shouldn't link 1 node to another node whose data
  folder has been created from the base backup of the former (for
  example create the 2nd node based on a base backup of the 1st node).
  The error returned here would mean so.

 Yes, that's correct.

   LOG:  worker process: bdr apply: ubuntuirc2 (PID 16712) exited with
 exit
   code 1
   ^CLOG:  received fast shutdown request
   LOG:  aborting any active transactions
   LOG:  autovacuum launcher shutting down
   LOG:  shutting down
  
   pgcontrol_data outputs different database system ids for the two
 nodes. So
   don't understand why it says identifiers are same.
  Are you sure? Please re-check.
 
   postgresql.conf content in one of the masters is like this-
  
   /
   shared_preload_libraries = 'bdr'
   bdr.connections = 'ubuntuirc2'
   bdr.ubuntuirc2.dsn = 'dbname=testdb2'
   /
  
   Two nodes are ubuntuirc and ubuntuirc2. Above is the output of the
   postgresql.conf from ubuntuirc.

 The problem seems to be that your dsn doesn't include a hostname but
 only a database name. So, what probably happens is both your hosts try
 to connect to themselves since connecting to the local host is the
 default when no hostname is specified. Which is one of the primary
 reasons the requirement for differing system identifiers exist...
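Per that diagnosis, a hypothetical corrected fragment of ubuntuirc's postgresql.conf would add a host to the dsn so it points at the other node rather than at itself (the hostname here is assumed):

```
shared_preload_libraries = 'bdr'
bdr.connections = 'ubuntuirc2'
bdr.ubuntuirc2.dsn = 'host=ubuntuirc2 dbname=testdb2'
```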

 Greetings,

 Andres Freund



Re: [HACKERS] FKey not enforced resulting in broken Dump/Reload

2013-07-19 Thread Robert Haas
On Fri, Jul 19, 2013 at 1:45 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Fri, Jul 19, 2013 at 12:58 PM, Rod Taylor rod.tay...@gmail.com wrote:
 A poorly coded trigger on the referencing table has the ability to break
 foreign keys, and as a result create a database which cannot be dumped and
 reloaded.

 This is a known limitation of our foreign key machinery.  It might
 well be susceptible to improvement, but I wouldn't count on anyone
 rewriting it in the near future.

 If we failed to fire triggers on foreign-key actions, that would not be
 an improvement.  And trying to circumscribe the trigger's behavior so
 that it couldn't break the FK would be (a) quite expensive, and
 (b) subject to the halting problem, unless perhaps you circumscribed
 it so narrowly as to break a lot of useful trigger behaviors.  Thus,
 there's basically no alternative that's better than so don't do that.

I think a lot of people would be happier if foreign keys were always
checked after all regular triggers and couldn't be disabled.   But,
eh, that's not how it works.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Fatal error after starting postgres : sys identifiers must be different

2013-07-19 Thread Indrajit Roychoudhury
*I am not the best specialist about logical replication, but as it
looks to be a requirement to have 2 nodes with different system
identifiers, you shouldn't link 1 node to another node whose data
folder has been created from the base backup of the former (for
example create the 2nd node based on a base backup of the 1st node).
The error returned here would mean so.*
**
*- *I didn't create the /data folders from any backup for both the nodes;
they were created fresh by initdb individually. Is there anything specific
that I need to do apart from initdb?

*pgcontrol_data outputs different database system ids for the two nodes. So
 don't understand why it says identifiers are same.
Are you sure? Please re-check.*
**
*- *I re-verified and they are different. Pasting the snapshots below-

irc1@ubuntuirc:~/bdr2/postgres-d6ed89e/postgres-bdr-bin/bin$
./pg_controldata ./data2/
pg_control version number:937
Catalog version number:   201305061
Database system identifier:   *5856368744762683487*
Database cluster state:   in production


irc2@ubuntuirc2:~/bdr/postgres-bin/bin$ ./pg_controldata ./data/
pg_control version number:937
Catalog version number:   201305061
Database system identifier:   *5901663435516996514*
Database cluster state:   in production


Also, I am pasting the snapshot below to show that postgres does not
seem to be the superuser, in case this problem is due to that.

irc1@ubuntuirc:~/bdr2/postgres-d6ed89e/postgres-bdr-bin/bin$ ./psql testdb2
psql (9.3beta1)
Type "help" for help.
testdb2=# \du
 List of roles
 Role name |   Attributes   | Member of
---++---
 irc1  | Superuser, Create role, Create DB, Replication | {}

irc2@ubuntuirc2:~/bdr/postgres-bin/bin$ ./psql testdb2
psql (9.3beta1)
Type "help" for help.
testdb2=# \du
 List of roles
 Role name |   Attributes   | Member of
---++---
 irc2  | Superuser, Create role, Create DB, Replication | {}


Please let me know which mailing list will be more appropriate to raise my
concern regarding this postgres dev flavor.
I am yet to receive any response from Andres.

Thanks.


On Thu, Jul 18, 2013 at 3:31 PM, Michael Paquier
michael.paqu...@gmail.comwrote:

 On Fri, Jul 19, 2013 at 5:02 AM, Indrajit Roychoudhury
 indrajit.roychoudh...@gmail.com wrote:
  Could you please let me know what the error "system identifiers are
  same" means? Below is the snapshot from one of the masters.
  I am setting up BDR as per the wiki
http://wiki.postgresql.org/wiki/BDR_User_Guide#Initial_setup
  and source @
   git://git.postgresql.org/git/users/andresfreund/postgres.git
 
  irc1@ubuntuirc:~/bdr2/postgres-d6ed89e/postgres-bdr-bin/bin$ ./postgres
 -D
  ~/bdr2/postgres-d6ed89e/postgres-bdr-bin/bin/data2/
  LOG:  bgworkers, connection: dbname=testdb2
  LOG:  option: dbname, val: testdb2
  LOG:  registering background worker: bdr apply: ubuntuirc2
  LOG:  loaded library bdr
  LOG:  database system was shut down at 2013-03-17 10:56:52 PDT
  LOG:  doing logical startup from 0/17B6410
  LOG:  starting up replication identifier with ckpt at 0/17B6410
  LOG:  autovacuum launcher started
  LOG:  starting background worker process bdr apply: ubuntuirc2
  LOG:  database system is ready to accept connections
  LOG:  bdr apply: ubuntuirc2 initialized on testdb2, remote dbname=testdb2
  replication=true fallback_application_name=bdr
  FATAL:  system identifiers must differ between the nodes
  DETAIL:  Both system identifiers are 5856368744762683487.
 I am not the best specialist about logical replication, but as it
 looks to be a requirement to have 2 nodes with different system
 identifiers, you shouldn't link 1 node to another node whose data
 folder has been created from the base backup of the former (for
 example create the 2nd node based on a base backup of the 1st node).
 The error returned here would mean so.

  LOG:  worker process: bdr apply: ubuntuirc2 (PID 16712) exited with exit
  code 1
  ^CLOG:  received fast shutdown request
  LOG:  aborting any active transactions
  LOG:  autovacuum launcher shutting down
  LOG:  shutting down
 
  pgcontrol_data outputs different database system ids for the two nodes.
 So
  don't understand why it says identifiers are same.
  Are you sure? Please re-check.

  postgresql.conf content in one of the masters is like this-
 
  /
  shared_preload_libraries = 'bdr'
  bdr.connections = 'ubuntuirc2'
  bdr.ubuntuirc2.dsn = 'dbname=testdb2'
  /
 
  Two nodes are ubuntuirc and ubuntuirc2. Above is the output of the
  postgresql.conf from ubuntuirc.
 If this behavior is confirmed, I think that this bug should be
 reported directly to Andres and not this mailing list, just 

Re: [HACKERS] Review: UNNEST (and other functions) WITH ORDINALITY

2013-07-19 Thread Greg Stark
On Wed, Jun 26, 2013 at 2:47 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Some of the rest of us would like to hear those reasons, because my
 immediate reaction is that the patch must be broken by design.  WITH
 ORDINALITY should not be needing to mess with the fundamental evaluation
 semantics of SRFs, but it sure sounds like it is doing so if that case
 doesn't work as expected.

As Dan pointed out later the patch is not affecting whether this case
works as expected. Only that since you can only use WITH ORDINALITY on
SRFs in the FROM list and not in the target list if you want to use it
you have to rewrite this query to put the SRFs in the FROM list.

I think we're OK with that, since SRFs in the target list are something
that already works kind of strangely; we would probably rather get rid
of that behavior than work to extend it. It would be hard to work
ordinality into the grammar in the middle of the target list, let alone
figure out how to implement it.
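For reference, the kind of rewrite meant here moves the SRF out of the target list and into the FROM list, where the patch's WITH ORDINALITY syntax applies (a sketch):

```sql
-- Instead of:  SELECT unnest(ARRAY['a','b','c']);
SELECT v, n
FROM unnest(ARRAY['a','b','c']) WITH ORDINALITY AS t(v, n);
```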


My only reservation with this patch is whether the WITH_ORDINALITY
parser hack is the way we want to go. The precedent was already set
with WITH TIME ZONE though and I think this was the best option. The
worst failure I can come up with is this which doesn't seem like a
huge problem:

postgres=# with o as (select 1 ) select * from o;
 ?column?
--
1
(1 row)

postgres=# with ordinality as (select 1 ) select * from ordinality;
ERROR:  syntax error at or near "ordinality"
LINE 1: with ordinality as (select 1 ) select * from ordinality;
 ^



-- 
greg




[HACKERS] Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)

2013-07-19 Thread John Galloway
"some Salesforce folks" -- that would be me!  It looks like I didn't
quite communicate to Tom just what I was looking for, as I do indeed
want to have a variable number of "any" types, as:

CREATE AGGREGATE FOO ( ANYELEMENT, more types, VARIADIC "any") (
...
STYPE = ANYARRAY
...)
so the corresponding transition function would be
CREATE FUNCTION FOO_sfunc( ANYARRAY, ANYELEMENT, more types, VARIADIC
"any") RETURNS ANYARRAY
and the final func is
CREATE FUNCTION FOO_ffunc( ANYARRAY ) RETURNS ANYELEMENT

The functions are in C, and I cheat and actually use the ANYARRAY
transition variable as a struct just keeping the varlena length correct
(thanks to Tom for that idea).  Currently I just support a fixed number
of "any" args but really need to have that be variable.

So supporting VARIADIC "any" for user-defined aggregates would be most
useful.


On Thu, Jul 18, 2013 at 7:09 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Noah Misch n...@leadboat.com writes:
  (I don't know whether VARIADIC transition functions work today, but that
 would
  become an orthogonal project.)

 Coincidentally enough, some Salesforce folk were asking me about allowing
 VARIADIC aggregates just a few days ago.  I experimented enough to find
 out that if you make an array-accepting transition function, and then
 force the aggregate's pg_proc entry to look like it's variadic (by
 manually setting provariadic and some other fields), then everything
 seems to Just Work: the parser and executor are both fine with it.
 So I think all that's needed here is to add some syntax support to
 CREATE AGGREGATE, and probably make some tweaks in pg_dump.  I was
 planning to go work on that sometime soon.

 Having said that, though, what Andrew seemed to want was VARIADIC ANY,
 which is a *completely* different kettle of fish, since the actual
 parameters can't be converted to an array.  I'm not sure if that's
 as easy to support.

 regards, tom lane




Re: [HACKERS] FKey not enforced resulting in broken Dump/Reload

2013-07-19 Thread Robert Haas
On Fri, Jul 19, 2013 at 2:15 PM, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Jul 19, 2013 at 1:45 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Fri, Jul 19, 2013 at 12:58 PM, Rod Taylor rod.tay...@gmail.com wrote:
 A poorly coded trigger on the referencing table has the ability to break
 foreign keys, and as a result create a database which cannot be dumped and
 reloaded.

 This is a known limitation of our foreign key machinery.  It might
 well be susceptible to improvement, but I wouldn't count on anyone
 rewriting it in the near future.

 If we failed to fire triggers on foreign-key actions, that would not be
 an improvement.  And trying to circumscribe the trigger's behavior so
 that it couldn't break the FK would be (a) quite expensive, and
 (b) subject to the halting problem, unless perhaps you circumscribed
 it so narrowly as to break a lot of useful trigger behaviors.  Thus,
 there's basically no alternative that's better than "don't do that".

 I think a lot of people would be happier if foreign keys were always
 checked after all regular triggers and couldn't be disabled.   But,
 eh, that's not how it works.

Please ignore this fuzzy thinking.  You're right.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] CREATE EVENT TRIGGER syntax

2013-07-19 Thread Joe Abbate
Hello,

What is the purpose of the [ AND ... ] at the end of the WHEN clause?
Is that for later releases, when presumably additional filter_variables
will be introduced?  Right now, if I add "AND tag IN ..." I get an

ERROR:  filter variable "tag" specified more than once
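A minimal way to hit that error (the trigger function name is invented):

```sql
CREATE EVENT TRIGGER et ON ddl_command_start
    WHEN tag IN ('CREATE TABLE')
     AND tag IN ('DROP TABLE')   -- second use of the only filter variable
    EXECUTE PROCEDURE my_et_func();
-- ERROR:  filter variable "tag" specified more than once
```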

Joe




Re: [HACKERS] GSOC13 proposal - extend RETURNING syntax

2013-07-19 Thread Karol Trzcionka
New version:
- fix returning after values if there are not before
- add more regression tests
I'd like to hear/read any feedback ;)
Regards,
Karol
diff --git a/doc/src/sgml/ref/update.sgml b/doc/src/sgml/ref/update.sgml
index 90b9208..eba35f0 100644
--- a/doc/src/sgml/ref/update.sgml
+++ b/doc/src/sgml/ref/update.sgml
@@ -194,12 +194,27 @@ UPDATE [ ONLY ] replaceable class=PARAMETERtable_name/replaceable [ * ] [
 termreplaceable class=PARAMETERoutput_expression/replaceable/term
 listitem
  para
-  An expression to be computed and returned by the commandUPDATE/
-  command after each row is updated.  The expression can use any
-  column names of the table named by replaceable class=PARAMETERtable_name/replaceable
-  or table(s) listed in literalFROM/.
-  Write literal*/ to return all columns.
+  An expression to be computed and returned by the
+  commandUPDATE/ command either before or after (prefixed with
+  literalBEFORE./literal and literalAFTER./literal,
+  respectively) each row is updated.  The expression can use any
+  column names of the table named by replaceable
+  class=PARAMETERtable_name/replaceable or table(s) listed in
+  literalFROM/.  Write literalAFTER.*/literal to return all 
+  columns after the update. Write literalBEFORE.*/literal for all
+  columns before the update. Write literal*/literal to return all
+  columns after update and all triggers fired (these values are in table
+  after command). You may combine BEFORE, AFTER and raw columns in the
+  expression.
  /para
+ warningpara
+ Mixing table names or aliases named before or after with the
+ above will result in confusion and suffering.  If you happen to
+ have a table called literalbefore/literal or
+ literalafter/literal, alias it to something else when using
+ RETURNING.
+ /para/warning
+
 /listitem
/varlistentry
 
@@ -287,15 +302,16 @@ UPDATE weather SET temp_lo = temp_lo+1, temp_hi = temp_lo+15, prcp = DEFAULT
   /para
 
   para
-   Perform the same operation and return the updated entries:
+   Perform the same operation and return information on the changed entries:
 
 programlisting
 UPDATE weather SET temp_lo = temp_lo+1, temp_hi = temp_lo+15, prcp = DEFAULT
   WHERE city = 'San Francisco' AND date = '2003-07-03'
-  RETURNING temp_lo, temp_hi, prcp;
+  RETURNING temp_lo AS new_low, temp_hi AS new_high, BEFORE.temp_hi/BEFORE.temp_lo AS old_ratio, AFTER.temp_hi/AFTER.temp_lo AS new_ratio, prcp;
 /programlisting
   /para
 
+
   para
Use the alternative column-list syntax to do the same update:
 programlisting
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index d86e9ad..fafd311 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -2335,7 +2335,7 @@ ExecASUpdateTriggers(EState *estate, ResultRelInfo *relinfo)
 TupleTableSlot *
 ExecBRUpdateTriggers(EState *estate, EPQState *epqstate,
 	 ResultRelInfo *relinfo,
-	 ItemPointer tupleid, TupleTableSlot *slot)
+	 ItemPointer tupleid, TupleTableSlot *slot, TupleTableSlot **planSlot)
 {
 	TriggerDesc *trigdesc = relinfo-ri_TrigDesc;
 	HeapTuple	slottuple = ExecMaterializeSlot(slot);
@@ -2381,6 +2381,7 @@ ExecBRUpdateTriggers(EState *estate, EPQState *epqstate,
 	if (newSlot != NULL)
 	{
 		slot = ExecFilterJunk(relinfo-ri_junkFilter, newSlot);
+		*planSlot = newSlot;
 		slottuple = ExecMaterializeSlot(slot);
 		newtuple = slottuple;
 	}
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 15f5dcc..06ebaf3 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -603,7 +603,7 @@ ExecUpdate(ItemPointer tupleid,
 		resultRelInfo-ri_TrigDesc-trig_update_before_row)
 	{
 		slot = ExecBRUpdateTriggers(estate, epqstate, resultRelInfo,
-	tupleid, slot);
+	tupleid, slot, planSlot);
 
 		if (slot == NULL)		/* do nothing */
 			return NULL;
@@ -737,6 +737,7 @@ lreplace:;
 		   hufd.xmax);
 	if (!TupIsNull(epqslot))
 	{
+		planSlot = epqslot;
 		*tupleid = hufd.ctid;
 		slot = ExecFilterJunk(resultRelInfo-ri_junkFilter, epqslot);
 		tuple = ExecMaterializeSlot(slot);
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index a896d76..409b4d1 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -1928,6 +1928,7 @@ range_table_walker(List *rtable,
 		{
 			case RTE_RELATION:
 			case RTE_CTE:
+			case RTE_BEFORE:
 /* nothing to do */
 break;
 			case RTE_SUBQUERY:
@@ -2642,6 +2643,7 @@ range_table_mutator(List *rtable,
 		{
 			case RTE_RELATION:
 			case RTE_CTE:
+			case RTE_BEFORE:
 /* we don't bother to copy eref, aliases, etc; OK? */
 break;
 			case RTE_SUBQUERY:
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 48cd9dc..79b03af 100644
--- 

[HACKERS] Isn't PLy_spi_execute_fetch_result completely broken?

2013-07-19 Thread Tom Lane
This function appears to believe that it can PG_CATCH any random error
and then just carry on, without doing a subtransaction cleanup.  This
is as wrong as can be.  It might be all right if we had sufficiently
narrow constraints on what could happen inside the PG_TRY block, but in
point of fact it's calling datatype input functions, which could do
anything whatsoever.  I think it needs to do PG_RE_THROW() not just
return NULL at the end of the CATCH.  It looks like both of the
existing callers have cleanup logic, so this should be sufficient.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: template-ify (binary) extensions

2013-07-19 Thread Dimitri Fontaine
Markus Wanner mar...@bluegap.ch writes:
   - per-installation (not even per-cluster) DSO availability
 
 If you install PostGIS 1.5 on a system, then it's just impossible to
 bring another cluster (of the same PostgreSQL major version), let
 On Debian, that should be well possible. Certainly installing Postgres
 9.1 w/ postgis-1.5 in parallel to Postgres 9.2 w/ postgis-2.0 is. I
 designed it to be.

 I think I'm misunderstanding the problem statement, here.

(of the same PostgreSQL major version)

 Can CREATE EXTENSION check if the standbys have the extension installed?
 And refuse creation, if they don't?

No, because we don't register standbys, so we don't know who they are,
and also because some things that we see connected and using the
replication protocol could well be pg_basebackup or pg_receivexlog.

Also, it's possible that the standby is only there for High Availability
purposes and runs no user query.

 I'm sure you are aware that even without this clear separation of roles,
 the guarantee means we provide an additional level of security against
 attackers.

Given lo_import() to upload a file from the client to the server then
LOAD with the absolute path where the file ended up imported (or any
untrusted PL really), this argument carries no sensible weight in my
opinion.

 None the less, the "safe by default" approach has served us well, I think.

That's true. We need to consider carefully the proposal at hand though.

It's all about allowing the backend to automatically load a file that it
finds within its own $PGDATA so that we can have per-cluster and
per-database modules (DSO files).

The only difference with before is the location where the file is read
from, and the main security danger comes from the fact that we used to
only consider root-writable places and now propose to consider postgres
bootstrap user writable places.

Having the modules in different places in the system when it's a
template and when it's instantiated allows us to solve a problem I
forgot to list:

  - upgrading an extension at the OS level

Once you've done that, any new backend will load the newer module
(DSO file), so you have to be real quick if installing a hot fix in
production and the SQL definition must be changed to match the new
module version…

With the ability to instantiate the module in a per-cluster,
per-database directory within $PGDATA, the new version of the DSO module
would only be put in place and loaded at ALTER EXTENSION UPDATE time.

I'm still OK with allowing those problems to be fixed only when a
security option that defaults to 'false' has been switched to 'true', by
the way, so that it's opt-in, but I will admit having a hard time
swallowing the threat model we're talking about…

 I don't think the relaxed behaviour we're talking about is currently
 possible to develop as an extension, by the way.

 It's extensions that undermine the guarantee, at the moment. But yeah,
 it depends a lot on what kind of relaxed behavior you have in mind.

The ability to load a module from another place than the current
dynamic_library_path is what we're talking about here, and IIUC Robert
mentioned that maybe it could be down to an extension to allow for
changing the behavior. I didn't look at that in detail but I would be
surprised to be able to tweak that without having to patch the backend.

 If the sysadmin wants to disallow arbitrary execution of native code to
 postgres (the process), any kind of embedded compiler likely is equally
 unwelcome.

You already mentioned untrusted PL languages, and I don't see any
difference in between offering PL/pythonu and PL/C on security grounds,
really. I don't think we would consider changing the current model of
the LANGUAGE C PL; we would add a new LANGUAGE PL/C option, either
untrusted or with a nice sandbox.

The problem of sandboxing PL/C is that it makes it impossible to address
such use cases as uuid or PostGIS extensions even if it still allows for
hstore or other datatypes styles of extensions.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] Preventing tuple-table leakage in plpgsql

2013-07-19 Thread Tom Lane
Noah Misch n...@leadboat.com writes:
 On Fri, Jul 05, 2013 at 02:47:06PM -0400, Tom Lane wrote:
 So I'm inclined to propose that SPI itself should offer some mechanism
 for cleaning up tuple tables at subtransaction abort.  We could just
 have it automatically throw away tuple tables made in the current
 subtransaction, or we could allow callers to exercise some control,
 perhaps by calling a function that says "don't reclaim this tuple table
 automatically".  I'm not sure if there's any real use-case for such a
 call though.

 I suppose that would be as simple as making spi_dest_startup() put the
 tuptabcxt under CurTransactionContext?  The function to prevent reclamation
 would then just do a MemoryContextSetParent().

I experimented with this, and found out that it's not quite that simple.
In a SPI procedure that hasn't created a subtransaction (eg, a plpgsql
function without an exception block), if we attach tuple tables to the
outer transaction's CurTransactionContext then they fail to go away
during SPI_finish(), creating leaks in code that was previously correct.

However, we can use your idea when running inside a subtransaction,
while still attaching the tuple table to the procedure's own procCxt
when no subtransaction is involved.  The attached draft patch does it
that way, and successfully resolves the complained-of leakage case.

I like this better than what I originally had in mind, because it
produces no change in semantics in the case where a SPI procedure
doesn't create any subtransactions, which I imagine is at least 99.44%
of third-party SPI-using code.

Also, because the tuple tables don't go away until
CleanupSubTransaction(), code that explicitly frees them will still work
as long as it does that before rolling back the subxact.  Thus, the
patch's changes to remove SPI_freetuptable() calls in
plpy_cursorobject.c are not actually necessary for correctness, it's
just that we no longer need them.  Ditto for the change in
AtEOSubXact_SPI.  Unfortunately, the change in pl_exec.c *is* necessary
to avoid a coredump, because that call was being made after rolling back
the subxact.

All in all, the risk of breaking anything outside core code seems very
small, and I'm inclined to think that we don't need to provide an API
function to reparent a tuple table.  But having said that, I'm also
inclined to not back-patch this further than 9.3, just in case.

The patch still requires documentation changes, and there's also one
more error-cleanup SPI_freetuptable() call that ought to go away, in
PLy_spi_execute_fetch_result.  But that one needs the fix posited in
http://www.postgresql.org/message-id/21017.1374273...@sss.pgh.pa.us
first.

Comments?

regards, tom lane

diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index f30b2cd93330d72e0a412d6517aec08c96cc753e..3f4d1e625d738ad0c08d0245a0169451d29a631d 100644
*** a/src/backend/executor/spi.c
--- b/src/backend/executor/spi.c
*** AtEOSubXact_SPI(bool isCommit, SubTransa
*** 284,291 
  	{
  		/* free Executor memory the same as _SPI_end_call would do */
  		MemoryContextResetAndDeleteChildren(_SPI_current->execCxt);
! 		/* throw away any partially created tuple-table */
! 		SPI_freetuptable(_SPI_current->tuptable);
  		_SPI_current->tuptable = NULL;
  	}
  }
--- 284,291 
  	{
  		/* free Executor memory the same as _SPI_end_call would do */
  		MemoryContextResetAndDeleteChildren(_SPI_current->execCxt);
! 		/* forget any partially created tuple-table */
! 		/* (we don't need to free it, subxact cleanup will do so) */
  		_SPI_current->tuptable = NULL;
  	}
  }
*** void
*** 1641,1648 
  spi_dest_startup(DestReceiver *self, int operation, TupleDesc typeinfo)
  {
  	SPITupleTable *tuptable;
! 	MemoryContext oldcxt;
  	MemoryContext tuptabcxt;
  
  	/*
  	 * When called by Executor _SPI_curid expected to be equal to
--- 1641,1649 
  spi_dest_startup(DestReceiver *self, int operation, TupleDesc typeinfo)
  {
  	SPITupleTable *tuptable;
! 	MemoryContext parentcxt;
  	MemoryContext tuptabcxt;
+ 	MemoryContext oldcxt;
  
  	/*
  	 * When called by Executor _SPI_curid expected to be equal to
*** spi_dest_startup(DestReceiver *self, int
*** 1656,1669 
  	if (_SPI_current->tuptable != NULL)
  		elog(ERROR, "improper call to spi_dest_startup");
  
! 	oldcxt = _SPI_procmem();	/* switch to procedure memory context */
  
! 	tuptabcxt = AllocSetContextCreate(CurrentMemoryContext,
  	  SPI TupTable,
  	  ALLOCSET_DEFAULT_MINSIZE,
  	  ALLOCSET_DEFAULT_INITSIZE,
  	  ALLOCSET_DEFAULT_MAXSIZE);
! 	MemoryContextSwitchTo(tuptabcxt);
  
  	_SPI_current->tuptable = tuptable = (SPITupleTable *)
  		palloc(sizeof(SPITupleTable));
--- 1657,1681 
  	if (_SPI_current->tuptable != NULL)
  		elog(ERROR, "improper call to spi_dest_startup");
  
! 	/*
! 	 * If we're in the same subtransaction our SPI procedure was called in,
! 	 * attach the tuple table context to 

Re: [ODBC] [HACKERS] getting rid of SnapshotNow

2013-07-19 Thread Hiroshi Inoue
(2013/07/20 1:11), Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
 On 2013-07-20 00:49:11 +0900, Hiroshi Inoue wrote:
 Using SnapshotSelf instead of SnapshotNow for currtid_ () wouldn't
 matter.
 
 I think it actually might. You could get into dicey situations if you
 use currtid_ in a query performing updates or inserts because it would
 see the to-be-inserted tuple...
 
 I'm pretty sure Hiroshi-san was only opining about whether it would
 matter for ODBC's usage.  IIUC, ODBC is using this function to re-fetch
 rows that it inserted, updated, or at least selected-for-update in a
 previous command of the current transaction, so actually any snapshot
 would do fine.
 
 In any case, since I moved the goalposts by suggesting that SnapshotSelf
 is just as dangerous as SnapshotNow, what we need to know is whether
 it'd be all right to change this code to use a fresh MVCC snapshot;
 and if not, why not.  It's pretty hard to see a reason why client-side
 code would want to make use of the results of a non-MVCC snapshot.

OK, I agree to replacing SnapshotNow in currtid_xx() with an MVCC snapshot.

regards,
Hiroshi Inoue





Re: [HACKERS] CREATE EVENT TRIGGER syntax

2013-07-19 Thread Dimitri Fontaine
Joe Abbate j...@freedomcircle.com writes:
 What is the purpose of the [ AND ... ] at the end of the WHEN clause?
 Is that for later releases, when presumably additional filter_variables
 will be introduced?  Right now, if I add AND tag IN ... I get an

Yes. I had other filter variables in some versions of the patch, but
we're yet to agree on a design for the things I wanted to solve with
them.

See http://www.postgresql.org/message-id/m2txrsdzxa@2ndquadrant.fr
for some worked out example of the CONTEXT part of the Event Trigger
proposal.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




[HACKERS] Re: [COMMITTERS] pgsql: Allow background workers to be started dynamically.

2013-07-19 Thread Alvaro Herrera
Andres Freund wrote:

 On 2013-07-19 08:23:25 -0400, Robert Haas wrote:

  And I'd also propose getting rid
  of bgw_sighup and bgw_sigterm in both branches, while we're at it.
  AFAICT, they don't add any functionality, and they're basically
  unusable for dynamically started background workers.  Probably better
  not to get people to used to using them.
 
 I don't have a problem with getting rid of those, it's easy enough to
 register them inside the worker and it's safe since we start with
 blocked signals. I seem to remember some discussion about why they were
 added but I can't find a reference anymore. Alvaro, do you remember?

I left them there because it was easy; but they were absolutely
necessary only until we decided that we would start the worker's main
function with signals blocked.  I don't think there's any serious reason
not to remove them now.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

