Re: [HACKERS] possible memory leak with SRFs

2010-05-06 Thread Nikhil Sontakke
Hi,

>> Can someone please explain why we do not reset the expression context
>> if an SRF is involved during execution?
>
> Consider
>        srf(foo(col))
> where foo returns a pass-by-reference datatype.  Your proposed patch
> would cut the knees out from under argument values that the SRF could
> reasonably expect to still be there on subsequent calls.
>

Yeah, this is my basic confusion. But wouldn't the arguments be
evaluated afresh on the subsequent call to this SRF? In that case,
freeing up the context of the *last* call should not be an issue, I
would think.

And if this is indeed the case, we should be using a different,
longer-lived context and not the ecxt_per_tuple_memory context.

Regards,
Nikhils
-- 
http://www.enterprisedb.com



Re: [HACKERS] Adding xpath_exists function

2010-05-06 Thread Robert Haas
On Thu, May 6, 2010 at 5:53 PM, Mike Fowler wrote:
> Quoting Robert Haas :
>
>>
>> I'm not sure I understand how this is more convenient than just using
>> xpath() with exists()?
>>
>
> It will save a lot of complexity in WHERE clauses. For example using
> exists() in xpath() you might construct something like:
>
> WHERE array_dims(xpath('exists(/foo/bar)', ''::xml)) IS NOT NULL ...
>
> Whereas a dedicated xpath_exists() would look like:
>
> WHERE xpath_exists('/foo/bar',''::xml) 
>
> I accept this example is quite basic, but I hope it illustrates the added
> usability. I think xml in sql is complex enough, especially when you start
> considering namespaces, that anything we can do to simplify common use cases
> can only help improve the uptake of postgres xml.

Oh, I see.  Well, that might be reasonable syntactic sugar, although I
think you should make it wrap the path in exists() unconditionally,
rather than testing for an existing wrap.

Please email your patch to the list (replying to this email is fine)
and add it here:
https://commitfest.postgresql.org/action/commitfest_view/open

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] pg_stat_transaction patch

2010-05-06 Thread Takahiro Itagaki

Joel Jacobson  wrote:

> I propose a set of new statistics functions and system views.
> 
> I need these functions in order to do automated testing of our system,
> consisting of hundreds of stored procedures in plpgsql.
> My plan is to develop some additional functions to pgTAP, benefiting from
> the new system tables I've added.

I ported your patch to 9.0beta, but it doesn't work well.
I got two assertion failures from run.sql:

TRAP: FailedAssertion("!(entry->trans == ((void *)0))", File: "pgstat.c", Line: 715)
TRAP: FailedAssertion("!(tabstat->trans == trans)", File: "pgstat.c", Line: 1756)

Also, pg_stat_transaction_functions returned no rows from the test case even
after I removed those assertions. There are no rows in your test/run.out, either.

I like your idea itself, but more work is required on the implementation.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center



pg_stat_transaction-9.0beta.patch
Description: Binary data



Re: [HACKERS] Adding xpath_exists function

2010-05-06 Thread mike

Quoting Robert Haas:

> I'm not sure I understand how this is more convenient than just using
> xpath() with exists()?

It will save a lot of complexity in WHERE clauses. For example, using
exists() in xpath() you might construct something like:

WHERE array_dims(xpath('exists(/foo/bar)', ''::xml)) IS NOT NULL ...

Whereas a dedicated xpath_exists() would look like:

WHERE xpath_exists('/foo/bar', ''::xml)

I accept this example is quite basic, but I hope it illustrates the
added usability. I think XML in SQL is complex enough, especially when
you start considering namespaces, that anything we can do to simplify
common use cases can only help improve the uptake of postgres xml.


Thanks,

--
Mike Fowler




Re: [HACKERS] max_standby_delay considered harmful

2010-05-06 Thread Bruce Momjian
Greg Smith wrote:
> Bruce Momjian wrote:
> > Remember, delaying wal application just delays making the standby a
> > master and makes the slave data appear staler.  We can just tell people
> > that the larger their queries are, the larger this delay will be.  If
> > they want to control this, they can set 'statement_timeout' already.
> >   
> 
> While a useful defensive component, statement_timeout is a user setting, 
> so it can't provide guaranteed protection against a WAL application 
> denial of service from a long running query.  A user that overrides the 
> system setting and kicks off a long query puts you right back into 
> needing a timeout to ensure forward progress of standby replay.

The nice thing about query cancel is that it gives predictable behavior.
We could make statement_timeout unchangeable when it is set in
postgresql.conf.  Again, let's think about that for 9.1.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com



Re: [HACKERS] SELECT * in a CREATE VIEW statement doesn't update column set automatically

2010-05-06 Thread Merlin Moncure
On Thu, May 6, 2010 at 3:23 PM, Andrew Dunstan  wrote:
> And many places regard "select *" in anything other than throw-away queries
> as bad practice anyway. I have seen people get bitten by it over and over
> again, and I have worked at companies where it is explicitly forbidden in
> coding standards.

In terms of application queries I generally agree.  However, I think
this rule does not apply to server side definitions, especially in
regards to views and/or composite types.  There are cases where you
_want_ the view to be define as 'all fields of x'...In fact, it's
pretty typical IMNSHO.  It may be possible to expose this behavior.

I'd like to see:
select * from foo
  -- and --
select (foo).*
exhibit different behaviors -- ().* is more a type operator, returning
all the fields of foo, than a field list expression.  This gives us a
cool loophole to exploit for views that really want to be defined with
*:
create view particular_foos as select (foo).* from foo where something = true;
create view something_complex as select (foo).*, (func(foo.field)).*;
-- execute func() just one time please!

The something_complex case above is a real problem in how it behaves
currently -- sometimes without a hassle-free workaround.  Am I off my
rocker? :-) I've made this point many times (prob got annoying a long
time ago) but I'm curious if you guys agree...
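
To make the something_complex hazard concrete, a minimal sketch with a
hypothetical function:

CREATE FUNCTION func(i int, OUT a int, OUT b int) AS
  $$ SELECT i, i + 1 $$ LANGUAGE sql;
SELECT (func(1)).*;
-- today this expands to (func(1)).a, (func(1)).b, so func() is evaluated
-- once per output column instead of once per row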

merlin



Re: [HACKERS] max_standby_delay considered harmful

2010-05-06 Thread Greg Smith

Bruce Momjian wrote:
> Remember, delaying wal application just delays making the standby a
> master and makes the slave data appear staler.  We can just tell people
> that the larger their queries are, the larger this delay will be.  If
> they want to control this, they can set 'statement_timeout' already.

While a useful defensive component, statement_timeout is a user setting,
so it can't provide guaranteed protection against a WAL application
denial of service from a long running query.  A user that overrides the
system setting and kicks off a long query puts you right back into
needing a timeout to ensure forward progress of standby replay.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us




Re: [HACKERS] max_standby_delay considered harmful

2010-05-06 Thread Bruce Momjian
Josh Berkus wrote:
> All,
> 
> We are in Beta1, now, and it's May.  Original goal for 9.0 was
> June-July.  We cannot be introducing major design revisions to HS/SR at
> this date, or we won't ship until December.
> 
> There are at least 10 other major features in 9.0, some of which are
> more important to some of our users than HS/SR.  More importantly, I
> think the discussion on this thread makes it very clear that no matter
> how much discussion we have on standby delay, we are NOT going to get it
> right the first time.  That is, *even if* we replace Simon's code with
> something else, that something else will have as many issues for real
> users as the current delay does, especially since we won't even have
> started debugging or testing the new code yet.
> 
> So changing to a lock-based mechanism or designing a plugin interface
> are really not at all realistic at this date.
> 
> Realistically, we have two options at this point:
> 
> 1) reduce max_standby_delay to a boolean.

I suggest calling it 'delay_wal_application' or 'wal_query_cancel' or
something like that.

> 2) have a delay option (based either on WAL glob start time or on system
> time) like the current max_standby_delay, preferably with some bugs fixed.

I don't think releasing something that many of us can barely understand
is going to help.  I think Heikki is right that we might get feedback
from 9.0 that this setting isn't even useful.  If we can't get this
right, and it seems we can't, we should just push this to 9.1.

Remember, delaying wal application just delays making the standby a
master and makes the slave data appear staler.  We can just tell people
that the larger their queries are, the larger this delay will be.  If
they want to control this, they can set 'statement_timeout' already.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com



Re: [HACKERS] pg_migrator to /contrib in a later 9.0 beta

2010-05-06 Thread Dave Page
On Thu, May 6, 2010 at 8:10 PM, Thom Brown  wrote:

> You will call it pg_upgrade.  I have spoken.
> Thom

LOL.

-- 
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] pg_migrator to /contrib in a later 9.0 beta

2010-05-06 Thread Tom Lane
Bruce Momjian  writes:
> OK, seems people like pg_upgrade, but do we call it "pgupgrade" or
> "pg_upgrade"?

The latter.  The former is unreadable.

regards, tom lane



Re: [HACKERS] pg_migrator to /contrib in a later 9.0 beta

2010-05-06 Thread Thom Brown
On 6 May 2010 20:55, Bruce Momjian  wrote:

> Bruce Momjian wrote:
> > Tom Lane wrote:
> > > "Joshua D. Drake"  writes:
> > > > On Wed, 2010-05-05 at 20:24 -0400, Robert Haas wrote:
> > > >> I think it will be confusing if we change the name, so I vote to not
> > > >> change the name.
> > >
> > > > Actually, I would vote yes to change the name.
> > >
> > > I lean that way too.  If there were no history involved, we'd certainly
> > > prefer pg_upgrade to pg_migrator.
> >
> > Yeah, that was my feeling too.  People like "pg_upgrade", or something
> > else?  I will add some text like "pg_upgrade (formerly pg_migrator)" in
> > the docs.
>
> OK, seems people like pg_upgrade, but do we call it "pgupgrade" or
> "pg_upgrade"?  I don't see consistent naming in /contrib:
>
>pg_buffercache/
>pg_freespacemap/
>pg_standby/
>pg_stat_statements/
> pg_trgm/
>pgbench/
>pgcrypto/
>pgrowlocks/
>pgstattuple/
>
> The original 7.2 name was "pg_upgrade":
>
>
> http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/contrib/pg_upgrade/Attic/
>
> --
>
>
You will call it pg_upgrade.  I have spoken.

Thom


Re: [HACKERS] pg_migrator to /contrib in a later 9.0 beta

2010-05-06 Thread Bruce Momjian
Bruce Momjian wrote:
> Tom Lane wrote:
> > "Joshua D. Drake"  writes:
> > > On Wed, 2010-05-05 at 20:24 -0400, Robert Haas wrote:
> > >> I think it will be confusing if we change the name, so I vote to not
> > >> change the name.
> > 
> > > Actually, I would vote yes to change the name.
> > 
> > I lean that way too.  If there were no history involved, we'd certainly
> > prefer pg_upgrade to pg_migrator.
> 
> Yeah, that was my feeling too.  People like "pg_upgrade", or something
> else?  I will add some text like "pg_upgrade (formerly pg_migrator)" in
> the docs.

OK, seems people like pg_upgrade, but do we call it "pgupgrade" or
"pg_upgrade"?  I don't see consistent naming in /contrib:

pg_buffercache/
pg_freespacemap/
pg_standby/
pg_stat_statements/
pg_trgm/
pgbench/
pgcrypto/
pgrowlocks/
pgstattuple/

The original 7.2 name was "pg_upgrade":

http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/contrib/pg_upgrade/Attic/

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com



[HACKERS] PATCH: Minor notes in CLUSTER page

2010-05-06 Thread Andy Lester
I was looking for how to undo a CLUSTER call earlier today.  Nothing on
the CLUSTER page told me how to do it, or pointed me to the ALTER TABLE
page.  I figure a pointer to it would help the next person in my situation.
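
For example (hypothetical table and index names):

CLUSTER mytable USING mytable_pkey;         -- records mytable_pkey as the clustering index
ALTER TABLE mytable SET WITHOUT CLUSTER;    -- clears that recorded flag again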

xoxo,
Andy



-- 
Andy Lester => a...@petdance.com => www.petdance.com => AIM:petdance
diff --git a/doc/src/sgml/ref/cluster.sgml b/doc/src/sgml/ref/cluster.sgml
index 662de36..7afe72a 100644
--- a/doc/src/sgml/ref/cluster.sgml
+++ b/doc/src/sgml/ref/cluster.sgml
@@ -176,6 +176,15 @@ CREATE TABLE newtable AS
 temporary file about the same size as the table itself, so peak disk usage
 is about three times the table size instead of twice the table size.

+
+   
+To remove the internal flag that notes that an index has been clustered,
+use ALTER TABLE.
+
+ALTER TABLE table SET WITHOUT CLUSTER;
+
+   
+
  
 
  



Re: [HACKERS] pg_migrator to /contrib in a later 9.0 beta

2010-05-06 Thread Bruce Momjian
Tom Lane wrote:
> Alvaro Herrera  writes:
> > Excerpts from Bruce Momjian's message of jue may 06 11:19:27 -0400 2010:
> >> It seems copying over pg_statistic would require preservation of
> >> pg_class.oid.  Right now we only preserve pg_class.relfilenode.
> 
> > That could be fixed easily by creating a throwaway table which included the
> > qualified table name instead of the OID, and reloading it into pg_statistic
> > after the migration.
> 
> Yeah.  Casting to and from regclass would do the trick too.

I will add this idea to the pg_migrator TODO file.  I didn't bother with
this in the 8.3 -> 8.4 migration because we changed the default number
of statistics collected in 8.4.

This would be a 9.1 item.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com



[HACKERS] PATCH: Minor doc addition to CLUSTER page

2010-05-06 Thread Andy Lester
I couldn't figure out how to get rid of the CLUSTER flag on an index.
Once I got the answer from IRC, I wrote this patch to point future users
to the answer.

xoxo,
Andy



-- 
Andy Lester => a...@petdance.com => www.petdance.com => AIM:petdance
diff --git a/doc/src/sgml/ref/cluster.sgml b/doc/src/sgml/ref/cluster.sgml
index 662de36..7afe72a 100644
--- a/doc/src/sgml/ref/cluster.sgml
+++ b/doc/src/sgml/ref/cluster.sgml
@@ -176,6 +176,15 @@ CREATE TABLE newtable AS
 temporary file about the same size as the table itself, so peak disk usage
 is about three times the table size instead of twice the table size.

+
+   
+To remove the internal flag that notes that an index has been clustered,
+use ALTER TABLE.
+
+ALTER TABLE table SET WITHOUT CLUSTER;
+
+   
+
  
 
  



Re: [HACKERS] SELECT * in a CREATE VIEW statement doesn't update column set automatically

2010-05-06 Thread Andrew Dunstan



Tom Lane wrote:
> Joseph Adams  writes:
>> This isn't exactly a bug, but it could be considered unintuitive
>> behavior.
>
> It's required by the SQL standard.

And many places regard "select *" in anything other than throw-away
queries as bad practice anyway. I have seen people get bitten by it over
and over again, and I have worked at companies where it is explicitly
forbidden in coding standards.


cheers

andrew



Re: [HACKERS] SELECT * in a CREATE VIEW statement doesn't update column set automatically

2010-05-06 Thread Merlin Moncure
On Thu, May 6, 2010 at 3:01 PM, Joseph Adams  wrote:
> This isn't exactly a bug, but it could be considered unintuitive
> behavior.  Consider this:

by unintuitive you mean: 'explicitly defined in the SQL standard' :-).
I happen to agree with you but that's irrelevant.  If you absolutely
require this, use the composite type workaround.
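
A minimal sketch of that composite-type workaround (hypothetical names; the
point is that the whole-row column tracks columns added later):

CREATE TABLE foo (a int);
CREATE VIEW foo_v AS SELECT f AS foo FROM foo f;   -- one composite column
ALTER TABLE foo ADD COLUMN b int;
SELECT (foo).* FROM foo_v;                         -- expands to a and b at query time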

merlin



Re: [HACKERS] SELECT * in a CREATE VIEW statement doesn't update column set automatically

2010-05-06 Thread Tom Lane
Joseph Adams  writes:
> This isn't exactly a bug, but it could be considered unintuitive
> behavior.

It's required by the SQL standard.

regards, tom lane



Re: [HACKERS] max_standby_delay considered harmful

2010-05-06 Thread Josh Berkus
All,

We are in Beta1, now, and it's May.  Original goal for 9.0 was
June-July.  We cannot be introducing major design revisions to HS/SR at
this date, or we won't ship until December.

There are at least 10 other major features in 9.0, some of which are
more important to some of our users than HS/SR.  More importantly, I
think the discussion on this thread makes it very clear that no matter
how much discussion we have on standby delay, we are NOT going to get it
right the first time.  That is, *even if* we replace Simon's code with
something else, that something else will have as many issues for real
users as the current delay does, especially since we won't even have
started debugging or testing the new code yet.

So changing to a lock-based mechanism or designing a plugin interface
are really not at all realistic at this date.

Realistically, we have two options at this point:

1) reduce max_standby_delay to a boolean.

2) have a delay option (based either on WAL glob start time or on system
time) like the current max_standby_delay, preferably with some bugs fixed.

If we do (1), we'll be having this discussion all over again in
September, and will be no better off because we won't have any
production feedback on Simon's approach.  If we do (2) we can hedge it
in the documentation with requirements and cautions, and hopefully only
dedicated DBAs will touch it, and we'll get good feedback from them on
how we should redesign it for 9.1.  And if it works as badly as Tom
expects, then we won't have an issue with maintaining backwards
compatibility, because people will be *happy* to change.

One way to communicate this would be to have 2 GUCs instead of one:
allow_query_cancel = on|off # defaults to on
max_standby_timeout = 0 # SEE DOCS BEFORE CHANGING

We named this release 9.0 because, among other things, we expected it to
be less stable than the prior 3 releases.  And we can continue to tell
users that.  I know I won't be moving any of my clients to 9.0.0.

I said it before and I'll say it again: "release early, release often".

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



[HACKERS] SELECT * in a CREATE VIEW statement doesn't update column set automatically

2010-05-06 Thread Joseph Adams
This isn't exactly a bug, but it could be considered unintuitive
behavior.  Consider this:

CREATE VIEW foo AS SELECT * FROM a;
CREATE VIEW foo_v AS SELECT * FROM foo;
ALTER TABLE foo ADD COLUMN b INT;

The ALTER TABLE statement affects VIEW foo, but the column addition
does not propagate to VIEW foo_v.  Thus, it makes this deceptive:

... AS SELECT * FROM foo;

I ran into this with an application where a real table is accessed if
the user is an "admin", while regular users access a view instead.  I
considered "AS SELECT * FROM foo" to be a promise that all columns
from foo would be included in the view, but the promise is broken when
ADD COLUMN is applied later on.

Would it be a desirable feature to make `CREATE VIEW foo_v AS SELECT *
FROM foo;` automatically update the column set when foo's columns
change?  Instead of the wildcard * being expanded once at CREATE VIEW
time, it would (semantically) be expanded every time foo_v is selected
on.
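
A quick way to see that the wildcard really is expanded once, at CREATE VIEW
time (hypothetical names, just a sketch):

CREATE TABLE a (x int);
CREATE VIEW v AS SELECT * FROM a;
ALTER TABLE a ADD COLUMN y int;
SELECT pg_get_viewdef('v'::regclass, true);   -- still references only a.x
SELECT * FROM v;                              -- still returns only column x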



Re: [HACKERS] max_standby_delay considered harmful

2010-05-06 Thread Robert Haas
On Thu, May 6, 2010 at 2:47 PM, Josh Berkus  wrote:
>
>> Now that I've realized what the real problem is with max_standby_delay
>> (namely, that inactivity on the master can use up the delay), I think
>> we should do what Tom originally suggested here.  It's not as good as
>> a really working max_standby_delay, but we're not going to have that
>> for 9.0, and it's clearly better than a boolean.
>
> I guess I'm not clear on how what Tom proposed is fundamentally
> different from max_standby_delay = -1.  If there's enough concurrent
> queries, recovery would never catch up.

If your workload is that the standby server is getting pounded with
queries like crazy, then it's probably not that different: it will
fall progressively further behind.  But I suspect many people will set
up standby servers where most of the activity happens on the primary,
but they run some reporting queries on the standby.  If you expect
your reporting queries to finish in <10s, you could set the max delay
to say 60s.  In the event that something gets wedged, recovery will
eventually kill it and move on rather than just getting stuck forever.
 If the volume of queries is known not to be too high, it's reasonable
to expect that a few good whacks will be enough to get things back on
track.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] max_standby_delay considered harmful

2010-05-06 Thread Josh Berkus

> Now that I've realized what the real problem is with max_standby_delay
> (namely, that inactivity on the master can use up the delay), I think
> we should do what Tom originally suggested here.  It's not as good as
> a really working max_standby_delay, but we're not going to have that
> for 9.0, and it's clearly better than a boolean.

I guess I'm not clear on how what Tom proposed is fundamentally
different from max_standby_delay = -1.  If there are enough concurrent
queries, recovery would never catch up.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [HACKERS] SQLSTATE for Hot Standby cancellation

2010-05-06 Thread Kevin Grittner
Simon Riggs  wrote:
 
> We don't have a retryable SQLSTATE, so perhaps we should document
> that serialization_failure and deadlock_detected are both
> retryable error conditions. As the above implies HS can throw some
> errors that are non-retryable, so we use ERRCODE_ADMIN_SHUTDOWN.
 
There is one SQLSTATE (40001 - serialization_failure) to indicate
that the transaction failed due to conflicts with concurrent
transactions.  This is supposed to be used when there was no problem
with the transaction itself, were it to be run when nothing else is
running and the environment isn't failing.  (For example, it isn't
the right code to use if you fail because your network connection is
broken or you're running out of disk space.)  In my view it is a
mistake *not* to use it whenever concurrent transactions cause
failure; we can always use different error message text, details,
and hints to differentiate between the particular ways in which
concurrent transactions caused the failure.
 
The advantage of consistently using the standard value for this is
that there is software which recognizes this and automatically
retries the transaction from the beginning in a manner which is
transparent to the users and the application code.  Our users never
see anything but a delay when hitting this; it's effectively the
same as blocking, with no application programmer effort needed.
 
Having a separate deadlock_detected SQLSTATE is arguably a bug.  (In
fact, I have argued that; but haven't persuaded the community of
it.)  Coming up with yet another SQLSTATE to indicate that there's
nothing wrong with the transaction except its timing in relation to
other transactions would, in my view, compound the error.
 
That said, our shop is resourceful enough (in more than one sense)
to overcome this if need be.  I just don't see why the community
would choose to create barriers which cause failures for software
which would otherwise Just Work, and the corresponding programming
burden on shops using PostgreSQL to keep a list of "nothing really
wrong with the transaction, you can probably just reschedule it"
SQLSTATE values for the product.
 
-Kevin



Re: [HACKERS] Partitioning/inherited tables vs FKs

2010-05-06 Thread Tom Lane
Florian Pflug  writes:
> Anyway, I was wondering why we need guaranteed uniqueness for FK
> relationships anyway.

It's required by spec, and the semantics aren't terribly sensible
without it.

regards, tom lane



Re: [HACKERS] pg_stat_transaction patch

2010-05-06 Thread Alvaro Herrera
Excerpts from Joel Jacobson's message of jue may 06 09:51:41 -0400 2010:
> Hi,
> 
> I propose a set of new statistics functions and system views.

Hi,

Please add your patch to the next commitfest: http://commitfest.postgresql.org

Thanks
-- 



Re: [HACKERS] Partitioning/inherited tables vs FKs

2010-05-06 Thread Florian Pflug
On May 6, 2010, at 16:38 , Tom Lane wrote:
> Florian Pflug  writes:
>> What lies at the heart of this problem is the lack of multi-table
>> indices and hence multi-table unique constraints in postgres. AFAIK
>> with those in place the rest amounts to the removal of ONLY from the
>> constraint check queries plus some code to propagate constraint
>> triggers to child tables.
> 
> Well, the lack of multi-table indexes certainly is the heart of the
> problem, but I'm not sure that inventing such a thing is the solution.
> Quite aside from the implementation difficulties involved in it,
> doing things that way would destroy some of the major reasons to
> partition tables at all:
> 
> * the index grows as the size of the total data set, it's not limited
> by partition size
> 
> * can't cheaply drop one partition any more, you have to vacuum the
> (big) index first
> 
> * probably some other things I'm not thinking of at the moment.
> 
> I think the real solution is to upgrade the partitioning infrastructure
> so that we can understand that columns are unique across the whole
> partitioned table, when the partitioning is done on that column and each
> partition has a unique index.

True, for partitioned tables multi-table indices reintroduce some of the 
performance problems that partitioning is supposed to avoid.

But OTOH if you use table inheritance as a means to map data models (e.g. EER) 
more naturally to SQL, then multi-table indices have advantages over the 
partitioning-friendly solution you sketched above.

With a multi-table index, SELECT * FROM PARENT WHERE ID=?? has complexity 
LOG(N*M) where M is the number of tables inheriting from PARENT (including 
PARENT itself), and N the average number of rows in these tables. With one 
index per child, the complexity is M*LOG(N) which is significantly higher if M 
is large. Constraint exclusion could reduce that to LOG(N), but only if each 
child has its own private ID range, which precludes ID assignment from a 
global sequence and hence makes ID assignment much more complex and error-prone.
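
(To put rough numbers on that difference: with M = 100 children holding N =
1,000,000 rows each, LOG(N*M) is about 27 comparisons per lookup in base 2,
while M*LOG(N) is about 100 * 20 = 2000.)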

Anyway, I was wondering why we need guaranteed uniqueness for FK relationships 
anyway. Because if we don't (which I didn't check prior to posting this I must 
admit), then why can't we simply remove the "ONLY" from the RI queries and let 
ALTER TABLE attach the RI triggers not only to the parent but also to all 
children. What am I missing?

best regards,
Florian Pflug





Re: [HACKERS] pg_migrator to /contrib in a later 9.0 beta

2010-05-06 Thread Tom Lane
Alvaro Herrera  writes:
> Excerpts from Bruce Momjian's message of jue may 06 11:19:27 -0400 2010:
>> It seems copying over pg_statistic would require preservation of
>> pg_class.oid.  Right now we only preserve pg_class.relfilenode.

> That could be fixed easily by creating a throwaway table which included the
> qualified table name instead of the OID, and reloading it into pg_statistic
> after the migration.

Yeah.  Casting to and from regclass would do the trick too.
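
Illustrating the round trip (hypothetical table name and OID):

SELECT 'public.accounts'::regclass::oid;   -- qualified name -> OID, in the new cluster
SELECT 16384::oid::regclass;               -- OID -> qualified name, in the old cluster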

regards, tom lane



Re: [HACKERS] Adding xpath_exists function

2010-05-06 Thread Robert Haas
On Thu, May 6, 2010 at 10:10 AM, Mike Fowler  wrote:
> Hi hackers,
>
> Although this is a very small change I figured I'd practice the policy of
> outlining your change before you write the code and attempt a patch
> submission. Essentially I see the function as a convenience function that
> exposes the results of the xpath built in exists() function as a boolean for
> easier SQL. The syntax will match the xpath function already present:
>
> xpath_exists(xpath, xml[, nsarray])
>
> The implementation will check that the xpath value contains 'exists(.)'
> and add it if not present for added usability. I can't blindly wrap the
> value with exists(...) as exists(exists(.)) will always return true in
> xpath. I've not allocated the oid yet, but will try to earn my bonus points
> for getting it close to xpath()'s oid. :) Appropriate regression tests will
> be added after I've ensured that 'make check' is happy I've not regressed
> anything.

I'm not sure I understand how this is more convenient than just using
xpath() with exists()?

> Can I confirm that contrib/xml2 is deprecated and I should be carrying out
> my work in backend/utils/adt/xml.c?

Yes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] max_standby_delay considered harmful

2010-05-06 Thread Robert Haas
On Mon, May 3, 2010 at 11:37 AM, Tom Lane  wrote:
> I'm inclined to think that we should throw away all this logic and just
> have the slave cancel competing queries if the replay process waits
> more than max_standby_delay seconds to acquire a lock.  This is simple,
> understandable, and behaves the same whether we're reading live data or
> not.

Now that I've realized what the real problem is with max_standby_delay
(namely, that inactivity on the master can use up the delay), I think
we should do what Tom originally suggested here.  It's not as good as
a really working max_standby_delay, but we're not going to have that
for 9.0, and it's clearly better than a boolean.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] pg_migrator to /contrib in a later 9.0 beta

2010-05-06 Thread Alvaro Herrera
Excerpts from Bruce Momjian's message of jue may 06 11:19:27 -0400 2010:

> It seems copying over pg_statistic would require preservation of
> pg_class.oid.  Right now we only preserve pg_class.relfilenode.

That could be fixed easily by creating a throwaway table which included the
qualified table name instead of the OID, and reloading it into pg_statistic
after the migration.

Perhaps this could be done as a regular task in the old database before
bringing the server down.
-- 



Re: [HACKERS] pg_migrator to /contrib in a later 9.0 beta

2010-05-06 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian  writes:
> > Jesper Krogh wrote:
> >> I should have written:
> >> Why isn't statistics copied over or why doesn't pg_migrator run analyze by
> >> itself?
> 
> > Yeah, the statistics are part of the system tables, and system tables
> > are fully handled by pg_dumpall --schema-only (except for statistics). 
> > There might be changes in the system table statistics format that would
> > break if pg_migrator tried to migrate the statistics.
> 
> Right, it's not really practical for pg_migrator to just copy over the
> statistics without any intelligence.  We might at some time choose to
> teach it which stats could be copied safely, but that hardly seems like
> something to do in version 1.0 --- and anyway it could not be a 100%
> solution.

It seems copying over pg_statistic would require preservation of
pg_class.oid.  Right now we only preserve pg_class.relfilenode.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com



Re: [HACKERS] pg_migrator to /contrib in a later 9.0 beta

2010-05-06 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian  writes:
> > Jesper Krogh wrote:
> >> I should have written:
> >> Why isn't statistics copied over or why doesn't pg_migrator run analyze by
> >> itself?
> 
> > Yeah, the statistics are part of the system tables, and system tables
> > are fully handled by pg_dumpall --schema-only (except for statistics). 
> > There might be changes in the system table statistics format that would
> > break if pg_migrator tried to migrate the statistics.
> 
> Right, it's not really practical for pg_migrator to just copy over the
> statistics without any intelligence.  We might at some time choose to
> teach it which stats could be copied safely, but that hardly seems like
> something to do in version 1.0 --- and anyway it could not be a 100%
> solution.
> 
> > And if pg_migrator ran analyze itself, it would greatly increase its
> > great migration times!
> 
> Exactly.  It's not a win for pg_migrator to not give you back control of
> the database until everything has been ANALYZEd.  That's a task that can
> likely be done in background.

I have to keep those sub-minute migration times for bragging rights. :-)

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com



Re: [HACKERS] max_standby_delay considered harmful

2010-05-06 Thread Simon Riggs
On Thu, 2010-05-06 at 17:39 +0300, Heikki Linnakangas wrote:

> Not the same plugin. A hook for stop/resume would need to be called
> before and/or after each record, the one for conflict resolution would
> need to be called at each conflict. Designing a good interface for a
> plugin is hard, you need at least a couple of samples ideas for plugins
> that would use the hook, before you know the interface is flexible enough.

* current behaviour of max_standby_delay
* pause for X time if conflict, then cancel - which is the suggested
behaviour upthread
* pause-if-conflict, explicit resume needed

Info passed to plugin
* conflict type
* relfilenode
* latestRemovedXid

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] max_standby_delay considered harmful

2010-05-06 Thread Heikki Linnakangas
Simon Riggs wrote:
> On Thu, 2010-05-06 at 16:09 +0200, Dimitri Fontaine wrote:
>> Simon Riggs  writes:
>>> It would be easier to implement a conflict resolution plugin that is
>>> called when a conflict occurs, allowing users to have a customisable
>>> mechanism. Again, I have no objection to that proposal.
>> To implement, if you say so, no doubt. To use, that means you need to
>> install a contrib module after validation that the trade offs there are
>> the one you're interested into, or you have to code it yourself. In C.
>>
>> I don't see that as an improvement over what we have now. Our main
>> problem seems to be the documentation of the max_standby_delay, where we
>> give the impression it's doing things the code can not do. IIUC.
> 
> I meant "easier to implement than what Florian suggested".
> 
> The plugin would also allow you to have the pause/resume capability.

Not the same plugin. A hook for stop/resume would need to be called
before and/or after each record, the one for conflict resolution would
need to be called at each conflict. Designing a good interface for a
plugin is hard, you need at least a couple of sample ideas for plugins
that would use the hook, before you know the interface is flexible enough.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] LD_LIBRARY_PATH versus rpath

2010-05-06 Thread Peter Eisentraut
On tor, 2010-05-06 at 09:38 -0400, Tom Lane wrote:
> Funny point here: in the Fedora/RHEL RPMs, I use --disable-rpath
> because "don't use RPATH" is part of the standard packaging guidelines
> for that platform.  However, pl/perl has to double back and use rpath
> anyway because libperl.so doesn't exist in the ldconfig path; it's in
> some version-numbered directory and they don't provide any link or
> ldconfig entry so you could find it otherwise.  Annoying as heck.
> I've always wondered how many other packagers have to carry patches
> similar to
> http://cvs.fedoraproject.org/viewvc/rpms/postgresql/devel/postgresql-perl-rpath.patch

Debian has libperl in /usr/lib, so there is no issue.  But if there
were, there is a relatively new policy that you can/should use rpath if
you need a library that is installed in a nonstandard path.  (Should
actually use this new runpath thing, perhaps.)  The same new policy
prohibits packages from modifying /etc/ld.so.conf.





Re: [HACKERS] Partitioning/inherited tables vs FKs

2010-05-06 Thread Tom Lane
Florian Pflug  writes:
> What lies at the heart of this problem is the lack of multi-table
> indices and hence multi-table unique constraints in postgres. AFAIK
> with those in place the rest amounts to the removal of ONLY from the
> constraint check queries plus some code to propagate constraint
> triggers to child tables.

Well, the lack of multi-table indexes certainly is the heart of the
problem, but I'm not sure that inventing such a thing is the solution.
Quite aside from the implementation difficulties involved in it,
doing things that way would destroy some of the major reasons to
partition tables at all:

* the index grows as the size of the total data set, it's not limited
by partition size

* can't cheaply drop one partition any more, you have to vacuum the
(big) index first

* probably some other things I'm not thinking of at the moment.

I think the real solution is to upgrade the partitioning infrastructure
so that we can understand that columns are unique across the whole
partitioned table, when the partitioning is done on that column and each
partition has a unique index.
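
Roughly, a sketch with today's inheritance-based partitioning (hypothetical
names and ranges):

CREATE TABLE t (id int);
CREATE TABLE t_low  (CHECK (id <  1000000)) INHERITS (t);
CREATE TABLE t_high (CHECK (id >= 1000000)) INHERITS (t);
CREATE UNIQUE INDEX t_low_id  ON t_low (id);
CREATE UNIQUE INDEX t_high_id ON t_high (id);
-- id is unique within each child, and the non-overlapping CHECK constraints
-- make it unique across the whole set; the system just doesn't know that yet.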

regards, tom lane



Re: [HACKERS] possible memory leak with SRFs

2010-05-06 Thread Tom Lane
Nikhil Sontakke  writes:
> Can someone please explain why we do not reset the expression context
> if an SRF is involved during execution?

Consider
srf(foo(col))
where foo returns a pass-by-reference datatype.  Your proposed patch
would cut the knees out from under argument values that the SRF could
reasonably expect to still be there on subsequent calls.
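
For instance, in something like (a hypothetical query, not from the report)

SELECT regexp_split_to_table(lower(t.name), ' ') FROM t;

lower(t.name) yields a pass-by-reference text datum that is evaluated once and
then reused by the SRF across its repeated calls for the same input row;
resetting the per-tuple context between those calls would free it out from
under the SRF.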

regards, tom lane



Re: [HACKERS] max_standby_delay considered harmful

2010-05-06 Thread Simon Riggs
On Thu, 2010-05-06 at 16:09 +0200, Dimitri Fontaine wrote:
> Simon Riggs  writes:
> > It would be easier to implement a conflict resolution plugin that is
> > called when a conflict occurs, allowing users to have a customisable
> > mechanism. Again, I have no objection to that proposal.
> 
> To implement, if you say so, no doubt. To use, that means you need to
> install a contrib module after validation that the trade offs there are
> the one you're interested into, or you have to code it yourself. In C.
> 
> I don't see that as an improvement over what we have now. Our main
> problem seems to be the documentation of the max_standby_delay, where we
> give the impression it's doing things the code can not do. IIUC.

I meant "easier to implement than what Florian suggested".

The plugin would also allow you to have the pause/resume capability.

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] pg_migrator to /contrib in a later 9.0 beta

2010-05-06 Thread Tom Lane
Bruce Momjian  writes:
> Jesper Krogh wrote:
>> I should have written:
>> Why isn't statistics copied over or why doesn't pg_migrator run analyze by
>> itself?

> Yeah, the statistics are part of the system tables, and system tables
> are fully handled by pg_dumpall --schema-only (except for statistics). 
> There might be changes in the system table statistics format that would
> break if pg_migrator tried to migrate the statistics.

Right, it's not really practical for pg_migrator to just copy over the
statistics without any intelligence.  We might at some time choose to
teach it which stats could be copied safely, but that hardly seems like
something to do in version 1.0 --- and anyway it could not be a 100%
solution.

> And if pg_migrator ran analyze itself, it would greatly increase its
> great migration times!

Exactly.  It's not a win for pg_migrator to not give you back control of
the database until everything has been ANALYZEd.  That's a task that can
likely be done in background.

regards, tom lane



[HACKERS] Adding xpath_exists function

2010-05-06 Thread Mike Fowler

Hi hackers,

Although this is a very small change I figured I'd practice the policy 
of outlining your change before you write the code and attempt a patch 
submission. Essentially I see the function as a convenience function 
that exposes the results of the xpath built in exists() function as a 
boolean for easier SQL. The syntax will match the xpath function already 
present:


xpath_exists(xpath, xml[, nsarray])

The implementation will check that the xpath value contains 
'exists(.)' and add it if not present for added usability. I can't 
blindly wrap the value with exists(...) as exists(exists(.)) will 
always return true in xpath. I've not allocated the oid yet, but will 
try to earn my bonus points for getting it close to xpath()'s oid. :) 
Appropriate regression tests will be added after I've ensured that 'make 
check' is happy I've not regressed anything.


Can I confirm that contrib/xml2 is deprecated and I should be carrying 
out my work in backend/utils/adt/xml.c?


I shall be coding it up over the next day or two, work & family permitting!

Thanks,

--
Mike Fowler
Registered Linux user: 379787




Re: [HACKERS] max_standby_delay considered harmful

2010-05-06 Thread Dimitri Fontaine
Simon Riggs  writes:
> It would be easier to implement a conflict resolution plugin that is
> called when a conflict occurs, allowing users to have a customisable
> mechanism. Again, I have no objection to that proposal.

To implement, if you say so, no doubt. To use, that means you need to
install a contrib module after validation that the trade offs there are
the one you're interested into, or you have to code it yourself. In C.

I don't see that as an improvement over what we have now. Our main
problem seems to be the documentation of the max_standby_delay, where we
give the impression it's doing things the code can not do. IIUC.

Regards,
-- 
dim



[HACKERS] SQLSTATE for Hot Standby cancellation

2010-05-06 Thread Simon Riggs
On Thu, 2010-05-06 at 12:08 +0200, Yeb Havinga wrote:

> That's funny because when I was reading this thread, I was thinking the 
> exact opposite: having max_standby_delay always set to 0 so I know the 
> standby server is as up-to-date as possible. The application that 
> accesses the hot standby has to be 'special' anyway because it might 
> deliver not-up-to-date data. If that information about specialties 
> regarding querying the standby server includes the warning that queries 
> might get cancelled, they can opt for a retry themselves (is there a 
> special return code to catch that case? like PGRES_RETRY_LATER) or a 
> message to the user that their report is currently unavailable and they 
> should retry in a few minutes.

Very sensible. Kevin Grittner already asked for that and I already
agreed, yet it has not been implemented that way
http://archives.postgresql.org/pgsql-hackers/2008-12/msg01691.php

Can anyone remember a specific objection to that 'cos it still sounds
very sensible to me and is a quick, low impact change.

Currently the SQLSTATE is ERRCODE_ADMIN_SHUTDOWN or
ERRCODE_QUERY_CANCELED if not idle. That makes it even harder to
determine the retryability, since it can come back in one of two states.

Proposed SQLSTATE for HS cancellation is 
    case PROCSIG_RECOVERY_CONFLICT_BUFFERPIN:
    case PROCSIG_RECOVERY_CONFLICT_LOCK:
    case PROCSIG_RECOVERY_CONFLICT_SNAPSHOT:
    case PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK:
        recovery_conflict_errcode = ERRCODE_T_R_SERIALIZATION_FAILURE;
        break;
    case PROCSIG_RECOVERY_CONFLICT_DATABASE:
    case PROCSIG_RECOVERY_CONFLICT_TABLESPACE:
        recovery_conflict_errcode = ERRCODE_ADMIN_SHUTDOWN;
        break;

We don't have a retryable SQLSTATE, so perhaps we should document that
serialization_failure and deadlock_detected are both retryable error
conditions. As the above implies, HS can throw some errors that are
non-retryable, so we use ERRCODE_ADMIN_SHUTDOWN.

Comments?

-- 
 Simon Riggs   www.2ndQuadrant.com




[HACKERS] pg_stat_transaction patch

2010-05-06 Thread Joel Jacobson
Hi,

I propose a set of new statistics functions and system views.

I need these functions in order to do automated testing of our system,
consisting of hundreds of stored procedures in plpgsql.
My plan is to develop some additional functions to pgTAP, benefiting from
the new system tables I've added.

The patch should apply to 9.0beta or HEAD, but I created it using 8.4.3
because that's the version I'm using.

I'm thankful for your feedback.

My apologies if the packaging of the patch does not conform to your
guidelines, feedback on this is also welcome.

-- 
Best regards,

Joel Jacobson
Glue Finance

E: j...@gluefinance.com
T: +46 70 360 38 01

Postal address:
Glue Finance AB
Box  549
114 11  Stockholm
Sweden

Visiting address:
Glue Finance AB
Birger Jarlsgatan 14
114 34 Stockholm
Sweden


README:

Background
==
The views pg_stat_user_tables and pg_stat_user_functions show statistics on
tables and functions.
The underlying functions named pg_stat_get_* fetches recent data from the
statistics collector, and returns the requested value for the given "oid"
(i.e. "tableid/relationid" or "functionid").
At the end of each transaction[1], the collected statistics are sent to the
statistics collector[2].

[1] upon COMMIT/ROLLBACK, or a bit later (the report frequency is controlled
by the PGSTAT_STAT_INTERVAL setting, default value is 500 ms)
[2] if you do a ps aux, it is the process named "postgres: stats collector
process"

Problem
===
Within the current transaction, there was no way of accessing the internal
data structures which contain the statistics collected so far.
I wanted to check exactly what data changes my functions made and what
functions they called, without having to commit the transaction
and without mixing the statistics data with all the other simultaneously
running transactions.

Solution

I have exported accessor methods for the internal data structure
containing the statistics collected so far for the current transaction.

I have also exported the method pgstat_report_stat to make it possible to
force a "report and reset" of the so far collected statistics.
This was necessary to avoid not-yet-reported statistics for a previous
transaction to affect the current transaction.

I used the unused_oids script to find unused oids and chose the range
3030-3044 for the new functions.
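
A hypothetical usage sketch of what this enables (table name made up; the
functions come from this patch, not from core):

BEGIN;
SELECT count(*) FROM mytable;
-- scans of mytable so far in this (still uncommitted) transaction:
SELECT pg_stat_get_transaction_numscans('mytable'::regclass);
COMMIT;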

Functions
=========
test=# \df+ pg_catalog.pg_stat_get_transaction_*

                                                                List of functions
   Schema   |                    Name                     | Result data type | Argument data types |  Type  | Volatility | Owner | Language |                 Source code                 |  Description
------------+---------------------------------------------+------------------+---------------------+--------+------------+-------+----------+---------------------------------------------+---------------------------------------------------------------------------
 pg_catalog | pg_stat_get_transaction_blocks_fetched      | bigint           | oid                 | normal | stable     | joel  | internal | pg_stat_get_transaction_blocks_fetched      | statistics: number of blocks fetched in current transaction
 pg_catalog | pg_stat_get_transaction_blocks_hit          | bigint           | oid                 | normal | stable     | joel  | internal | pg_stat_get_transaction_blocks_hit          | statistics: number of blocks found in cache in current transaction
 pg_catalog | pg_stat_get_transaction_dead_tuples         | bigint           | oid                 | normal | stable     | joel  | internal | pg_stat_get_transaction_dead_tuples         | statistics: number of dead tuples in current transaction
 pg_catalog | pg_stat_get_transaction_function_calls      | bigint           | oid                 | normal | stable     | joel  | internal | pg_stat_get_transaction_function_calls      | statistics: number of function calls in current transaction
 pg_catalog | pg_stat_get_transaction_function_self_time  | bigint           | oid                 | normal | stable     | joel  | internal | pg_stat_get_transaction_function_self_time  | statistics: self execution time of function in current transaction
 pg_catalog | pg_stat_get_transaction_function_time       | bigint           | oid                 | normal | stable     | joel  | internal | pg_stat_get_transaction_function_time       | statistics: execution time of function in current transaction
 pg_catalog | pg_stat_get_transaction_live_tuples         | bigint           | oid                 | normal | stable     | joel  | internal | pg_stat_get_transaction_live_tuples         | statistics: number of live tuples in current transaction
 pg_catalog | pg_stat_get_transaction_numscans            | bigint           | oid                 | normal | stable     | joel  | internal | pg_stat_get_transaction_numscans            | statistics: number of scans done for table/index in current transaction
 pg_catalog | pg_stat_get_transaction_tuples_deleted      | bigint           | oid                 | normal | stable     | joel  | i

Re: [HACKERS] LD_LIBRARY_PATH versus rpath

2010-05-06 Thread Tom Lane
Andrew Dunstan  writes:
> Greg Stark wrote
>> We only set RPATH if the install location isn't part of the default ld
>> library path specified by /etc/ld.so.conf right? Setting it if it is
>> in the default path would be antisocial.

> How are we going to know at build time what is in the ld.soconf of the 
> installation machine?

Exactly.  In practice, it's on the packager's head to specify
--disable-rpath if he intends to install into the platform's
regular library search path.

Funny point here: in the Fedora/RHEL RPMs, I use --disable-rpath
because "don't use RPATH" is part of the standard packaging guidelines
for that platform.  However, pl/perl has to double back and use rpath
anyway because libperl.so doesn't exist in the ldconfig path; it's in
some version-numbered directory and they don't provide any link or
ldconfig entry so you could find it otherwise.  Annoying as heck.
I've always wondered how many other packagers have to carry patches
similar to
http://cvs.fedoraproject.org/viewvc/rpms/postgresql/devel/postgresql-perl-rpath.patch

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_migrator to /contrib in a later 9.0 beta

2010-05-06 Thread Bruce Momjian
Bruce Momjian wrote:
> > The database (of a reasonable size) is useless until statistics is 
> > available.
> > 
> > I guess it is because pg_dump/restore doesn't do it either.
> 
> Yeah, the statistics are part of the system tables, and system tables
> are fully handled by pg_dumpall --schema-only (except for statistics). 
> There might be changes in the system table statistics format that would
> break if pg_migrator tried to migrate the statistics.  Right now
> pg_migrator is immune from any system table changes, and I would like to
> keep it that way.
> 
> And if pg_migrator ran analyze itself, it would greatly increase its
> great migration times!

Forgot the :-).

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_migrator to /contrib in a later 9.0 beta

2010-05-06 Thread Bruce Momjian
Jesper Krogh wrote:
> On 2010-05-06 06:41, Alvaro Herrera wrote:
> > Excerpts from Jesper Krogh's message of jue may 06 00:32:09 -0400 2010:
> >
> >
> >> Q: I read you pdf, why isn't statistics copied over? It seems to be the 
> >> last
> >> part missing from doing an upgrade in a few minutes.
> >>  
> > Seems fraught with peril, and a bit pointless.  What's so bad about having 
> > to
> > run ANALYZE afterwards?
> >
> 
> There is nothing directly "bad" about it.. but:
> 
> It's just "an extra step, that might be overseen and is absolutely 
> required".
> 
> I should have written:
> Why isn't statistics copied over or why doesnt pg_migrator run analyze by
> itself?
> 
> The database (of a reasonable size) is useless until statistics is 
> available.
> 
> I guess it is because pg_dump/restore doesn't do it either.

Yeah, the statistics are part of the system tables, and system tables
are fully handled by pg_dumpall --schema-only (except for statistics). 
There might be changes in the system table statistics format that would
break if pg_migrator tried to migrate the statistics.  Right now
pg_migrator is immune from any system table changes, and I would like to
keep it that way.

And if pg_migrator ran analyze itself, it would greatly increase its
great migration times!

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] max_standby_delay considered harmful

2010-05-06 Thread Greg Smith

Yeb Havinga wrote:

Rob Wultsch wrote:

I can not imagine setting this value to anything other than a bool and
most of the time that bool would be -1.
That's funny because when I was reading this thread, I was thinking 
the exact opposite: having max_standby_delay always set to 0 so I know 
the standby server is as up-to-date as possible.


If you ask one person about this, you'll discover they only consider one 
behavior here sane, and any other setting is crazy.  Ask five people, 
and you'll likely find someone who believes the complete opposite.  Ask 
ten and carefully work out the trade-offs they're willing to make given 
the fundamental limitations of replication, and you'll arrive at the 
range of behaviors available right now, plus some more that haven't been 
built yet.  There are a lot of different types of database applications 
out there, each with their own reliability and speed requirements to 
balance.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LD_LIBRARY_PATH versus rpath

2010-05-06 Thread Andrew Dunstan


Greg Stark wrote


We only set RPATH if the install location isn't part of the default ld
library path specified by /etc/ld.so.conf right? Setting it if it is
in the default path would be antisocial.

  


How are we going to know at build time what is in the ld.so.conf of the 
installation machine?


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partitioning/inherited tables vs FKs

2010-05-06 Thread Robert Haas
On Thu, May 6, 2010 at 6:37 AM, Jaime Casanova  wrote:
> i would call it a bug, but this is a known issue
>
>>
>> The only solution currently is that the referring table has to be
>> partitioned the same way as the referred table in the FK, and
>> its parent table has to be queried.
>>
>
> no, you can install a trigger on the child table that verifies the
> existence of the id on your partitioned parent table, the SELECT
> you'll use inside that trigger will look at the entire set of tables
> (as long as you don't use FROM ONLY)
>
> also could be useful to put an index (even a PK) on every child to
> ensure uniqueness and make the SELECT more efficient, and of course a
> check constraint in every child emulating a partition key

The referential integrity triggers contain some extra magic that isn't
easily simulated in userland, and that is necessary to make the
foreign key constraints airtight.  We've discussed this previously but
I don't remember which thread it was or the details of when things
blow up.  I think it's something like this: the parent has a tuple
that is not referenced by any child.  Transaction 1 begins, deletes
the parent tuple (checking that it has no children), and pauses.
Transaction 2 begins, adds a child tuple that references the parent
tuple (checking that the parent exists, which it does), and commits.
Transaction 1 commits.
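
Here is a sketch of that interleaving against the tables from Zoltán's
example, assuming the FK has been replaced by plain userland check
triggers (one on refer checking that the parent id exists, one on parent
checking that no referencing rows exist before a delete):

-- Session 1:
begin;
delete from parent where id = 1;  -- userland check: no referencing rows, OK

-- Session 2, concurrently:
begin;
insert into refer (parent_id) values (1);  -- check: parent row 1 still visible, OK
commit;

-- Session 1:
commit;  -- refer now references a parent id that no longer exists

The built-in RI triggers close that window with row-level locks (the
insert-side check does its lookup FOR SHARE) and special snapshot handling
that user-level triggers don't get.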

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] max_standby_delay considered harmful

2010-05-06 Thread Greg Smith

Heikki Linnakangas wrote:

Robert Haas wrote:
  

I am not convinced it will be unpredictable.  The only caveats that
I've seen so far are:

- You need to run ntpd.
- Queries will get cancelled like crazy if you're not using streaming
replication.



And also in situations where the master is idle for a while and then
starts doing stuff. That's the most significant source of confusion,
IMHO, I wouldn't mind the requirement of ntpd so much.
  


I consider it mandatory to include a documentation update here that 
says "if you set max_standby_delay > 0, and do not run something that 
regularly generates activity on the master like [example], you will get 
unnecessary query cancellation on the standby".  As well as something 
like what Josh was suggesting, adding warnings that this is "for 
advanced users only", to borrow his wording.  This is why my name has 
been on the open items list for a while now -- to make sure I follow 
through on that.
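
For reference, the kind of workaround meant by [example] above can be as 
simple as a periodic no-op transaction on the master.  A minimal sketch, 
assuming something like cron runs it through psql every few seconds (the 
exact interval is up to the admin):

-- Run periodically on the primary so that WAL keeps carrying recent
-- commit timestamps even while the master is otherwise idle.
-- txid_current() forces an XID to be assigned, so the otherwise empty
-- transaction still produces a commit record.
SELECT txid_current();

Anything that reliably writes a WAL record at a known interval works just 
as well here.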


I haven't written it yet because there were still changes to the 
underlying code being made up until moments before beta started, then 
this discussion started without a break between.  There are a clear set 
of user land things that can be done to make up the deficiencies in the 
state of the server code, but we won't even get to see how they work out 
in the field (feedback needed to improve the 9.1 design) if this 
capability goes away altogether.


Is it not clear that there are some people who consider the occasional 
bit of cancellation OK, because they can correct for it at the application 
layer and they're willing to factor it into their design if it allows 
using the otherwise idle HA standby?  I'm fine with expanding that 
section of the documentation too, to make it more obvious that's the 
only situation this aspect of HS is aimed at and suitable for.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] max_standby_delay considered harmful

2010-05-06 Thread Simon Riggs
On Thu, 2010-05-06 at 13:46 +0200, Florian Pflug wrote:
> On May 6, 2010, at 12:48 , Simon Riggs wrote:
> > On Thu, 2010-05-06 at 11:36 +0200, Florian Pflug wrote:
> >> If there was an additional SQL-callable function that returned the 
> >> backends the recovery process is currently waiting for, plus one that 
> >> reported that last timestamp seen in the WAL, than all those different 
> >> cancellation policies could be implemented as daemons that monitor 
> >> recovery and kill backends as needed, no?
> >> 
> >> That would allow people to experiment with different cancellation 
> >> policies, and maybe shed some light on what the useful policies are in 
> >> practice.
> > 
> > It would be easier to implement a conflict resolution plugin that is
> > called when a conflict occurs, allowing users to have a customisable
> > mechanism. Again, I have no objection to that proposal.
> 
> True, providing a plugin API would be even better, since no SQL callable API 
> would have to be devised, and possible algorithms wouldn't be constrained by 
> such an API's limitations.
> 
> The existing max_standby_delay logic could be moved to such a plugin, living 
> in contrib. Since it was already established (I believe) that the existing 
> max_standby_delay logic is sufficiently fragile to require significant 
> knowledge on the user's side about potential pitfalls, asking those users to 
> install the plugin from contrib shouldn't be too much to ask for.
> 
> This way, users who really need something more sophisticated than recovery 
> wins always or standby wins always are given the tools they need *if* they're 
> willing to put in the extra effort. For those who don't, offering 
> max_standby_delay probably does more harm than good anyway, so nothing is 
> lost by not offering it in the first place.

No problem from me with that approach.

As long as 9.0 ships with the current capability to enforce
max_standby_delay, I have no problem.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] possible memory leak with SRFs

2010-05-06 Thread Nikhil Sontakke
> Patch attached with this mail.
>

The previous patch was WIP. Please take a look at this one instead. I
hope this handles the cases where results are not just base datums but
palloc'ed datums like string types too.

Regards,
Nikhils

> The SRF has its own long-lived "SRF multi-call context" anyways. And
> AIUI, SRFs return tuples one-by-one or do we materialize the same into
> a tuplestore in some cases?
>
> Regards,
> Nikhils
>
> On Wed, May 5, 2010 at 7:23 PM, Nikhil Sontakke
>  wrote:
>> Hi,
>>
>> I saw this behavior with latest GIT head:
>>
>> create table xlarge(val numeric(19,0));
>> insert into xlarge values(generate_series(1,5));
>>
>> The above generate series will return an int8 which will then be
>> casted to numeric (via int8_to_numericvar) before being inserted into
>> the table. I observed that the ExprContext memory associated with
>> econtext->ecxt_per_tuple_memory is slowly bloating up till the end of
>> the insert operation.
>>
>> This becomes significant the moment we try to insert a significant
>> number of entries using this SRF. I can see the memory being consumed
>> by the PG backend slowly grow to a large percentage.
>>
>> I see that the executor (take ExecResult as an example) does not reset
>> the expression context early if an SRF is churning out tuples. What
>> could be a good way to fix this?
>>
>> Regards,
>> Nikhils
>> --
>> http://www.enterprisedb.com
>>
>
>
>
> --
> http://www.enterprisedb.com
>



-- 
http://www.enterprisedb.com
diff --git a/src/backend/executor/nodeResult.c b/src/backend/executor/nodeResult.c
index 94796d5..ed437a0 100644
--- a/src/backend/executor/nodeResult.c
+++ b/src/backend/executor/nodeResult.c
@@ -91,6 +91,9 @@ ExecResult(ResultState *node)
 		}
 	}
 
+	/* release memory associated with the previous tuple */
+	ResetExprContext(econtext);
+
 	/*
 	 * Check to see if we're still projecting out tuples from a previous scan
 	 * tuple (because there is a function-returning-set in the projection
@@ -106,13 +109,6 @@ ExecResult(ResultState *node)
 	}
 
 	/*
-	 * Reset per-tuple memory context to free any expression evaluation
-	 * storage allocated in the previous tuple cycle.  Note this can't happen
-	 * until we're done projecting out tuples from a scan tuple.
-	 */
-	ResetExprContext(econtext);
-
-	/*
 	 * if rs_done is true then it means that we were asked to return a
 	 * constant tuple and we already did the last time ExecResult() was
 	 * called, OR that we failed the constant qual check. Either way, now we

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] max_standby_delay considered harmful

2010-05-06 Thread Florian Pflug
On May 6, 2010, at 12:48 , Simon Riggs wrote:
> On Thu, 2010-05-06 at 11:36 +0200, Florian Pflug wrote:
>> If there was an additional SQL-callable function that returned the backends 
>> the recovery process is currently waiting for, plus one that reported that 
>> last timestamp seen in the WAL, than all those different cancellation 
>> policies could be implemented as daemons that monitor recovery and kill 
>> backends as needed, no?
>> 
>> That would allow people to experiment with different cancellation policies, 
>> and maybe shed some light on what the useful policies are in practice.
> 
> It would be easier to implement a conflict resolution plugin that is
> called when a conflict occurs, allowing users to have a customisable
> mechanism. Again, I have no objection to that proposal.

True, providing a plugin API would be even better, since no SQL callable API 
would have to be devised, and possible algorithms wouldn't be constrained by 
such an API's limitations.

The existing max_standby_delay logic could be moved to such a plugin, living in 
contrib. Since it was already established (I believe) that the existing 
max_standby_delay logic is sufficiently fragile to require significant 
knowledge on the user's side about potential pitfalls, asking those users to 
install the plugin from contrib shouldn't be too much to ask for.

This way, users who really need something more sophisticated than recovery wins 
always or standby wins always are given the tools they need *if* they're 
willing to put in the extra effort. For those who don't, offering 
max_standby_delay probably does more harm than good anyway, so nothing is lost 
by not offering it in the first place.

best regards,
Florian Pflug





Re: [HACKERS] max_standby_delay considered harmful

2010-05-06 Thread Simon Riggs
On Thu, 2010-05-06 at 11:36 +0200, Florian Pflug wrote:

> If there was an additional SQL-callable function that returned the backends 
> the recovery process is currently waiting for, plus one that reported that 
> last timestamp seen in the WAL, than all those different cancellation 
> policies could be implemented as daemons that monitor recovery and kill 
> backends as needed, no?
> 
> That would allow people to experiment with different cancellation policies, 
> and maybe shed some light on what the useful policies are in practice.

It would be easier to implement a conflict resolution plugin that is
called when a conflict occurs, allowing users to have a customisable
mechanism. Again, I have no objection to that proposal.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partitioning/inherited tables vs FKs

2010-05-06 Thread Jaime Casanova
2010/5/6 Boszormenyi Zoltan :
>
> =# insert into refer (parent_id) values (1);
> ERROR:  insert or update on table "refer" violates foreign key
> constraint "refer_parent_id_fkey"
> DETAIL:  Key (parent_id)=(1) is not present in table "parent".
>
> The use case for this was there were different news items,
> and there were another table for summaries, that could point
> to any of the news items table. Another use case could be
> a large partitioned table with an FK to the main table where
> the referring table might only contain very few "interesting" data.
>
> No matter what are the semantics, the parent table in the
> inheritance chain cannot be used as and endpoint for FKs.
>
> Is it a bug, or intentional?

I would call it a bug, but this is a known issue.

>
> The only solution currently is that the referring table has to be
> partitioned the same way as the referred table in the FK, and
> its parent table has to be queried.
>

no, you can install a trigger on the child table that verifies the
existence of the id in your partitioned parent table; the SELECT
you'll use inside that trigger will look at the entire set of tables
(as long as you don't use FROM ONLY)

it could also be useful to put an index (even a PK) on every child to
ensure uniqueness and make the SELECT more efficient, and of course a
check constraint on every child emulating a partition key
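
here is a rough sketch of that idea against the parent/child/refer tables
from the original post (the check trigger goes on the referring table
"refer" after dropping its FK; names are illustrative, and as noted
elsewhere in the thread this is not completely airtight under concurrency):

alter table refer drop constraint refer_parent_id_fkey;

create or replace function check_parent_exists() returns trigger as $$
begin
    -- no ONLY here, so the lookup sees parent and all of its children
    perform 1 from parent where id = new.parent_id;
    if not found then
        raise exception 'parent_id % not present in parent hierarchy',
                        new.parent_id;
    end if;
    return new;
end;
$$ language plpgsql;

create trigger refer_parent_id_check
    before insert or update on refer
    for each row execute procedure check_parent_exists();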

-- 
Jaime Casanova www.2ndQuadrant.com
Soporte y capacitación de PostgreSQL

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LD_LIBRARY_PATH versus rpath

2010-05-06 Thread Peter Eisentraut
On tor, 2010-05-06 at 11:20 +0100, Greg Stark wrote:
> We only set RPATH if the install location isn't part of the default ld
> library path specified by /etc/ld.so.conf right?

No.  How would you determine that?

>  Setting it if it is
> in the default path would be antisocial. 

That's why there is --disable-rpath.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LD_LIBRARY_PATH versus rpath

2010-05-06 Thread Peter Eisentraut
On ons, 2010-05-05 at 19:20 -0400, Tom Lane wrote:
> Over at
> http://archives.postgresql.org/pgsql-general/2010-05/msg00091.php
> we have a complaint about "make check" failing when the install is
> intended to overwrite existing libraries (in particular, replacing
> 8.4 with 9.0 libpq).  I've done some off-list investigation and
> found that this appears to be a generic issue on Linux.  pg_regress
> invokes psql, which depends on libpq.so, and if psql fails due to
> picking up the wrong libpq.so then you get behavior as described.

Yeah, that's been broken since forever.

>  The shared libraries needed by the program are searched for in the
>    following order:
> 
>o  (ELF only) Using the directories specified in the  DT_RPATH  dynamic
>   section  attribute of the binary if present and DT_RUNPATH attribute
>   does not exist.  Use of DT_RPATH is deprecated.
> 
>o  Using the environment variable LD_LIBRARY_PATH.  Except if the  exe-
>   cutable  is  a  set-user-ID/set-group-ID binary, in which case it is
>   ignored.
> 
>o  (ELF only) Using the directories specified in the DT_RUNPATH dynamic
>   section attribute of the binary if present.

Ah, that sounds good.

> So the question is, should we modify Makefile.linux along the lines of
> 
> -rpath = -Wl,-rpath,'$(rpathdir)'
> +rpath = -Wl,-rpath,'$(rpathdir)',--enable-new-dtags

I see this feature was added in 2001, so it should be OK to use.

> My inclination is to try this in HEAD only and see if any problems
> emerge during the beta cycle.

I wouldn't consider backpatching it at all.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] max_standby_delay considered harmful

2010-05-06 Thread Andres Freund
Hi,

On Thursday 06 May 2010 07:35:49 Heikki Linnakangas wrote:
> Robert Haas wrote:
> > On Wed, May 5, 2010 at 11:52 PM, Bruce Momjian  wrote:
> >> I am afraid the current setting is tempting for users to enable, but
> >> will be so unpredictable that it will tarnish the reputation of HS and
> >> Postgres.  We don't want to be thinking in 9 months, "Wow, we shouldn't
> >> have shipped that features.  It is causing all kinds of problems."  We
> >> have done that before (rarely), and it isn't a good feeling.
> > 
> > I am not convinced it will be unpredictable.  The only caveats that
> > I've seen so far are:
> > 
> > - You need to run ntpd.
> > - Queries will get cancelled like crazy if you're not using streaming
> > replication.
> 
> And also in situations where the master is idle for a while and then
> starts doing stuff. That's the most significant source of confusion,
> IMHO, I wouldn't mind the requirement of ntpd so much.
Personally I would much rather keep that configurability and manually 
generate a record every second. Or possibly do something akin to 
archive_timeout...

That may be less important once there are fewer sources of conflict 
resolution - but that's something *definitely* not going to happen for 9.0...

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LD_LIBRARY_PATH versus rpath

2010-05-06 Thread Greg Stark
On Thu, May 6, 2010 at 12:20 AM, Tom Lane  wrote:
> Now, pg_regress tries to ensure that the temporary installation
> will work as desired by setting LD_LIBRARY_PATH to point at the
> temp installation's lib/ directory.  However, the psql executable
> will by default get built with a DT_RPATH entry pointing at the
> intended final installation lib/.  And DT_RPATH overrides
> LD_LIBRARY_PATH, in the Linux dynamic loader.  man ld.so says:
>

We only set RPATH if the install location isn't part of the default ld
library path specified by /etc/ld.so.conf right? Setting it if it is
in the default path would be antisocial.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] max_standby_delay considered harmful

2010-05-06 Thread Robert Haas
On Thu, May 6, 2010 at 1:35 AM, Heikki Linnakangas
 wrote:
> Robert Haas wrote:
>> On Wed, May 5, 2010 at 11:52 PM, Bruce Momjian  wrote:
>>> I am afraid the current setting is tempting for users to enable, but
>>> will be so unpredictable that it will tarnish the reputation of HS and
>>> Postgres.  We don't want to be thinking in 9 months, "Wow, we shouldn't
>>> have shipped that features.  It is causing all kinds of problems."  We
>>> have done that before (rarely), and it isn't a good feeling.
>>
>> I am not convinced it will be unpredictable.  The only caveats that
>> I've seen so far are:
>>
>> - You need to run ntpd.
>> - Queries will get cancelled like crazy if you're not using streaming
>> replication.
>
> And also in situations where the master is idle for a while and then
> starts doing stuff. That's the most significant source of confusion,
> IMHO, I wouldn't mind the requirement of ntpd so much.

Oh.  Ouch.  OK, sorry, I missed that part.  Wow, that's awful.  OK, I
agree: we can't ship that as-is.

/me feels embarrassed for completely failing to understand the root of
the issue until 84 emails into the thread.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] max_standby_delay considered harmful

2010-05-06 Thread Yeb Havinga

Rob Wultsch wrote:

I manage a bunch of different environments and I am pretty sure that
in any of them if the db started seemingly randomly killing queries I
would have application teams followed quickly by executives coming
after me with torches and pitchforks.

I can not imagine setting this value to anything other than a bool and
most of the time that bool would be -1. I would only be unleashing a
kill storm in utter desperation and I would probably need to explain
myself in detail after. Utter desperation means I am sure I am going
to have to do a impactful failover at any moment and need a slave
completely up to date NOW.
  
That's funny because when I was reading this thread, I was thinking the 
exact opposite: having max_standby_delay always set to 0 so I know the 
standby server is as up-to-date as possible. The application that 
accesses the hot standby has to be 'special' anyway, because it might 
deliver not-up-to-date data. If the documentation of the specifics of 
querying the standby server includes the warning that queries might get 
cancelled, applications can opt to retry by themselves (is there a 
special return code to catch that case? something like PGRES_RETRY_LATER) 
or show the user a message that their report is currently unavailable and 
they should retry in a few minutes.


regards,
Yeb Havinga


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_migrator to /contrib in a later 9.0 beta

2010-05-06 Thread Dimitri Fontaine
Jesper Krogh  writes:
> I did go the painful way (dump+restore) at that point in time.
> It was an 8.1 - 8.3 migration. Since then data has grown and the dump
> restore is even less favorable on the 8.3 -> 9.0 migration.
>
> So in general the pg_migrator way seems to be the only way to avoid
> the slony way which is orders of magnitude more complicated.

Right in the middle there's the Londiste way. It works like Slony but
the setup is much easier.

Regards,
-- 
Dimitri Fontaine
PostgreSQL DBA, Architecte

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] max_standby_delay considered harmful

2010-05-06 Thread Florian Pflug
On May 6, 2010, at 11:26 , Dimitri Fontaine wrote:
> Greg Smith  writes:
>> If you need a script that involves changing a server setting to do
>> something, that translates into "you can't do that" for a typical DBA.  The
>> idea of a program regularly changing a server configuration setting on a
>> production system is one you just can't sell.  That makes this idea
>> incredibly more difficult to use in the field than any of the workarounds
>> that cope with the known max_standby_delay issues.
> 
> I still think that the best API we can do in a timely fashion for 9.0
> is:
> 
>  standby_conflict_winner = replay|queries
> 
>  pg_pause_recovery() / pg_resume_recovery()
> 
> It seems to me those two functions are only exposing existing facilities
> in the code, so that's more an API change that a new feature
> inclusion. Of course I'm certainly wrong. But the code has already been
> written.

If there was an additional SQL-callable function that returned the backends the 
recovery process is currently waiting for, plus one that reported the last 
timestamp seen in the WAL, then all those different cancellation policies could 
be implemented as daemons that monitor recovery and kill backends as needed, no?

That would allow people to experiment with different cancellation policies, and 
maybe shed some light on what the useful policies are in practice.
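
To make that concrete, the core of such a daemon might be little more than 
a query like the sketch below, run in a loop on the standby. 
pg_recovery_conflicts() is the hypothetical function proposed above -- it 
does not exist today -- while pg_terminate_backend() (or the gentler 
pg_cancel_backend()) already does; the five-minute threshold is arbitrary:

-- cancel only backends that have been blocking recovery for more than
-- five minutes; pg_recovery_conflicts() is hypothetical/proposed
select pg_terminate_backend(c.pid)
  from pg_recovery_conflicts() as c(pid int, waiting_since timestamptz)
 where now() - c.waiting_since > interval '5 minutes';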

best regards,
Florian Pflug





Re: [HACKERS] Partitioning/inherited tables vs FKs

2010-05-06 Thread Florian Pflug
On May 6, 2010, at 10:52 , Boszormenyi Zoltan wrote:
> =# create table parent (id serial primary key, t text);
> ...
> =# create table child () inherits (parent);
> ...
> =# create table refer (id serial primary key, parent_id integer
> ...
> =# insert into child (t) values ('a') returning id;
> ...
> =# select * from parent;
> id | t
> +---
>  1 | a
> (1 sor)
> 
> =# insert into refer (parent_id) values (1);
> ERROR:  insert or update on table "refer" violates foreign key
> constraint "refer_parent_id_fkey"
> DETAIL:  Key (parent_id)=(1) is not present in table "parent".
> 
> The use case for this was there were different news items,
> and there were another table for summaries, that could point
> to any of the news items table. Another use case could be
> a large partitioned table with an FK to the main table where
> the referring table might only contain very few "interesting" data.

Yeah, this is a long-standing issue with inheritance. Table inheritance in 
postgres isn't much more than an implicit UNION done on selects plus some logic 
in ALTER TABLE to propagate structural changes. Indices and constraints 
basically always behave as if ONLY had been specified. I'm not even sure the 
ids are globally unique in your example - it might be that each child's "id 
serial" column gets its very own sequence.

One possible workaround is to create a table, say referred_ids, that contains 
all the ids from parent and all of its children, kept up-to-date via triggers, 
and point the FK constraint to that table. That also allows for a global unique 
constraint on the ids, by defining a suitable unique or primary key constraint 
on referred_ids.
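
Roughly like this, reusing the tables from the original example (names are 
illustrative, and the UPDATE/DELETE maintenance triggers are left out for 
brevity):

create table referred_ids (id integer primary key);

create or replace function track_hierarchy_id() returns trigger as $$
begin
    insert into referred_ids (id) values (new.id);
    return new;
end;
$$ language plpgsql;

-- one such trigger is needed on parent and on each of its children
create trigger parent_track_id after insert on parent
    for each row execute procedure track_hierarchy_id();
create trigger child_track_id after insert on child
    for each row execute procedure track_hierarchy_id();

alter table refer drop constraint refer_parent_id_fkey;
alter table refer
    add foreign key (parent_id) references referred_ids (id);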

What lies at the heart of this problem is the lack of multi-table indices and 
hence multi-table unique constraints in postgres. AFAIK with those in place the 
rest amounts to the removal of ONLY from the constraint check queries plus some 
code to propagate constraint triggers to child tables.

best regards,
Florian Pflug





Re: [HACKERS] max_standby_delay considered harmful

2010-05-06 Thread Dimitri Fontaine
Greg Smith  writes:
> If you need a script that involves changing a server setting to do
> something, that translates into "you can't do that" for a typical DBA.  The
> idea of a program regularly changing a server configuration setting on a
> production system is one you just can't sell.  That makes this idea
> incredibly more difficult to use in the field than any of the workarounds
> that cope with the known max_standby_delay issues.

I still think that the best API we can do in a timely fashion for 9.0
is:

  standby_conflict_winner = replay|queries

  pg_pause_recovery() / pg_resume_recovery()

It seems to me those two functions are only exposing existing facilities
in the code, so that's more an API change than a new feature
inclusion. Of course I could well be wrong. But the code has already been
written.

I don't think we'll find anything better to offer our users in the right
time frame. Now I'll try to step back and stop repeating myself in the void :)

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Partitioning/inherited tables vs FKs

2010-05-06 Thread Boszormenyi Zoltan
Hi,

we came across an interesting problem.

=# create table parent (id serial primary key, t text);
NOTICE:  CREATE TABLE will create implicit sequence "parent_id_seq" for
serial column "parent.id"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
"parent_pkey" for table "parent"
CREATE TABLE
=# create table child () inherits (parent);
CREATE TABLE
=# create table refer (id serial primary key, parent_id integer
references parent (id));
NOTICE:  CREATE TABLE will create implicit sequence "refer_id_seq" for
serial column "refer.id"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
"refer_pkey" for table "refer"
CREATE TABLE
=# begin;
BEGIN
=# insert into child (t) values ('a') returning id;
 id
----
  1
(1 row)

INSERT 0 1
=# select * from parent;
 id | t
----+---
  1 | a
(1 row)

=# insert into refer (parent_id) values (1);
ERROR:  insert or update on table "refer" violates foreign key
constraint "refer_parent_id_fkey"
DETAIL:  Key (parent_id)=(1) is not present in table "parent".

The use case for this was that there were different news item tables,
and there was another table for summaries that could point
to any of the news items tables. Another use case could be
a large partitioned table with an FK to the main table, where
the referring table might only contain very few "interesting" rows.

No matter what the semantics are, the parent table in the
inheritance chain cannot be used as an endpoint for FKs.

Is it a bug, or intentional?

The only solution currently is that the referring table has to be
partitioned the same way as the referred table in the FK, and
its parent table has to be queried.

Best regards,
Zoltán Böszörményi

-- 
Bible has answers for everything. Proof:
"But let your communication be, Yea, yea; Nay, nay: for whatsoever is more
than these cometh of evil." (Matthew 5:37) - basics of digital technology.
"May your kingdom come" - superficial description of plate tectonics

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] max_standby_delay considered harmful

2010-05-06 Thread Simon Riggs
On Wed, 2010-05-05 at 23:15 -0700, Rob Wultsch wrote:

> I manage a bunch of different environments and I am pretty sure that
> in any of them if the db started seemingly randomly killing queries I
> would have application teams followed quickly by executives coming
> after me with torches and pitchforks.

Fully understood and well argued, thanks for your input.

HS doesn't randomly kill queries and there are documented work-arounds
to control this behaviour.

Removing the parameter won't help the situation at all, it will make the
situation *worse* by removing control from where it's clearly needed and
removing all hope of making the HS feature work in practice. There is no
consensus to remove the parameter.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LogStandbySnapshot (was another thread)

2010-05-06 Thread Simon Riggs
On Wed, 2010-05-05 at 09:12 +0300, Heikki Linnakangas wrote:

> I concur that the idea is that we deal at replay with the fact that the
> snapshot lags behind. At replay, any locks/XIDs in the snapshot that
> have already been committed/aborted are ignored. For any locks/XIDs
> taken just after the snapshot was taken, the replay will see the other
> WAL records with that information.
> 
> We need to add comments explaining all that.

The attached comments are proposed.

Reviewing this information again to propose a fix for the two minor
other bugs pointed out by Tom shows that they are both related and need
one combined fix, which would work like this:

Currently we handle the state STANDBY_INITIALIZED incorrectly. We need
to run RecordKnownAssignedXids() during this mode, so that we both
extend the clog and record known xids. That means that two other callers
of RecordKnownAssignedXids also need to call it at that time.

In ProcArrayApplyRecoveryInfo() we run KnownAssignedXidsAdd(), though
this will fail if there are existing xids in there, now that it is sorted.
So we need to: run KnownAssignedXidsRemovePreceding(latestObservedXid) to
remove extraneous xids, then extract any xids that remain and add them
to the ones arriving with the running xacts record. We then need to sort
the combined array and re-insert it into KnownAssignedXids.

Previously, I had imagined that the gap between the logical checkpoint
and the physical checkpoint was small. With spread checkpoints this
isn't the case any longer. So I propose adding a special WAL record that
is inserted during LogStandbySnapshot() immediately before
GetRunningTransactionLocks(), so that we minimise the time window
between deriving snapshot data and recording it in WAL.

Those changes are not especially invasive.

-- 
 Simon Riggs   www.2ndQuadrant.com
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index ab4ef62..434fffb 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -86,6 +86,58 @@ InitRecoveryTransactionEnvironment(void)
 	vxid.localTransactionId = GetNextLocalTransactionId();
 	VirtualXactLockTableInsert(vxid);
 
+	/*
+	 * We can only move directly to STANDBY_SNAPSHOT_READY at startup if we
+	 * start from a shutdown checkpoint. In the case of starting from an
+	 * online checkpoint the situation is more complex and requires a two
+	 * or sometimes a three stage process.
+	 *
+	 * standbyState starts here at STANDBY_INITIALIZED and changes state to
+	 * either STANDBY_SNAPSHOT_PENDING or STANDBY_SNAPSHOT_READY. If we are
+	 * at STANDBY_SNAPSHOT_PENDING state we can only change to
+	 * STANDBY_SNAPSHOT_READY at which we stay until shutdown.
+	 * 
+	 * The initial snapshot must contain all running xids and all current
+	 * AccessExclusiveLocks at a point in time on the standby. Assembling
+	 * that information requires many and various LWLocks, so we choose to
+	 * derive that information piece by piece and then re-assemble that info
+	 * on the standby. When that information is fully assembled we move to
+	 * STANDBY_SNAPSHOT_READY.
+	 *
+	 * Since locking on the primary when we derive the information is not
+	 * strict, we note that there is a time window between the derivation and
+	 * writing to WAL of the derived information. That allows race conditions
+	 * that we must resolve, since xids and locks may enter or leave the
+	 * snapshot during that window. This creates the issue that an xid or
+	 * lock may start *after* the snapshot has been derived yet *before* the
+	 * snapshot is logged in the running xacts WAL record. We resolve this by
+	 * starting to accumulate changes at a point immediately before we derive
+	 * the snapshot on the primary and ignore duplicates when we later apply
+	 * the snapshot from the running xacts record. This is implemented during
+	 * CreateCheckpoint() where we use the logical checkpoint location as
+	 * our starting point and then write the running xacts record immediately
+	 * before writing the main checkpoint WAL record. Since we always start
+	 * up from a checkpoint and we are immediately at our starting point, so
+	 * we unconditionally move to STANDBY_INITIALIZED. After this point we
+	 * must do 4 things: 
+	 *  * move shared nextXid forwards as we see new xids
+	 *  * extend the clog and subtrans with the new xid
+	 *  * keep track of uncommitted known assigned xids
+	 *  * keep track of uncommitted AccessExclusiveLocks
+	 *
+	 * When we see a commit/abort we must remove known assigned xids and locks
+	 * from the completing transaction. Attempted removals that cannot locate
+	 * an entry are expected and must not cause an error when we are in state
+	 * STANDBY_INITIALIZED. This is implemented in StandbyReleaseLocks() and
+	 * KnownAssignedXidsRemove().
+	 * 
+	 * Later, when we apply the running xact data we must be careful to ignore
+	 * transactions already committed, since those commits raced ahead when
+	 * making WAL en

Re: [HACKERS] max_standby_delay considered harmful

2010-05-06 Thread Simon Riggs
On Thu, 2010-05-06 at 00:47 -0400, Robert Haas wrote: 

> That just doesn't sound that bad to me, especially since the proposed
> alternative is:
> 
> - Queries will get cancelled like crazy, period.
> 
> Or else:
> 
> - Replication can fall infinitely far behind and you can write a
> tedious and error-prone script to try to prevent it if you like.
> 
> I think THAT is going to tarnish our reputation.

Yes, that will.

There is no consensus to remove max_standby_delay.

It could be improved with minor adjustments and it makes more sense to
allow a few of those, treating them as bugs.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] possible memory leak with SRFs

2010-05-06 Thread Nikhil Sontakke
Hi,

Continuing on this:

Can someone please explain why we do not reset the expression context
if an SRF is involved during execution? Once the current result from
the SRF has been consumed, I would think that the
ecxt_per_tuple_memory context should be reset. As its name suggests,
it is supposed to be a per-tuple context and is not meant to be
long-lived. To test this out I shifted the call to ResetExprContext to
just before returning from the SRF inside ExecResult, and I no longer see
the memleak at all. Patch attached with this mail.

The SRF has its own long-lived "SRF multi-call context" anyway. And
AIUI, SRFs return tuples one by one -- or do we materialize them into
a tuplestore in some cases?

Regards,
Nikhils

On Wed, May 5, 2010 at 7:23 PM, Nikhil Sontakke
 wrote:
> Hi,
>
> I saw this behavior with latest GIT head:
>
> create table xlarge(val numeric(19,0));
> insert into xlarge values(generate_series(1,5));
>
> The above generate series will return an int8 which will then be
> casted to numeric (via int8_to_numericvar) before being inserted into
> the table. I observed that the ExprContext memory associated with
> econtext->ecxt_per_tuple_memory is slowly bloating up till the end of
> the insert operation.
>
> This becomes significant the moment we try to insert a significant
> number of entries using this SRF. I can see the memory being consumed
> by the PG backend slowly grow to a large percentage.
>
> I see that the executor (take ExecResult as an example) does not reset
> the expression context early if an SRF is churning out tuples. What
> could be a good way to fix this?
>
> Regards,
> Nikhils
> --
> http://www.enterprisedb.com
>



-- 
http://www.enterprisedb.com
diff --git a/src/backend/executor/nodeResult.c b/src/backend/executor/nodeResult.c
index 94796d5..bddc632 100644
--- a/src/backend/executor/nodeResult.c
+++ b/src/backend/executor/nodeResult.c
@@ -100,7 +100,10 @@ ExecResult(ResultState *node)
 	{
 		resultSlot = ExecProject(node->ps.ps_ProjInfo, &isDone);
 		if (isDone == ExprMultipleResult)
+		{
+			ResetExprContext(econtext);
 			return resultSlot;
+		}
 		/* Done with that source tuple... */
 		node->ps.ps_TupFromTlist = false;
 	}

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers