date:20110302

On Tue, 2011-03-01 at 15:25 +0900, Fujii Masao wrote:

 No, I've never wished wait-forever option for now. I'd like to make
 the primary work alone when there is no connected standby, for
 high-availability. 

allow_standalone_primary seems to need to be better through than it is
now, yet neither of us think its worth having.

If the people that want it can think it through a little better then it
might make this release, but I propose to remove it from this current
patch to allow us to commit with greater certainty and fewer bugs.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Alpha4 release blockers (was Re: wrapping up this CommitFest)

2011-03-02 Thread Jan Urbański

On 02/03/11 01:05, Andrew Dunstan wrote:
 
 
 On 03/01/2011 05:19 PM, Jan Urbański wrote:
 On 01/03/11 22:07, Andrew Dunstan wrote:

 On 03/01/2011 03:53 PM, Jan Urbański wrote:
 On 01/03/11 21:35, Tom Lane wrote:
 Josh Berkusj...@agliodbs.com   writes:
 I'm ok with closing things as of the end of the 15 days, say
 Thursday or
 Friday.
 It might be a good idea to make a list of what we have left to do
 before
 we can wrap an alpha.  Here are some things on my list.  Not all of
 them
 are necessarily release blockers, but we need to discuss which ones
 are:

 * Regression test failures from recent plpython patches.  These are
 affecting enough machines to make them must fix before alpha, IMO.
 There are some variations in error message wording, which are not too
 terrible but also not exactly hard to fix.  The python assert failure
 that some Fedora machines are reporting is considerably more
 disturbing.
 I agree.

 I'm looking into the crash, no luck so long.


 Is there anything you need that would help you?
 Could you try this patch and see if it fixes the failures?

 I'm at a loss as to why this happens, but judging from the traceback the
 spiexceptions module is getting unreffed somewhere and when garbage
 collection kicks it it barfs on an object with refcount 0. So I'm
 forcing an incref of the module to confirm that.

 I tried various tricks on 32 bit Debian, with Python 2.6, 2.7, Python
 compiled from Fedora's SRPM and I never saw anything wrong. Will keep on
 trying, but tommorrow evening, time to sleep :(
 
 Thanks.
 
 That seems to have fixed it, so I have applied the patch. Would you like
 to supply some comments to got with it?

The comment would be something like

/* XXX it appears that in some circumstantes the reference count of the
spiexceptions module drops to zero causing a Python assert failure when
the garbage collector visits the module. This has been observed on the
buildfarm. To fix this, add an additional ref for the module here. */

I have no idea why the refcount of the module becomes zero, debug prints
I added on my system were always showing 1.

Jan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

On Tue, 2011-03-01 at 15:25 +0900, Fujii Masao wrote:
 On Tue, Mar 1, 2011 at 9:19 AM, Simon Riggs si...@2ndquadrant.com wrote:
  On Mon, 2011-02-28 at 18:40 +, Simon Riggs wrote:
   SyncRepReleaseWaiters should be called when walsender exits. Otherwise,
   if the standby crashes while a transaction is waiting for replication,
   it waits infinitely.
 
  Will think on this.
 
  The behaviour seems correct to me:

  If allow_standalone_primary = on then we sit and wait until we hit
  client timeout, which occurs even after last standby has gone.
 
 In that case, why do backends need to wait until the timeout occurs?
 We can make those backends resume their transaction as soon as
 the last standby has gone. No?

I'm getting a little confused as to what you're asking for regarding
shutdowns and WALSender disconnects.

The current behaviour is that when a reply is received we attempt to
wake up backends waiting for an LSN. If that reply is not received
within a timeout then we just return to user anyway. You can wait
forever, if you choose, by setting the timeout to 0.

The WALSender deliberately does *not* wake waiting users if the standby
disconnects. Doing so would break the whole reason for having sync rep
in the first place. What we do is allow a potential standby to takeover
the role of sync standby, if one is available. Or the failing standby
can reconnect and then release waiters.

If we shutdown, then we wait for the shutdown commit record to be
transferred to our standby, so a normal or fast shutdown of the master
always leaves all connected standbys up to date. We already do that, so
sync rep doesn't touch that behaviour. If a standby is disconnected,
then it doesn't receive the shutdown checkpoint record.

The wait state for a commit does not persist when we shutdown and
restart.

Can you restate which bits of the above you think need to be changed?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...

2011-03-02 Thread Bernd Helmle




--On 28. Februar 2011 15:02:30 -0500 Tom Lane t...@sss.pgh.pa.us wrote:


Because it's fifty times more mechanism than we need here?  We don't
want a SQL interface (not even a lightweight one) and it's unclear that
we ever want the data to go to disk at all.


I wonder wether a library like librrd would be a solution for this.

--
Thanks

Bernd

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

2011-03-02 Thread Yeb Havinga


On 2011-03-02 11:40, Simon Riggs wrote:


allow_standalone_primary seems to need to be better through than it is
now, yet neither of us think its worth having.

If the people that want it can think it through a little better then it
might make this release, but I propose to remove it from this current
patch to allow us to commit with greater certainty and fewer bugs.
As somebody who actually thought having it was a good idea, +1 for 
remove. I can monitor having one or two sync standbys at all times and 
let bells ring when they fail, but especially when things might break 
and the standbys are gone, having allow_standalone_primary set to off is 
a very efficient way to limit your options then to effectively zero with 
the master.


regards,
Yeb Havinga


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Alpha4 release blockers (was Re: wrapping up this CommitFest)

On Wed, Mar 2, 2011 at 6:14 AM, Jan Urbański wulc...@wulczer.org wrote:
 That seems to have fixed it, so I have applied the patch. Would you like
 to supply some comments to got with it?

 The comment would be something like

 /* XXX it appears that in some circumstantes the reference count of the
 spiexceptions module drops to zero causing a Python assert failure when
 the garbage collector visits the module. This has been observed on the
 buildfarm. To fix this, add an additional ref for the module here. */

 I have no idea why the refcount of the module becomes zero, debug prints
 I added on my system were always showing 1.

But does bumping the ref count then create a leak the rest of the time?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Alpha4 release blockers (was Re: wrapping up this CommitFest)

2011-03-02 Thread Jan Urbański

On 02/03/11 14:25, Robert Haas wrote:
 On Wed, Mar 2, 2011 at 6:14 AM, Jan Urbański wulc...@wulczer.org wrote:
 That seems to have fixed it, so I have applied the patch. Would you like
 to supply some comments to got with it?

 The comment would be something like

 /* XXX it appears that in some circumstantes the reference count of the
 spiexceptions module drops to zero causing a Python assert failure when
 the garbage collector visits the module. This has been observed on the
 buildfarm. To fix this, add an additional ref for the module here. */

 I have no idea why the refcount of the module becomes zero, debug prints
 I added on my system were always showing 1.
 
 But does bumping the ref count then create a leak the rest of the time?

Not really, because you never want to garbage collect the spiexceptions
module (just like you don't want to GC th plpy module, or the plpy.info
function etc.). So the reference count of that module should never drop
to zero, but apparently on some machines it does. So just reffing
artificailly is kind of a valid solution, I'm just uneasy with not
knowing why it fails on some machines and does not on others.

Jan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

On Wed, Mar 2, 2011 at 6:22 AM, Simon Riggs si...@2ndquadrant.com wrote:
 The WALSender deliberately does *not* wake waiting users if the standby
 disconnects. Doing so would break the whole reason for having sync rep
 in the first place. What we do is allow a potential standby to takeover
 the role of sync standby, if one is available. Or the failing standby
 can reconnect and then release waiters.

If the transaction would have been allowed to commit without waiting
had the standby not been connected in the first place, then presumably
it should also be allowed to commit if the standby disconnects later,
too.  Otherwise, it doesn't seem very consistent.  A commit should
either wait for a disconnected standby to reconnect, or it should not
wait.  It shouldn't wait in some situations but not others, I think.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

On Wed, Mar 2, 2011 at 8:22 PM, Simon Riggs si...@2ndquadrant.com wrote:
 The WALSender deliberately does *not* wake waiting users if the standby
 disconnects. Doing so would break the whole reason for having sync rep
 in the first place. What we do is allow a potential standby to takeover
 the role of sync standby, if one is available. Or the failing standby
 can reconnect and then release waiters.

If there is potential standby when synchronous standby has gone, I agree
that it's not good idea to release the waiting backends soon. In this case,
those backends should wait for next synchronous standby.

On the other hand, if there is no potential standby, I think that the waiting
backends should not wait for the timeout and should wake up as soon as
synchronous standby has gone. Otherwise, those backends suspend for
a long time (i.e., until the timeout expires), which would decrease the
high-availability, I'm afraid.

Keeping those backends waiting for the failed standby to reconnect is an
idea. But this looks like the behavior for allow_standalone_primary = off.
If allow_standalone_primary = on, it looks more natural to make the
primary work alone without waiting the timeout.

 If we shutdown, then we wait for the shutdown commit record to be
 transferred to our standby, so a normal or fast shutdown of the master
 always leaves all connected standbys up to date. We already do that, so
 sync rep doesn't touch that behaviour. If a standby is disconnected,
 then it doesn't receive the shutdown checkpoint record.

 The wait state for a commit does not persist when we shutdown and
 restart.

 Can you restate which bits of the above you think need to be changed?

What I'm thinking is: when the waiting backends are released because
of the timeout while the fast shutdown is being done in the master,
those backends should not return the success indication to the client.
Of course, in that case, WAL has already been flushed in the master,
but I think that those backends should exit with FATAL error before
returning the success. This is for avoiding breaking the synchronous
replication rule, i.e., all the transaction which the client knows as
committed must be committed in the synchronous standby after failover.

If we allow those backends to return the success in that situation, the
following scenario which can cause a data loss can happen.

1. The primary is running with allow_standalone_primary = on. There
is only one (synchronous) standby connected.
2. The replication connection is closed because of the network outage.
3. While some backends are waiting for replication, the user requests
fast shutdown in the master.
4. Since the timeout expires, those backends stop waiting and return
the success indication to the client (but not replicated to the standby).
5. Since there is no backend waiting for replication, fast shutdown
completes.
6. The clusterware like pacemaker detects the death of the primary
and triggers the failover.
7. New primary doesn't have some transactions committed to the
client, i.e., transaction lost happens!!

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

On Wed, Mar 2, 2011 at 7:40 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On Tue, 2011-03-01 at 15:25 +0900, Fujii Masao wrote:

 No, I've never wished wait-forever option for now. I'd like to make
 the primary work alone when there is no connected standby, for
 high-availability.

 allow_standalone_primary seems to need to be better through than it is
 now, yet neither of us think its worth having.

 If the people that want it can think it through a little better then it
 might make this release, but I propose to remove it from this current
 patch to allow us to commit with greater certainty and fewer bugs.

+1 to remove the wait-forever behavior and the parameter for 9.1.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17


On 02.03.2011 12:40, Simon Riggs wrote:

allow_standalone_primary seems to need to be better through than it is
now, yet neither of us think its worth having.

If the people that want it can think it through a little better then it
might make this release, but I propose to remove it from this current
patch to allow us to commit with greater certainty and fewer bugs.


If you leave it out, then let's rename the feature to semi-synchronous 
replication or such. The point of synchronous replication is 
zero-data-loss, and you don't achieve that with allow_standalone_primary=on.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

On Wed, Mar 2, 2011 at 9:30 AM, Fujii Masao masao.fu...@gmail.com wrote:
 What I'm thinking is: when the waiting backends are released because
 of the timeout while the fast shutdown is being done in the master,
 those backends should not return the success indication to the client.
 Of course, in that case, WAL has already been flushed in the master,
 but I think that those backends should exit with FATAL error before
 returning the success. This is for avoiding breaking the synchronous
 replication rule, i.e., all the transaction which the client knows as
 committed must be committed in the synchronous standby after failover.

That seems like an extremely bad idea.  Now any client that assumes
that FATAL means his transaction didn't commit is broken.  Clients
should be entitled to assume that a successful COMMIT means the
transaction committed (with whatever the operative durability
guarantee is) and that an error means it rolled back.  If the
connection is closed before either one of those things happens, the
client can't assume anything.

It might be reasonable to COMMIT but also issue a warning message, or
to just close the connection without telling the client what happened,
but sending an error seems poor.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

On Wed, Mar 2, 2011 at 9:53 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 02.03.2011 12:40, Simon Riggs wrote:

 allow_standalone_primary seems to need to be better through than it is
 now, yet neither of us think its worth having.

 If the people that want it can think it through a little better then it
 might make this release, but I propose to remove it from this current
 patch to allow us to commit with greater certainty and fewer bugs.

 If you leave it out, then let's rename the feature to semi-synchronous
 replication or such. The point of synchronous replication is
 zero-data-loss, and you don't achieve that with allow_standalone_primary=on.

I think that'd just be adding confusion.  Replication will still be
synchronous; it'll just be more likely to be not happening when you
think it is.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17


On 02.03.2011 17:07, Robert Haas wrote:

On Wed, Mar 2, 2011 at 9:30 AM, Fujii Masaomasao.fu...@gmail.com  wrote:

What I'm thinking is: when the waiting backends are released because
of the timeout while the fast shutdown is being done in the master,
those backends should not return the success indication to the client.
Of course, in that case, WAL has already been flushed in the master,
but I think that those backends should exit with FATAL error before
returning the success. This is for avoiding breaking the synchronous
replication rule, i.e., all the transaction which the client knows as
committed must be committed in the synchronous standby after failover.


That seems like an extremely bad idea.  Now any client that assumes
that FATAL means his transaction didn't commit is broken.  Clients
should be entitled to assume that a successful COMMIT means the
transaction committed (with whatever the operative durability
guarantee is) and that an error means it rolled back.  If the
connection is closed before either one of those things happens, the
client can't assume anything.


To achieve the effect Fujii is looking for, we would have to silently 
drop the connection. That would correctly leave the client not knowing 
whether the transaction committed or not.



It might be reasonable to COMMIT but also issue a warning message, or
to just close the connection without telling the client what happened,
but sending an error seems poor.


Yeah, I guess that would work too, if the client knows to watch out for 
those warnings.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17


On 02.03.2011 17:07, Robert Haas wrote:

On Wed, Mar 2, 2011 at 9:53 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:

On 02.03.2011 12:40, Simon Riggs wrote:


allow_standalone_primary seems to need to be better through than it is
now, yet neither of us think its worth having.

If the people that want it can think it through a little better then it
might make this release, but I propose to remove it from this current
patch to allow us to commit with greater certainty and fewer bugs.


If you leave it out, then let's rename the feature to semi-synchronous
replication or such. The point of synchronous replication is
zero-data-loss, and you don't achieve that with allow_standalone_primary=on.


I think that'd just be adding confusion.  Replication will still be
synchronous; it'll just be more likely to be not happening when you
think it is.


The defining property of synchronous replication is that when a 
transaction is acknowledged as committed to the client, it has also been 
replicated to the standby. You don't achieve that with 
allow_standalone_primary=on, plain and simple. That's fine for a lot of 
people, most people don't actually want synchronous replication because 
they're not willing to pay the availability penalty. But IMHO it would 
be disingenuous to call it synchronous replication if you can't achieve 
zero data loss with it.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Alpha4 release blockers (was Re: wrapping up this CommitFest)

=?UTF-8?B?SmFuIFVyYmHFhHNraQ==?= wulc...@wulczer.org writes:
 On 02/03/11 14:25, Robert Haas wrote:
 But does bumping the ref count then create a leak the rest of the time?

 Not really, because you never want to garbage collect the spiexceptions
 module (just like you don't want to GC th plpy module, or the plpy.info
 function etc.). So the reference count of that module should never drop
 to zero, but apparently on some machines it does. So just reffing
 artificailly is kind of a valid solution, I'm just uneasy with not
 knowing why it fails on some machines and does not on others.

Yeah, that last point makes me nervous too.  A look into the Fedora
repository shows that the python version shipped in F13 is rather
heavily patched:
http://pkgs.fedoraproject.org/gitweb/?p=python.git;a=tree;h=refs/heads/f13/master;hb=refs/heads/f13/master
It's not clear to me which of their changes from a stock build might
be at issue, though, and even less clear whether they introduced a
bug or did something to expose a bug of ours.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Alpha4 release blockers (was Re: wrapping up this CommitFest)

2011-03-02 Thread Jan Urbański

On 02/03/11 16:28, Tom Lane wrote:
=?UTF-8?B?SmFuIFVyYmHFhHNraQ==?= wulc...@wulczer.org writes:
On 02/03/11 14:25, Robert Haas wrote:
But does bumping the ref count then create a leak the rest of the time?

Not really, because you never want to garbage collect the spiexceptions
module (just like you don't want to GC th plpy module, or the plpy.info
function etc.). So the reference count of that module should never drop
to zero, but apparently on some machines it does. So just reffing
artificailly is kind of a valid solution, I'm just uneasy with not
knowing why it fails on some machines and does not on others.

Yeah, that last point makes me nervous too. A look into the Fedora
repository shows that the python version shipped in F13 is rather
heavily patched:
http://pkgs.fedoraproject.org/gitweb/?p=python.git;a=tree;h=refs/heads/f13/master;hb=refs/heads/f13/master
It's not clear to me which of their changes from a stock build might
be at issue, though, and even less clear whether they introduced a
bug or did something to expose a bug of ours.

FWIW I looked at these patches yesterday when I was trying to reproduce
the bug, but did not find anything interesting. It's mostly for stuff in
the standard library. I haven't tried building Python with all of of
these patches though, and did not find an easy way to rebuild a SRPM on
a Debian system. I'm also wondering if it can be a 32 vs 64 bit issue?...

Jan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

2011-03-02 Thread Dimitri Fontaine

Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 To achieve the effect Fujii is looking for, we would have to silently drop
 the connection. That would correctly leave the client not knowing whether
 the transaction committed or not.

+1

 It might be reasonable to COMMIT but also issue a warning message, or
 to just close the connection without telling the client what happened,
 but sending an error seems poor.

 Yeah, I guess that would work too, if the client knows to watch out for
 those warnings.

-1

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
 
 The defining property of synchronous replication is that when a 
 transaction is acknowledged as committed to the client, it has
 also been replicated to the standby. You don't achieve that with 
 allow_standalone_primary=on, plain and simple. That's fine for a
 lot of people, most people don't actually want synchronous
 replication because they're not willing to pay the availability
 penalty. But IMHO it would be disingenuous to call it synchronous
 replication if you can't achieve zero data loss with it.
 
Right.  While there may be more people who favor high availability
than the guarantees of synchronous replication, let's not blur the
lines by mislabeling things.  It's not synchronous replication if a
commit returns successfully without the data being persisted on a
second server.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

2011-03-02 Thread Aidan Van Dyk

On Wed, Mar 2, 2011 at 2:30 PM, Fujii Masao masao.fu...@gmail.com wrote:

 1. The primary is running with allow_standalone_primary = on. There
    is only one (synchronous) standby connected.

OK.  Explicitly configured to allow the master to report as commited
stuff which isn't on a/any slave.

 7. New primary doesn't have some transactions committed to the
    client, i.e., transaction lost happens!!

And this is a surprise?

I'm not saying there isn't a better way to to sequence/control a
shutdown to make this risk less, but isn't that the whole point of the
allow_standalone_primary debate/option?

If there isn't a sync slave for whatever reason, just march on, I'll
deal with the transactions that are committed and not replicated some
other way.

I guess complaining that it shouldn't be possible to just march on
when no sync slave is available is one possible way oof dealing
with them ;-)

a.

-- 
Aidan Van Dyk                                             Create like a god,
ai...@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Alpha4 release blockers (was Re: wrapping up this CommitFest)

=?UTF-8?B?SmFuIFVyYmHFhHNraQ==?= wulc...@wulczer.org writes:
 FWIW I looked at these patches yesterday when I was trying to reproduce
 the bug, but did not find anything interesting. It's mostly for stuff in
 the standard library. I haven't tried building Python with all of of
 these patches though, and did not find an easy way to rebuild a SRPM on
 a Debian system. I'm also wondering if it can be a 32 vs 64 bit issue?...

Well, we can eliminate that last theory, because there were both 32 and
64 bit buildfarm machines showing the crash, cf bobcat and crake.
BTW, I see the former is now running F14, not F13 as claimed on the
buildfarm dashboard, so it doesn't look to be specific to a particular
Fedora version either.  I'm a bit tempted to install F15 alpha and see
if it's still there ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Alpha4 release blockers (was Re: wrapping up this CommitFest)




On 03/02/2011 11:49 AM, Tom Lane wrote:

Well, we can eliminate that last theory, because there were both 32 and
64 bit buildfarm machines showing the crash, cf bobcat and crake.
BTW, I see the former is now running F14, not F13 as claimed on the
buildfarm dashboard,



That's because David apparently hasn't run update_personality.pl, 
although he has in the past.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Alpha4 release blockers (was Re: wrapping up this CommitFest)

2011-03-02 Thread David Fetter

On Wed, Mar 02, 2011 at 12:02:30PM -0500, Andrew Dunstan wrote:
 On 03/02/2011 11:49 AM, Tom Lane wrote:
 Well, we can eliminate that last theory, because there were both 32 and
 64 bit buildfarm machines showing the crash, cf bobcat and crake.
 BTW, I see the former is now running F14, not F13 as claimed on the
 buildfarm dashboard,
 
 That's because David apparently hasn't run update_personality.pl,
 although he has in the past.

Oops!  Sorry about that!  Done :)

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Problem with composite type creation in C under Linux

2011-03-02 Thread Marios Vodas

I have developed some custom composite and base types in PostgreSQL 9 which
you can find in the code I provide below.
I compile my C library using GCC 4.5 under Linux and Visual Studio 2010
under Windows.

The problem is when I run this command: *SELECT to_composite('((1, 2), (3,
4))'::m_segment_base)*.
This is the result I get in Windows and Linux respectively:
Windows: ((1,2),(3,4))
Linux: ((1,),)

Probably the composite type is not created well in function to_composite,
but I guess the code should be OK since it works under MSVC...
*So my question is why this doesn't work under Linux? And how can I fix it
of course.*

Here is my code in SQL and C:

***SQL code***

CREATE TYPE m_point_composite AS
 (
 x double precision,
 y double precision
 );

 CREATE TYPE m_segment_composite AS
 (
 i m_point_composite,
 e m_point_composite
 );

 CREATE TYPE m_point_base
 (
 INTERNALLENGTH = 16,
 INPUT = m_point_base_in,
 OUTPUT = m_point_base_out,
 RECEIVE = m_point_base_recv,
 SEND = m_point_base_send,
 ALIGNMENT = double
 );


 CREATE TYPE m_segment_base
 (
 INTERNALLENGTH = 32,
 INPUT = m_segment_base_in,
 OUTPUT = m_segment_base_out,
 RECEIVE = m_segment_base_recv,
 SEND = m_segment_base_send,
 ALIGNMENT = double
 );


 CREATE OR REPLACE FUNCTION to_composite(m_segment_base)
 RETURNS m_segment_composite AS
 '$libdir/myLib','to_composite'
 LANGUAGE 'C' IMMUTABLE STRICT;

 ***C code***

struct m_point_base {
 float8 x;
 float8 y;
 };


 struct m_segment_base {
 m_point_base i;
 m_point_base e;
 };


 PGDLLEXPORT Datum to_composite(PG_FUNCTION_ARGS) {

m_segment_base *in = (m_segment_base *) PG_GETARG_POINTER(0);

 TupleDesc tupdesc;
 bool isnull;

 Datum vi[2];
 HeapTuple i;

 Datum ve[2];
 HeapTuple e;

 Datum vout[2];
 HeapTuple out;

 TupleDesc tupdesc_m_point_composite =
 RelationNameGetTupleDesc(m_point_composite);

 vi[0] = Float8GetDatum(in-i.x);
 vi[1] = Float8GetDatum(in-i.y);
 i = heap_form_tuple(tupdesc_m_point_composite, vi, isnull);

 ve[0] = Float8GetDatum(in-e.x);
 ve[1] = Float8GetDatum(in-e.y);
 e = heap_form_tuple(tupdesc_m_point_composite, ve, isnull);

 vout[0] = HeapTupleGetDatum(i);
 vout[1] = HeapTupleGetDatum(e);

 if (get_call_result_type(fcinfo, NULL, tupdesc) != TYPEFUNC_COMPOSITE)
 {
 ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 errmsg(function returning record called in context that cannot accept type
 record)));
 }

 BlessTupleDesc(tupdesc);
 out = heap_form_tuple(tupdesc, vout, isnull);

 PG_RETURN_DATUM(HeapTupleGetDatum(out));

}
 PG_FUNCTION_INFO_V1(to_composite)
 ;

Re: [HACKERS] Problem with composite type creation in C under Linux

Marios Vodas mvo...@gmail.com writes:
 I have developed some custom composite and base types in PostgreSQL 9 which
 you can find in the code I provide below.
 I compile my C library using GCC 4.5 under Linux and Visual Studio 2010
 under Windows.

 The problem is when I run this command: *SELECT to_composite('((1, 2), (3,
 4))'::m_segment_base)*.
 This is the result I get in Windows and Linux respectively:
 Windows: ((1,2),(3,4))
 Linux: ((1,),)

You're passing a single isnull flag to heap_form_tuple calls that are
expecting arrays of two flags ... and even the one value that's there
according to the code isn't being initialized, so it's remarkable that
this code works at all on any platform.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

2011-03-02 Thread Merlin Moncure

On Wed, Mar 2, 2011 at 9:58 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 To achieve the effect Fujii is looking for, we would have to silently drop
 the connection. That would correctly leave the client not knowing whether
 the transaction committed or not.

 +1

 It might be reasonable to COMMIT but also issue a warning message, or
 to just close the connection without telling the client what happened,
 but sending an error seems poor.

 Yeah, I guess that would work too, if the client knows to watch out for
 those warnings.

 -1

yeah, unless by warning, you meant 'error'.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Alpha4 release blockers (was Re: wrapping up this CommitFest)

On Mar 2, 2011, at 9:02 AM, Andrew Dunstan wrote:

 That's because David apparently hasn't run update_personality.pl, although he 
 has in the past.

Is this something we can run against crazier community members?

Best,

David


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

On Wed, Mar 2, 2011 at 12:39 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Wed, Mar 2, 2011 at 9:58 AM, Dimitri Fontaine dimi...@2ndquadrant.fr 
 wrote:
 Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 To achieve the effect Fujii is looking for, we would have to silently drop
 the connection. That would correctly leave the client not knowing whether
 the transaction committed or not.

 +1

 It might be reasonable to COMMIT but also issue a warning message, or
 to just close the connection without telling the client what happened,
 but sending an error seems poor.

 Yeah, I guess that would work too, if the client knows to watch out for
 those warnings.

 -1

 yeah, unless by warning, you meant 'error'.

Well, as mentioned upthread, throwing an error when the transaction is
actually committed seems poor.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Native XML

2011-03-02 Thread Nicolas Barbier

2011/3/1 Andrew Dunstan and...@dunslane.net:

 I think hierarchical data really only scratches the surface of the problem.
 It would be nice to be able to specify all sorts of context for searches:

   * foo after bar
   * foo near bar
   * foo and bar in the same paragraph
   * foo as a parent/child/ancestor/descendent/sibling/cousin of bar

I wonder whether you are deliberately describing XPath here? :-)

Nicolas

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Problem with composite type creation in C under Linux

2011-03-02 Thread Marios Vodas

Thank you! now I understand it...

On Wed, Mar 2, 2011 at 7:35 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Marios Vodas mvo...@gmail.com writes:
  I have developed some custom composite and base types in PostgreSQL 9
 which
  you can find in the code I provide below.
  I compile my C library using GCC 4.5 under Linux and Visual Studio 2010
  under Windows.

  The problem is when I run this command: *SELECT to_composite('((1, 2),
 (3,
  4))'::m_segment_base)*.
  This is the result I get in Windows and Linux respectively:
  Windows: ((1,2),(3,4))
  Linux: ((1,),)

 You're passing a single isnull flag to heap_form_tuple calls that are
 expecting arrays of two flags ... and even the one value that's there
 according to the code isn't being initialized, so it's remarkable that
 this code works at all on any platform.

regards, tom lane

[HACKERS] WAL segments pile up during standalone mode

2011-03-02 Thread Alvaro Herrera

I noticed that in standalone mode, WAL segments don't seem to be
recycled.  This could get problematic if you're forced to vacuum large
tables in that mode and space for WAL is short.

I can reproduce in HEAD easily by doing a large bulk insertion in
standalone mode.  If I stop the server, start in normal mode and
shutdown (or checkpoint), the extra segments go away as expected.

This was reported in pgsql-es-ayuda yesterday.

-- 
Álvaro Herrera alvhe...@alvh.no-ip.org

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] [PATCH] Add tab completion support for JOIN

2011-03-02 Thread Andrey Popp

Hello,

I've produced a dumb patch for psql which allow to use tab completion after 
JOIN keyword.
Patch was done against 2f6c8453cf3f38a70adbcb59489630cd5be92570 revision from 
GitHub mirror.



join_completion.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Alpha4 release blockers (was Re: wrapping up this CommitFest)

2011-03-02 Thread Alvaro Herrera

Excerpts from Andrew Dunstan's message of mié mar 02 14:02:30 -0300 2011:
 
 On 03/02/2011 11:49 AM, Tom Lane wrote:
  Well, we can eliminate that last theory, because there were both 32 and
  64 bit buildfarm machines showing the crash, cf bobcat and crake.
  BTW, I see the former is now running F14, not F13 as claimed on the
  buildfarm dashboard,
 
 
 That's because David apparently hasn't run update_personality.pl, 
 although he has in the past.

Does this also explain that moa reports being GCC while it's actually
Sun Studio?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Alpha4 release blockers (was Re: wrapping up this CommitFest)

2011-03-02 Thread Dave Page

On Thu, Mar 3, 2011 at 12:42 AM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 Excerpts from Andrew Dunstan's message of mié mar 02 14:02:30 -0300 2011:

 On 03/02/2011 11:49 AM, Tom Lane wrote:
  Well, we can eliminate that last theory, because there were both 32 and
  64 bit buildfarm machines showing the crash, cf bobcat and crake.
  BTW, I see the former is now running F14, not F13 as claimed on the
  buildfarm dashboard,


 That's because David apparently hasn't run update_personality.pl,
 although he has in the past.

 Does this also explain that moa reports being GCC while it's actually
 Sun Studio?

moa has never changed - but there was a mixup with huia's keys when
they were first registered on the buildfarm. I wonder if it wasn't the
keys, but the rest of the info that was actually confused.

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Porting PostgreSQL to DragonFly BSD

2011-03-02 Thread Peter Eisentraut

On ons, 2011-03-02 at 09:10 +0100, Rumko wrote:
 What about this patch ( 
 http://www.rumko.net/0001-DragonFly-BSD-support-linked-nbsd.patch )?
 instead of linking to freebsd, it's linked to netbsd and It still
 compiles due to the two templates being similar enough.

Looks good.  Committed.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Alpha4 release blockers (was Re: wrapping up this CommitFest)




On 03/02/2011 02:12 PM, Alvaro Herrera wrote:

Excerpts from Andrew Dunstan's message of mié mar 02 14:02:30 -0300 2011:

On 03/02/2011 11:49 AM, Tom Lane wrote:

Well, we can eliminate that last theory, because there were both 32 and
64 bit buildfarm machines showing the crash, cf bobcat and crake.
BTW, I see the former is now running F14, not F13 as claimed on the
buildfarm dashboard,


That's because David apparently hasn't run update_personality.pl,
although he has in the past.

Does this also explain that moa reports being GCC while it's actually
Sun Studio?


No, that's just plain wrong. update_personality lets you change the OS 
and compiler version, but not the name or either. If you change OS or 
compiler you need a new animal.


In this case it just looks like it was misdescribed from the start, so 
I'll fix it in the database.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Alpha4 release blockers (was Re: wrapping up this CommitFest)




On 03/02/2011 02:16 PM, Dave Page wrote:

On Thu, Mar 3, 2011 at 12:42 AM, Alvaro Herrera
alvhe...@commandprompt.com  wrote:

Excerpts from Andrew Dunstan's message of mié mar 02 14:02:30 -0300 2011:

On 03/02/2011 11:49 AM, Tom Lane wrote:

Well, we can eliminate that last theory, because there were both 32 and
64 bit buildfarm machines showing the crash, cf bobcat and crake.
BTW, I see the former is now running F14, not F13 as claimed on the
buildfarm dashboard,


That's because David apparently hasn't run update_personality.pl,
although he has in the past.

Does this also explain that moa reports being GCC while it's actually
Sun Studio?

moa has never changed - but there was a mixup with huia's keys when
they were first registered on the buildfarm. I wonder if it wasn't the
keys, but the rest of the info that was actually confused.


Oh, ugh. So if I switch the compiler names and versions on these two 
they will be correct?


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

On Wed, 2011-03-02 at 17:23 +0200, Heikki Linnakangas wrote:
 On 02.03.2011 17:07, Robert Haas wrote:
  On Wed, Mar 2, 2011 at 9:53 AM, Heikki Linnakangas
  heikki.linnakan...@enterprisedb.com  wrote:
  On 02.03.2011 12:40, Simon Riggs wrote:
 
  allow_standalone_primary seems to need to be better through than it is
  now, yet neither of us think its worth having.
 
  If the people that want it can think it through a little better then it
  might make this release, but I propose to remove it from this current
  patch to allow us to commit with greater certainty and fewer bugs.
 
  If you leave it out, then let's rename the feature to semi-synchronous
  replication or such. The point of synchronous replication is
  zero-data-loss, and you don't achieve that with 
  allow_standalone_primary=on.
 
  I think that'd just be adding confusion.  Replication will still be
  synchronous; it'll just be more likely to be not happening when you
  think it is.
 
 The defining property of synchronous replication is that when a 
 transaction is acknowledged as committed to the client, it has also been 
 replicated to the standby. You don't achieve that with 
 allow_standalone_primary=on, plain and simple. That's fine for a lot of 
 people, most people don't actually want synchronous replication because 
 they're not willing to pay the availability penalty. But IMHO it would 
 be disingenuous to call it synchronous replication if you can't achieve 
 zero data loss with it.

I agree with some of what you say, but that focuses on one corner case
and not on the whole solution.

Yes, if we pick the allow_standalone_primary=on behaviour AND you are
stupid enough to lose *all* of your sync standbys then you may get data
loss.

What I've spent a lot of time doing is trying to ensure that we never
lose all our sync standbys, via clear recommendation to use multiple
servers AND functionality to allow that the standbys work together to
give true high availability without data loss. The current design allows
for an arbitrary number of potential standby servers so your risk of
data loss can be as many 9s as you care to make it.

Not really bothered what we call it, but if you intend to make a song
and dance about whether it is proper or not, then I would object to
that because you're not presenting the full situation.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perl 5.12 complains about ecpg parser-hacking scripts

2011-03-02 Thread Andy Colson


On 1/23/2011 5:11 AM, Michael Meskes wrote:

On Sat, Jan 22, 2011 at 08:40:13PM -0500, Andrew Dunstan wrote:

I think these really need to be rewritten from scratch. They look
like they were written by someone who never heard of Perl 5 (it's
only about 16 years old).


You might remember that we had this discussion before. The script was written
in awk and then translated to perl by a2p. We knew that this code was less than
optimal, but at least it works.

As I already said when the script was introduced, I would love to have a real
perl solution, but I'm not a perl programmer by any means.

Michael


I thought Kris was going to work on this, but saw no progress, and I was 
bored the other day, so I started working on it.


Here is a parse.pl, with some major refactoring.

I named it with a 2 so I could run it beside the original and diff em:

time perl parse.pl  .  ../../../backend/parser/gram.y  preproc.y

real0m10.636s
user0m8.793s
sys 0m0.050s


time perl  parse2.pl  .  ../../../backend/parser/gram.y  preproc2.y

real0m0.289s
user0m0.214s
sys 0m0.009s


diff preproc.y preproc2.y

I am sure there are new bugs.  I have not run it on anything but 9.0.1. 
 Are there other .y files you might feed it? (something other than 
backend/parser/gram.y?)


I touched pretty much everything, and I broke a few things and had to go 
back and fix 'em, so it would not surprise me at all if there were still 
a few bugs.


Anyway, feedback, suggestions, etc are welcomed.  (oh, yeah, I probably 
need to go add some documentation)


-Andy
#!/usr/bin/perl
# src/interfaces/ecpg/preproc/parse.pl
# parser generater for ecpg
# call with backend parser as stdin
#
# Copyright (c) 2007-2010, PostgreSQL Global Development Group
#
# Written by Mike Aubury mike.aub...@aubit.com
#Michael Meskes mes...@postgresql.org
#
# Placed under the same license as PostgreSQL.
#

use strict;
use warnings;
no warnings 'uninitialized';

my $path = shift @ARGV;
$path = . unless $path;

my $copymode  = 0;
my $brace_indent  = 0;
my $yaccmode  = 0;
my $header_included   = 0;
my $feature_not_supported = 0;
my $tokenmode = 0;

my(%buff, $infield, $comment, %tokens, %addons );
my($stmt_mode, @fields);
my($line, $non_term_id);


# some token have to be replaced by other symbols
# either in the rule
my %replace_token = (
'BCONST' = 'ecpg_bconst',
'FCONST' = 'ecpg_fconst',
'Sconst' = 'ecpg_sconst',
'IDENT'  = 'ecpg_ident',
'PARAM'  = 'ecpg_param',
);

# or in the block
my %replace_string = (
'WITH_TIME'= 'with time',
'NULLS_FIRST'  = 'nulls first',
'NULLS_LAST'   = 'nulls last',
'TYPECAST' = '::',
'DOT_DOT'  = '..',
'COLON_EQUALS' = ':=',
);

# specific replace_types for specific non-terminals - never include the ':'
# ECPG-only replace_types are defined in ecpg-replace_types
my %replace_types = (
'PrepareStmt'  = 'prep',
'opt_array_bounds' = 'index',

# ignore means: do not create type and rules for this non-term-id
'stmtblock'  = 'ignore',
'stmtmulti'  = 'ignore',
'CreateAsStmt'   = 'ignore',
'DeallocateStmt' = 'ignore',
'ColId'  = 'ignore',
'type_function_name' = 'ignore',
'ColLabel'   = 'ignore',
'Sconst' = 'ignore',
);

# these replace_line commands excise certain keywords from the core keyword
# lists.  Be sure to account for these in ColLabel and related productions.
my %replace_line = (
'unreserved_keywordCONNECTION' = 'ignore',
'unreserved_keywordCURRENT_P'  = 'ignore',
'unreserved_keywordDAY_P'  = 'ignore',
'unreserved_keywordHOUR_P' = 'ignore',
'unreserved_keywordINPUT_P'= 'ignore',
'unreserved_keywordMINUTE_P'   = 'ignore',
'unreserved_keywordMONTH_P'= 'ignore',
'unreserved_keywordSECOND_P'   = 'ignore',
'unreserved_keywordYEAR_P' = 'ignore',
'col_name_keywordCHAR_P'   = 'ignore',
'col_name_keywordINT_P'= 'ignore',
'col_name_keywordVALUES'   = 'ignore',
'reserved_keywordTO'   = 'ignore',
'reserved_keywordUNION'= 'ignore',

# some other production rules have to be ignored or replaced
'fetch_argsFORWARDopt_from_incursor_name'  = 'ignore',
'fetch_argsBACKWARDopt_from_incursor_name' = 'ignore',
opt_array_boundsopt_array_bounds'['Iconst']' = 'ignore',
'VariableShowStmtSHOWvar_name' = 'SHOW var_name 
ecpg_into',
'VariableShowStmtSHOWTIMEZONE' = 'SHOW TIME ZONE ecpg_into',
'VariableShowStmtSHOWTRANSACTIONISOLATIONLEVEL' = 'SHOW TRANSACTION 
ISOLATION LEVEL ecpg_into',
'VariableShowStmtSHOWSESSIONAUTHORIZATION' = 'SHOW SESSION 
AUTHORIZATION ecpg_into',

Re: [HACKERS] Alpha4 release blockers (was Re: wrapping up this CommitFest)

2011-03-02 Thread Dave Page

On Thu, Mar 3, 2011 at 12:54 AM, Andrew Dunstan and...@dunslane.net wrote:


 On 03/02/2011 02:16 PM, Dave Page wrote:

 On Thu, Mar 3, 2011 at 12:42 AM, Alvaro Herrera
 alvhe...@commandprompt.com  wrote:

 Excerpts from Andrew Dunstan's message of mié mar 02 14:02:30 -0300 2011:

 On 03/02/2011 11:49 AM, Tom Lane wrote:

 Well, we can eliminate that last theory, because there were both 32 and
 64 bit buildfarm machines showing the crash, cf bobcat and crake.
 BTW, I see the former is now running F14, not F13 as claimed on the
 buildfarm dashboard,

 That's because David apparently hasn't run update_personality.pl,
 although he has in the past.

 Does this also explain that moa reports being GCC while it's actually
 Sun Studio?

 moa has never changed - but there was a mixup with huia's keys when
 they were first registered on the buildfarm. I wonder if it wasn't the
 keys, but the rest of the info that was actually confused.

 Oh, ugh. So if I switch the compiler names and versions on these two they
 will be correct?

Should be. Moa is definitely Sun Studio:

-bash-3.00$ /opt/sunstudio12.1/bin/cc -V
cc: Sun C 5.10 SunOS_i386 2009/06/03
usage: cc [ options] files.  Use 'cc -flags' for details

And Huia is GCC:

-bash-3.00$ /usr/sfw/bin/gcc --version
gcc (GCC) 3.4.3 (csl-sol210-3_4-20050802)

Thanks.

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

Teodor Sigaev teo...@sigaev.ru writes:
 [ builtin_knngist_contrib_btree_gist-0.12 patch ]

Applied with some corrections --- mostly, that the upgrade script was
all wet.  I added some documentation too.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

On Wed, 2011-03-02 at 16:53 +0200, Heikki Linnakangas wrote:
 On 02.03.2011 12:40, Simon Riggs wrote:
  allow_standalone_primary seems to need to be better through than it is
  now, yet neither of us think its worth having.
 
  If the people that want it can think it through a little better then it
  might make this release, but I propose to remove it from this current
  patch to allow us to commit with greater certainty and fewer bugs.
 
 If you leave it out, then let's rename the feature to semi-synchronous 
 replication or such. The point of synchronous replication is 
 zero-data-loss, and you don't achieve that with allow_standalone_primary=on.

The reason I have suggested leaving that parameter out is because the
behaviour is not fully specified and Yeb has reported cases that don't
(yet) make sense. If you want to fully specify it then we could yet add
it, assuming we have time.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Testing extension upgrade scripts

It occurred to me that it might be a good idea to describe how
I've been testing extension upgrade scripts.  So:

1. Install the 9.0 version of the module in an empty 9.0 database.
pg_dump this database.

2. Load the pg_dump script into an empty 9.1 database, with the
underlying shared library (if any) available in $libdir but not
installed as SQL.  (If this fails, probably you removed a C
function from the library.)

3. Execute CREATE EXTENSION whatever FROM unpackaged;.  (If this
fails, obviously you have work to do.)

4. pg_dump --binary-upgrade from the 9.1 database.  (You need
--binary-upgrade or pg_dump won't show the member objects of the
extension.)

5. Install the 9.1 version of the extension in another empty
database, using plain CREATE EXTENSION whatever.  Then
pg_dump --binary-upgrade from that.

6. Diff the two 9.1 dump scripts.  They should be the same except
for OID and frozenxid numbers.  If not, the upgrade script has
missed something it needs to do to update the catalog entries.


Of course, a similar approach will be useful for testing scripts
that are meant to update from one extension version to another,
once we arrive at the point of needing to do that.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17


On 02.03.2011 21:48, Simon Riggs wrote:

On Wed, 2011-03-02 at 16:53 +0200, Heikki Linnakangas wrote:

On 02.03.2011 12:40, Simon Riggs wrote:

allow_standalone_primary seems to need to be better through than it is
now, yet neither of us think its worth having.

If the people that want it can think it through a little better then it
might make this release, but I propose to remove it from this current
patch to allow us to commit with greater certainty and fewer bugs.


If you leave it out, then let's rename the feature to semi-synchronous
replication or such. The point of synchronous replication is
zero-data-loss, and you don't achieve that with allow_standalone_primary=on.


The reason I have suggested leaving that parameter out is because the
behaviour is not fully specified and Yeb has reported cases that don't
(yet) make sense. If you want to fully specify it then we could yet add
it, assuming we have time.


Fair enough. All I'm saying is that if we end up shipping without that 
parameter (implying allow_standalone_primary=on), we need to call the 
feature something else. The GUCs and code can probably stay as it is, 
but we shouldn't use the term synchronous replication in the 
documentation, and release notes and such.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] ALTER TABLE deadlock with concurrent INSERT

2011-03-02 Thread Joe Conway

I'm working with a client on an application upgrade script which
executes a function to conditionally do an:

  ALTER TABLE foo ALTER COLUMN bar SET DATA TYPE baz

If this is run while the application is concurrently doing inserts into
foo, we are occasionally seeing deadlocks. Aside from the fact that they
are better off not altering the table amid concurrent inserts, I'm
trying to understand why this is even able to happen. I expect one to
block the other, not a deadlock.

This is 8.4.1 (I know, I know, I have advised strongly that they upgrade
to 8.4.latest).

We have not been able to repeat this forcibly. Here is what the log shows:
--
2011-02-25 14:38:07 PST [31686]: [1-1] ERROR:  deadlock detected
2011-02-25 14:38:07 PST [31686]: [2-1] DETAIL:  Process 31686 waits for
AccessExclusiveLock on relation 16896 of database 16386; blocked by
process 31634.
Process 31634 waits for RowExclusiveLock on relation 16902 of
database 16386; blocked by process 31686.
Process 31686: SELECT change_column_type('attribute_summary',
'sequence_number', 'numeric');
Process 31634: insert into attribute_summary (attribute_value,
sequence_number, attribute_id) values ($1, $2, $3)
2011-02-25 14:38:07 PST [31686]: [3-1] HINT:  See server log for query
details.
2011-02-25 14:38:07 PST [31686]: [4-1] CONTEXT:  SQL statement ALTER
TABLE attribute_summary ALTER COLUMN sequence_number SET DATA TYPE numeric
PL/pgSQL function change_column_type line 18 at EXECUTE statement
2011-02-25 14:38:07 PST [31686]: [5-1] STATEMENT:  SELECT
change_column_type('attribute_summary', 'sequence_number', 'numeric');
--

Reviewing the release notes, I see some marginally related commits, but
nothing that jumps out to me as a specific fix. Thoughts?

Thanks,

Joe


-- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting,  24x7 Support



signature.asc
Description: OpenPGP digital signature

Re: [HACKERS] Sync Rep v17

Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
 
 All I'm saying is that if we end up shipping without that
 parameter (implying allow_standalone_primary=on), we need to call
 the feature something else. The GUCs and code can probably stay as
 it is, but we shouldn't use the term synchronous replication in
 the documentation, and release notes and such.
 
I think including synchronous is OK as long as it's properly
qualified.  Off-hand thoughts in no particular order:
 
semi-synchronous
conditionally synchronous
synchronous with automatic failover to standalone
 
Perhaps the qualifications can be relaxed in some places but not
others?  The documentation should certainly emphasize that there is
no guarantee that a successful commit means that the data is on at
least two separate servers, if no such guarantee exists.
 
If we expect that losing all replicas is such a terribly thin long
shot that we don't need to worry about this difference, it is such a
long shot that we don't need to worry about wait forever behavior,
either; and we should just implement *that* so that no qualification
is needed.  I think that is an odd assumption, though; and I think a
HA failover to weaker persistence guarantees in exchange for
increased up-time would be popular.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perl 5.12 complains about ecpg parser-hacking scripts

Andy Colson a...@squeakycode.net writes:
 Here is a parse.pl, with some major refactoring.

 I am sure there are new bugs.  I have not run it on anything but 9.0.1. 
   Are there other .y files you might feed it? (something other than 
 backend/parser/gram.y?)

That's the only file it has to work for.  You could try it against 8.4
and HEAD versions as well as 9.0, but I'm not sure how much additional
test coverage that will get you.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

2011-03-02 Thread Joshua D. Drake

On Wed, 2011-03-02 at 14:26 -0600, Kevin Grittner wrote:
 Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
  
  All I'm saying is that if we end up shipping without that
  parameter (implying allow_standalone_primary=on), we need to call
  the feature something else. The GUCs and code can probably stay as
  it is, but we shouldn't use the term synchronous replication in
  the documentation, and release notes and such.
  
 I think including synchronous is OK as long as it's properly
 qualified.  Off-hand thoughts in no particular order:
  
 semi-synchronous 

You mean asynchronous

 conditionally synchronous

You mean asynchronous

JD




-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

On Wed, 2011-03-02 at 22:10 +0200, Heikki Linnakangas wrote:
 On 02.03.2011 21:48, Simon Riggs wrote:
  On Wed, 2011-03-02 at 16:53 +0200, Heikki Linnakangas wrote:
  On 02.03.2011 12:40, Simon Riggs wrote:
  allow_standalone_primary seems to need to be better through than it is
  now, yet neither of us think its worth having.
 
  If the people that want it can think it through a little better then it
  might make this release, but I propose to remove it from this current
  patch to allow us to commit with greater certainty and fewer bugs.
 
  If you leave it out, then let's rename the feature to semi-synchronous
  replication or such. The point of synchronous replication is
  zero-data-loss, and you don't achieve that with 
  allow_standalone_primary=on.
 
  The reason I have suggested leaving that parameter out is because the
  behaviour is not fully specified and Yeb has reported cases that don't
  (yet) make sense. If you want to fully specify it then we could yet add
  it, assuming we have time.
 
 Fair enough. All I'm saying is that if we end up shipping without that 
 parameter (implying allow_standalone_primary=on), we need to call the 
 feature something else. The GUCs and code can probably stay as it is, 
 but we shouldn't use the term synchronous replication in the 
 documentation, and release notes and such.

allow_standalone_primary=off means wait forever. It does nothing to
reduce data loss since you can't replicate to a server that isn't there.

As we discussed it, allow_standalone_primary=off was not a persistent
state, so shutting down the database would simply leave the data
committed. Which gives the same problem, but implicitly.

Truly synchronous requires two-phase commit, which this never was. So
the absence or presence of the poorly specified parameter called
allow_standalone_primary should have no bearing on what we call this
feature.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] ALTER TABLE deadlock with concurrent INSERT

Joe Conway m...@joeconway.com writes:
 I'm working with a client on an application upgrade script which
 executes a function to conditionally do an:

   ALTER TABLE foo ALTER COLUMN bar SET DATA TYPE baz

 If this is run while the application is concurrently doing inserts into
 foo, we are occasionally seeing deadlocks. Aside from the fact that they
 are better off not altering the table amid concurrent inserts, I'm
 trying to understand why this is even able to happen. I expect one to
 block the other, not a deadlock.

Looks like the process trying to do the ALTER has already got some
lower-level lock on the table.  It evidently hasn't got
AccessExclusiveLock, but nonetheless has something strong enough to
block an INSERT, such as ShareLock.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17




On 03/02/2011 03:39 PM, Simon Riggs wrote:

Truly synchronous requires two-phase commit, which this never was. So
the absence or presence of the poorly specified parameter called
allow_standalone_primary should have no bearing on what we call this
feature.



I haven't been following this very closely, but to me this screams out 
that we simply must not call it synchronous.


Just my $0.02 worth.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] ALTER TYPE COLLATABLE?

2011-03-02 Thread Peter Eisentraut

On tis, 2011-03-01 at 16:31 -0500, Tom Lane wrote:
 I can't say that this makes me think any better of the design here.
 If a boolean true/false is a sufficient representation of a type's
 collation property, why isn't the column in pg_type just a boolean?
 If the idea of storing an OID is to allow reference to a choice of
 collations, why are we painting ourselves into a corner by dumping
 it as a boolean?

The same column is used for base types, which can only have default
collation or nothing, and domains, which can have any collation.  We
could of course also have two separate columns, one typcollatable
boolean, and the typcollation only used by domains, and an earlier patch
had that, but as it turned out the code that ends up using this is
simplest if there is only one column.  We could also (probably) support
arbitrary nondefault collations on base types, but that sounds a bit
odd, so I wouldn't want to support it yet unless there is a real use
case.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

Simon Riggs si...@2ndquadrant.com wrote:
 
 allow_standalone_primary=off means wait forever. It does nothing
 to reduce data loss since you can't replicate to a server that
 isn't there.
 
Unless you're pulling from some persistent source which will then
feel free to discard what you have retrieved.  You can't assume
otherwise; it's not that rare a pattern for interfaces.  We use such
a pattern for accepting criminal complaints from district attorneys
and sending warrant information to police agencies.  Waiting a long
time (it won't *actually* be forever) is better than losing
information.
 
In other words, making a persistence promise which is not kept can
lose data on the client side, if the clients actually trust your
guarantees.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

On Wed, Mar 2, 2011 at 3:45 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Simon Riggs si...@2ndquadrant.com wrote:

 allow_standalone_primary=off means wait forever. It does nothing
 to reduce data loss since you can't replicate to a server that
 isn't there.

 Unless you're pulling from some persistent source which will then
 feel free to discard what you have retrieved.  You can't assume
 otherwise; it's not that rare a pattern for interfaces.  We use such
 a pattern for accepting criminal complaints from district attorneys
 and sending warrant information to police agencies.  Waiting a long
 time (it won't *actually* be forever) is better than losing
 information.

 In other words, making a persistence promise which is not kept can
 lose data on the client side, if the clients actually trust your
 guarantees.

I agree.  I assumed that when Simon was talking about removing
allow_standalone_primary, he meant making the code always behave as if
it were turned OFF.  The common scenario here is bound to be:

1. Everything is humming along.
2. The network link between the master and standby drops.
3. Then it comes back up again.

After (2) and before (3), what should the behavior the master be?  It
seems clear to me that it should WAIT.  Otherwise, a crash on the
master now leaves you with transactions that were confirmed committed
but not actually replicated to the standby.  If you were OK with that
scenario, you would have used asynchronous replication in the first
place.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] ALTER TABLE deadlock with concurrent INSERT

2011-03-02 Thread Joe Conway

On 03/02/2011 12:41 PM, Tom Lane wrote:
 Looks like the process trying to do the ALTER has already got some
 lower-level lock on the table.  It evidently hasn't got
 AccessExclusiveLock, but nonetheless has something strong enough to
 block an INSERT, such as ShareLock.

Hmmm, is it possible that the following might do that, whereas a simple
ALTER TABLE would not?

8---
BEGIN;

CREATE OR REPLACE FUNCTION change_column_type
(
  tablename text,
  columnname text,
  newtype text
) RETURNS text AS $$
  DECLARE
newtypeid   oid;
tableoidoid;
curtypeid   oid;
  BEGIN
SELECT INTO newtypeid oid FROM pg_type WHERE oid =
  newtype::regtype::oid;
SELECT INTO tableoid oid FROM pg_class WHERE relname = tablename;
IF NOT FOUND THEN
  RETURN 'TABLE NOT FOUND';
END IF;

SELECT INTO curtypeid atttypid FROM pg_attribute WHERE
  attrelid = tableoid AND attname::text = columnname;
IF NOT FOUND THEN
  RETURN 'COLUMN NOT FOUND';
END IF;

IF curtypeid != newtypeid THEN
  EXECUTE 'ALTER TABLE ' || tablename || ' ALTER COLUMN ' ||
  columnname || ' SET DATA TYPE ' || newtype;
  RETURN 'CHANGE SUCCESSFUL';
ELSE
  RETURN 'CHANGE SKIPPED';
END IF;
  EXCEPTION
WHEN undefined_object THEN
  RETURN 'INVALID TARGET TYPE';
  END;
$$ LANGUAGE plpgsql;

SELECT change_column_type('attribute_summary',
  'sequence_number',
  'numeric');

COMMIT;
8---

This text is in a file being run from a shell script with something like:

  psql dbname  script.sql

The concurrent INSERTs are being done by the main application code
(running on Tomcat).

Joe

-- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting,  24x7 Support



signature.asc
Description: OpenPGP digital signature

Re: [HACKERS] ALTER TYPE COLLATABLE?

Peter Eisentraut pete...@gmx.net writes:
 On tis, 2011-03-01 at 16:31 -0500, Tom Lane wrote:
 If a boolean true/false is a sufficient representation of a type's
 collation property, why isn't the column in pg_type just a boolean?
 If the idea of storing an OID is to allow reference to a choice of
 collations, why are we painting ourselves into a corner by dumping
 it as a boolean?

 The same column is used for base types, which can only have default
 collation or nothing, and domains, which can have any collation.

That seems like a 100% arbitrary distinction between base types and
domains, to the detriment of base types, which is odd since in most
other ways base types are much more flexible than domains.

 We
 could of course also have two separate columns, one typcollatable
 boolean, and the typcollation only used by domains, and an earlier patch
 had that, but as it turned out the code that ends up using this is
 simplest if there is only one column.  We could also (probably) support
 arbitrary nondefault collations on base types, but that sounds a bit
 odd, so I wouldn't want to support it yet unless there is a real use
 case.

Well, I think a use case will pop up PDQ --- contrib/citext seems like
the most likely first candidate.

I guess that since the CREATE TYPE parameter is named COLLATABLE,
we could extend in an upward-compatible way by adding a parameter
COLLATION name, but I would just as soon not have a parameter
that's got such an obviously short time-to-live.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

On Wed, 2011-03-02 at 15:50 -0500, Robert Haas wrote:

 I assumed that when Simon was talking about removing
 allow_standalone_primary, he meant making the code always behave as if
 it were turned OFF. 

That is the part that is currently not fully specified, so no that is
not currently included in the patch.

That isn't double-talk for and I will not include it.

What I mean is I'd rather have something than nothing, whatever we
decide to call it.

But the people that want it had better come up with a clear definition
of how it will actually work, covering all cases, not just splish
splash, it kinda works and we'll let Simon will work out the rest.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

Simon Riggs si...@2ndquadrant.com wrote:
 On Wed, 2011-03-02 at 15:50 -0500, Robert Haas wrote:
 
 I assumed that when Simon was talking about removing
 allow_standalone_primary, he meant making the code always behave
 as if it were turned OFF. 
 
 That is the part that is currently not fully specified, so no that
 is not currently included in the patch.
 
 That isn't double-talk for and I will not include it.
 
 What I mean is I'd rather have something than nothing, whatever we
 decide to call it.
 
+1 on that.
 
 But the people that want it had better come up with a clear
 definition of how it will actually work
 
What is ill-defined?  I would have thought that the commit request
would hang indefinitely until the server was able to provide its
usual guarantees.  I'm not clear on what cases aren't covered by
that.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

On Wed, 2011-03-02 at 15:44 -0500, Andrew Dunstan wrote:
 
 On 03/02/2011 03:39 PM, Simon Riggs wrote:
  Truly synchronous requires two-phase commit, which this never was. So
  the absence or presence of the poorly specified parameter called
  allow_standalone_primary should have no bearing on what we call this
  feature.
 
 
 I haven't been following this very closely, but to me this screams out 
 that we simply must not call it synchronous.

As long as we describe it via its characteristics, then I'll be happy:

* significantly reduces the possibility of data loss in a sensibly
configured cluster

* allow arbitrary N+k resilience that can meet and easily exceed
 5 nines data durability

* isn't two phase commit

* isn't a magic bullet that will protect your data even after your
hardware fails or is disconnected

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Quick Extensions Question

It's about dependences.

If my extension requires a procedural language, will adding that language to 
the `requires` control key do what I think it should do?

If not, how should one require a PL? Come to think of it, how might I require 
other features that might not be included in a particular build, like xpath()?

Thanks,

David
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

Simon Riggs si...@2ndquadrant.com writes:
 On Wed, 2011-03-02 at 22:10 +0200, Heikki Linnakangas wrote:
 Fair enough. All I'm saying is that if we end up shipping without that 
 parameter (implying allow_standalone_primary=on), we need to call the 
 feature something else. The GUCs and code can probably stay as it is, 
 but we shouldn't use the term synchronous replication in the 
 documentation, and release notes and such.

 allow_standalone_primary=off means wait forever. It does nothing to
 reduce data loss since you can't replicate to a server that isn't there.

This is irrelevant to the point.  The point is that sync rep implies
that we will not *tell a client* its data is committed unless the commit
is down to disk in two places.  I agree with the people saying that not
having this parameter makes it not real sync rep.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Testing extension upgrade scripts

You should blog this.

David

On Mar 2, 2011, at 11:58 AM, Tom Lane wrote:

 It occurred to me that it might be a good idea to describe how
 I've been testing extension upgrade scripts.  So:
 
 1. Install the 9.0 version of the module in an empty 9.0 database.
 pg_dump this database.
 
 2. Load the pg_dump script into an empty 9.1 database, with the
 underlying shared library (if any) available in $libdir but not
 installed as SQL.  (If this fails, probably you removed a C
 function from the library.)
 
 3. Execute CREATE EXTENSION whatever FROM unpackaged;.  (If this
 fails, obviously you have work to do.)
 
 4. pg_dump --binary-upgrade from the 9.1 database.  (You need
 --binary-upgrade or pg_dump won't show the member objects of the
 extension.)
 
 5. Install the 9.1 version of the extension in another empty
 database, using plain CREATE EXTENSION whatever.  Then
 pg_dump --binary-upgrade from that.
 
 6. Diff the two 9.1 dump scripts.  They should be the same except
 for OID and frozenxid numbers.  If not, the upgrade script has
 missed something it needs to do to update the catalog entries.
 
 
 Of course, a similar approach will be useful for testing scripts
 that are meant to update from one extension version to another,
 once we arrive at the point of needing to do that.
 
   regards, tom lane
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

2011-03-02 Thread Dimitri Fontaine

Robert Haas robertmh...@gmail.com writes:
 1. Everything is humming along.
 2. The network link between the master and standby drops.
 3. Then it comes back up again.

 After (2) and before (3), what should the behavior the master be?  It
 seems clear to me that it should WAIT.  Otherwise, a crash on the

That just means you want data high availability, not service HA.  Some
people want the *service* to stay available in such a situation.

 master now leaves you with transactions that were confirmed committed
 but not actually replicated to the standby.  If you were OK with that
 scenario, you would have used asynchronous replication in the first
 place.

What is so hard to understand in worst case scenario being different
than expected conditions.  We all know that getting the last percent
is more expensive than getting the 99 first one.  We have no reason to
force people into building for the last percent whatever their context.

So, what cases need to be defined wrt forbidding the primary to continue
alone?

 - in flight commit

   blocked until we can offer the asked for durability, wait forever

 - shutdown request

   blocked until standby acknowledge the final checkpoint

   are immediate shutdown requests permitted? what do they do?

What other cases are to be fully designed here?  Please note that above
list is just a way to start the design, not a definitive proposal from
me.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Testing extension upgrade scripts

David E. Wheeler da...@kineticode.com writes:
 You should blog this.

[ shrug... ]  I don't own a blog, and if I did the entries in it would
not be included in the pgsql archives, which is where material like this
probably ought to be.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17




On 03/02/2011 04:13 PM, Simon Riggs wrote:

On Wed, 2011-03-02 at 15:44 -0500, Andrew Dunstan wrote:

On 03/02/2011 03:39 PM, Simon Riggs wrote:

Truly synchronous requires two-phase commit, which this never was. So
the absence or presence of the poorly specified parameter called
allow_standalone_primary should have no bearing on what we call this
feature.


I haven't been following this very closely, but to me this screams out
that we simply must not call it synchronous.

As long as we describe it via its characteristics, then I'll be happy:

* significantly reduces the possibility of data loss in a sensibly
configured cluster

* allow arbitrary N+k resilience that can meet and easily exceed
  5 nines data durability

* isn't two phase commit

* isn't a magic bullet that will protect your data even after your
hardware fails or is disconnected




Ok, so let's call it enhanced safety or something else that isn't a 
term of art.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Testing extension upgrade scripts

2011-03-02 Thread Dimitri Fontaine

David E. Wheeler da...@kineticode.com writes:
 You should blog this.

He just did, didn't he? :)

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [ADMIN] PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum

On Tue, Mar 01, 2011 at 01:20:43PM -0800, daveg wrote:
 On Tue, Mar 01, 2011 at 12:00:54AM +0200, Heikki Linnakangas wrote:
  On 28.02.2011 23:28, daveg wrote:
  On Wed, Jan 12, 2011 at 10:46:14AM +0200, Heikki Linnakangas wrote:
  We'll likely need to go back and forth a few times with various
  debugging patches until we get to the heart of this..
  
  Anything new on this? I'm seeing at on one of my clients production boxes.
  
  I haven't heard anything from the OP since.
  
  Also, what is the significance, ie what is the risk or damage potential if
  this flag is set incorrectly?
  
  Sequential scans will honor the flag, so you might see some dead rows 
  incorrectly returned by a sequential scan. That's the only damage, but 
  an incorrectly set flag could be a sign of something more sinister, like 
  corrupt tuple headers. The flag should never be set incorrectly, so if 
  you see that message you have hit a bug in PostgreSQL, or you have bad 
  hardware.
  
  This flag is quite new, so a bug in PostgreSQL is quite possible. If you 
  still have a backup that contains those incorrectly set flags, I'd like 
  to see what the page looks like.
 
 
 I ran vacuums on all the affected tables last night. I plan to take a downtime
 to clear the buffer cache and then to run vacuums on all the dbs in the
 cluster.
 
 Most but not all the tables involved are catalogs.
 
 However, I could probably pick up your old patch sometime next week if it
 recurrs and send you page images.

After a restart and vacuum of all dbs with no other activity things were
quiet for a couple hours and then we started seeing these PD_ALL_VISIBLE
messages again. 

Going back through the logs we have been getting these since at least before
mid January. Oddly, this only happens on four systems which are all new Dell
32 core Nehalem 512GB machines using iscsi partitions served off a Netapp.
Our older 8 core 64GB hosts have never logged any of these errors. I'm not
saying it is related to the hw, as these hosts are doing a lot more work than
the old hosts so it may be a concurrency problem that just never came up at
lower levels before.

Postgresql version is 8.4.4.

I'll pick up Heikkis page logging patch and run it for a bit to get some
damaged page images. What else could I be doing to track this down?

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

On Wed, Mar 2, 2011 at 4:19 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 Robert Haas robertmh...@gmail.com writes:
 1. Everything is humming along.
 2. The network link between the master and standby drops.
 3. Then it comes back up again.

 After (2) and before (3), what should the behavior the master be?  It
 seems clear to me that it should WAIT.  Otherwise, a crash on the

 That just means you want data high availability, not service HA.  Some
 people want the *service* to stay available in such a situation.

 master now leaves you with transactions that were confirmed committed
 but not actually replicated to the standby.  If you were OK with that
 scenario, you would have used asynchronous replication in the first
 place.

 What is so hard to understand in worst case scenario being different
 than expected conditions.  We all know that getting the last percent
 is more expensive than getting the 99 first one.  We have no reason to
 force people into building for the last percent whatever their context.

I don't understand how synchronous replication with
allow_standalone_primary=on gives you ANY extra nines.  AFAICS, the
only point of having synchronous replication is that you wait to
acknowledge the commit to the client until the commit record has been
replicated.  Doing that only when the standby happens to be connected
doesn't seem like it helps much.

If the master is up, then it doesn't really matter what the standby
does; we don't need high availability in that case, because we have
just plain regular old availability.

If the master goes down, then we need to know that we haven't lost any
confirmed-committed transactions.  With allow_standalone_primary=off,
we don't know that.  They might be, or they might not be.  Even if we
have 100 separate standbys, there is no way of knowing whether there
was a time period just before the crash during which the master
couldn't get out to the Internet, and some commits by clients on the
local network went through.  Maybe with some careful network
engineering you can convince yourself that that isn't very likely, but
I sure wouldn't bet on it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [ADMIN] PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum

2011-03-02 Thread Alvaro Herrera

Excerpts from daveg's message of mié mar 02 18:30:34 -0300 2011:

 After a restart and vacuum of all dbs with no other activity things were
 quiet for a couple hours and then we started seeing these PD_ALL_VISIBLE
 messages again. 
 
 Going back through the logs we have been getting these since at least before
 mid January. Oddly, this only happens on four systems which are all new Dell
 32 core Nehalem 512GB machines using iscsi partitions served off a Netapp.
 Our older 8 core 64GB hosts have never logged any of these errors. I'm not
 saying it is related to the hw, as these hosts are doing a lot more work than
 the old hosts so it may be a concurrency problem that just never came up at
 lower levels before.
 
 Postgresql version is 8.4.4.

I don't see how this could be related, but since you're running on NFS,
maybe it is, somehow:
http://archives.postgresql.org/message-id/4d40ddb7.1010...@credativ.com
(for example what if the visibility map fork's last page is overwritten?)

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

2011-03-02 Thread Yeb Havinga


On 2011-03-02 21:26, Kevin Grittner wrote:


I think including synchronous is OK as long as it's properly
qualified.  Off-hand thoughts in no particular order:

semi-synchronous
conditionally synchronous
synchronous with automatic failover to standalone
It would be good to name the concept equal to how other DBMSses call it, 
if they have a similar concept - don't know if Mysql's semisynchronous 
replication is the same, but after a quick read it sounds like it does.


regards,
Yeb Havinga

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

On Wed, 2011-03-02 at 16:16 -0500, Tom Lane wrote:
 Simon Riggs si...@2ndquadrant.com writes:
  On Wed, 2011-03-02 at 22:10 +0200, Heikki Linnakangas wrote:
  Fair enough. All I'm saying is that if we end up shipping without that 
  parameter (implying allow_standalone_primary=on), we need to call the 
  feature something else. The GUCs and code can probably stay as it is, 
  but we shouldn't use the term synchronous replication in the 
  documentation, and release notes and such.
 
  allow_standalone_primary=off means wait forever. It does nothing to
  reduce data loss since you can't replicate to a server that isn't there.
 
 This is irrelevant to the point.  The point is that sync rep implies
 that we will not *tell a client* its data is committed unless the commit
 is down to disk in two places.  I agree with the people saying that not
 having this parameter makes it not real sync rep.

Depends what you think the point is.

Your comments go exactly to *my* point which is that the behaviour I'm
looking to commit maximises data durability *and* availability and is
the real choice that people will make. Claiming it isn't real will
make people scared of it, when its actually what they have been waiting
for. There is nothing half-arsed or unreal about what is being
delivered.

Let's not get hung up on textbook definitions, lets do the useful thing.

And to repeat, I am not against the other way. It has its place, in a
few cases. And I'm not against including it in this release either,
given we have a good definition.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

Yeb Havinga yebhavi...@gmail.com wrote:
 On 2011-03-02 21:26, Kevin Grittner wrote:

 I think including synchronous is OK as long as it's properly
 qualified.  Off-hand thoughts in no particular order:

 semi-synchronous
 conditionally synchronous
 synchronous with automatic failover to standalone
 It would be good to name the concept equal to how other DBMSses
 call it,  if they have a similar concept - don't know if Mysql's
 semisynchronous replication is the same, but after a quick read it
 sounds like it does.
 
I had no idea MySQL used that terminology; it just seemed apt for
describing a setup which is synchronous except when it isn't. Using
the same terminology for equivalent functionality has its pluses,
but might there be an trademark or other IP issues here?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

On Wed, 2011-03-02 at 16:24 -0500, Andrew Dunstan wrote:
 
 On 03/02/2011 04:13 PM, Simon Riggs wrote:
  On Wed, 2011-03-02 at 15:44 -0500, Andrew Dunstan wrote:
  On 03/02/2011 03:39 PM, Simon Riggs wrote:
  Truly synchronous requires two-phase commit, which this never was. So
  the absence or presence of the poorly specified parameter called
  allow_standalone_primary should have no bearing on what we call this
  feature.
 
  I haven't been following this very closely, but to me this screams out
  that we simply must not call it synchronous.
  As long as we describe it via its characteristics, then I'll be happy:
 
  * significantly reduces the possibility of data loss in a sensibly
  configured cluster
 
  * allow arbitrary N+k resilience that can meet and easily exceed
5 nines data durability
 
  * isn't two phase commit
 
  * isn't a magic bullet that will protect your data even after your
  hardware fails or is disconnected
 
 
 
 Ok, so let's call it enhanced safety or something else that isn't a 
 term of art.

Good plan.

Oracle avoided the whole issue by referring to the two modes as maximum
availability and maximum protection. I'm not sure if that is patented
or copyright etc, but I'm guessing they had this exact same discussion,
just a little less friendly.

Perhaps we can coin the term High Durability. 

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [COMMITTERS] pgsql: Add KNNGIST support to contrib/btree_gist.

2011-03-02 Thread David Fetter

On Wed, Mar 02, 2011 at 07:45:05PM +, Tom Lane wrote:
 Add KNNGIST support to contrib/btree_gist.
 
 This extends GiST's support for nearest-neighbor searches to many of the
 standard data types.
 
 Teodor Sigaev

Neat!

What stands between where we are and including these in 9.2 core?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [ADMIN] PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum

On Wed, Mar 02, 2011 at 06:45:13PM -0300, Alvaro Herrera wrote:
 Excerpts from daveg's message of mié mar 02 18:30:34 -0300 2011:
 
  After a restart and vacuum of all dbs with no other activity things were
  quiet for a couple hours and then we started seeing these PD_ALL_VISIBLE
  messages again. 
  
  Going back through the logs we have been getting these since at least before
  mid January. Oddly, this only happens on four systems which are all new Dell
  32 core Nehalem 512GB machines using iscsi partitions served off a Netapp.
  Our older 8 core 64GB hosts have never logged any of these errors. I'm not
  saying it is related to the hw, as these hosts are doing a lot more work 
  than
  the old hosts so it may be a concurrency problem that just never came up at
  lower levels before.
  
  Postgresql version is 8.4.4.
 
 I don't see how this could be related, but since you're running on NFS,
 maybe it is, somehow:
 http://archives.postgresql.org/message-id/4d40ddb7.1010...@credativ.com
 (for example what if the visibility map fork's last page is overwritten?)

Running on ISCSI, not nfs. But it is still a Netapp, so who knows. I'll look.
Also, we are not seeing any of the unexpected data beyond EOF errors,
just thousands per day of the PD_ALL_VISIBLE error.

-dg
 

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [ADMIN] PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum

2011-03-02 Thread bricklen

On Wed, Mar 2, 2011 at 3:53 PM, daveg da...@sonic.net wrote:
  Postgresql version is 8.4.4.

 I don't see how this could be related, but since you're running on NFS,
 maybe it is, somehow:
 http://archives.postgresql.org/message-id/4d40ddb7.1010...@credativ.com
 (for example what if the visibility map fork's last page is overwritten?)

 Running on ISCSI, not nfs. But it is still a Netapp, so who knows. I'll look.
 Also, we are not seeing any of the unexpected data beyond EOF errors,
 just thousands per day of the PD_ALL_VISIBLE error.

 -dg

FWIW, we had a couple occurrences of that message about a month ago on 9.0.2

http://archives.postgresql.org/pgsql-general/2011-01/msg00887.php

Haven't seen it since we ran a cluster-wide vacuum.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [ADMIN] PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum

On Wed, Mar 02, 2011 at 04:20:24PM -0800, bricklen wrote:
 On Wed, Mar 2, 2011 at 3:53 PM, daveg da...@sonic.net wrote:
   Postgresql version is 8.4.4.
 
  I don't see how this could be related, but since you're running on NFS,
  maybe it is, somehow:
  http://archives.postgresql.org/message-id/4d40ddb7.1010...@credativ.com
  (for example what if the visibility map fork's last page is overwritten?)
 
  Running on ISCSI, not nfs. But it is still a Netapp, so who knows. I'll 
  look.
  Also, we are not seeing any of the unexpected data beyond EOF errors,
  just thousands per day of the PD_ALL_VISIBLE error.
 
  -dg
 
 FWIW, we had a couple occurrences of that message about a month ago on 9.0.2
 
 http://archives.postgresql.org/pgsql-general/2011-01/msg00887.php
 
 Haven't seen it since we ran a cluster-wide vacuum.

We did a shutdown and restart to clear the buffer cache (but did not reboot
the host) and a vacuum on all dbs in the cluster last night. That cleared it
up for a couple hours, but we are still getting lots of these messages.

Most of them are pg_statistic and we create and drop hundreds of thousands of
temp tables daily, so there is a good chance there is a concurrancy issue.

-dg
 

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WAL segments pile up during standalone mode

On Thu, Mar 3, 2011 at 3:22 AM, Alvaro Herrera alvhe...@alvh.no-ip.org wrote:
 I noticed that in standalone mode, WAL segments don't seem to be
 recycled.  This could get problematic if you're forced to vacuum large
 tables in that mode and space for WAL is short.

Checkpoint is required to recycle old WAL segments. Can checkpoint
be executed in standalone mode? even during VACUUM FULL?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

On Thu, Mar 3, 2011 at 5:50 AM, Robert Haas robertmh...@gmail.com wrote:
 I agree.  I assumed that when Simon was talking about removing
 allow_standalone_primary, he meant making the code always behave as if
 it were turned OFF.

I feel the same thing.. Despite his saying, the patch implements
sync_replication_timeout_client, and if its value is too large,
the primary behaves like wait-forever. Though the primary
behaves like standalone only when it's started first without
connected standbys. So if we have
sync_replication_timeout_client, allow_standalone_primary
looks no longer useful.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

On Thu, Mar 3, 2011 at 6:33 AM, Robert Haas robertmh...@gmail.com wrote:
 I don't understand how synchronous replication with
 allow_standalone_primary=on gives you ANY extra nines.

When you start the primary (or when there is one connected standby and
it crashes), allow_standalone_primary = on allows the database service
to proceed. OTOH, setting the parameter to off keeps the service stopping
until new standby has connected and has caught up with the primary. This
would cause long service down time, and decrease the availability.

Of course, running the primary alone has the risk. If its disk gets corrupted
before new standby appears, some committed transactions are lost. But
we can decrease this risk to a certain extent by using RAID or something
to the storage. So I think that some systems can accept the risk and prefer
the availability of the database service. Josh explained clearly before why
allow_standalone_primary = off is required for his case.
http://archives.postgresql.org/message-id/4CAE2488.9020207%40agliodbs.com

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

On Thu, Mar 3, 2011 at 12:11 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 To achieve the effect Fujii is looking for, we would have to silently drop
 the connection. That would correctly leave the client not knowing whether
 the transaction committed or not.

Yeah, this seems to make more sense.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

On Thu, 2011-03-03 at 13:35 +0900, Fujii Masao wrote:
 On Thu, Mar 3, 2011 at 12:11 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
  To achieve the effect Fujii is looking for, we would have to silently drop
  the connection. That would correctly leave the client not knowing whether
  the transaction committed or not.
 
 Yeah, this seems to make more sense.

How do you propose we do that?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Quick Extensions Question

David E. Wheeler da...@kineticode.com writes:
 If my extension requires a procedural language, will adding that language to 
 the `requires` control key do what I think it should do?

No.

Probably in future the standard PLs will be packaged as extensions, and
then it will work.  The main reason that it won't happen for 9.1 is that
right now we require superuser privilege to install an extension, which
would be a regression compared to the privilege requirements for
installing standard PLs in existing releases.  And relaxing that
requirement is a research project :-(

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Quick Extensions Question

On Mar 2, 2011, at 11:00 PM, Tom Lane wrote:

 David E. Wheeler da...@kineticode.com writes:
 If my extension requires a procedural language, will adding that language to 
 the `requires` control key do what I think it should do?
 
 No.
 
 Probably in future the standard PLs will be packaged as extensions, and
 then it will work.  The main reason that it won't happen for 9.1 is that
 right now we require superuser privilege to install an extension, which
 would be a regression compared to the privilege requirements for
 installing standard PLs in existing releases.  And relaxing that
 requirement is a research project :-(

Okay. I guess the to do list should be updated?

  http://wiki.postgresql.org/wiki/Todo

Best,

David


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [COMMITTERS] pgsql: Add KNNGIST support to contrib/btree_gist.

David Fetter da...@fetter.org writes:
 On Wed, Mar 02, 2011 at 07:45:05PM +, Tom Lane wrote:
 Add KNNGIST support to contrib/btree_gist.

 What stands between where we are and including these in 9.2 core?

Well, the inet case at least is not up to the standards I'd expect
of core code; see previous complaints.  I'm suspicious of the datetime
related cases as well --- mapping intervals to float8s is an exercise
in wishful thinking IMO.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum

On Tue, Mar 01, 2011 at 08:40:37AM -0500, Robert Haas wrote:
 On Mon, Feb 28, 2011 at 10:32 PM, Greg Stark gsst...@mit.edu wrote:
  On Tue, Mar 1, 2011 at 1:43 AM, David Christensen da...@endpoint.com 
  wrote:
  Was this cluster upgraded to 8.4.4 from 8.4.0?  It sounds to me like a 
  known bug in 8.4.0 which was fixed by this commit:
 
 
  The reproduction script described was running vacuum repeatedly. A
  single vacuum run out to be sufficient to clean up the problem if it
  was left-over.
 
  I wonder if it would help to write a regression test that runs 100 or
  so vacuums and see if the bulid farm turns up any examples of this
  behaviour.
 
 One other thing to keep in mind here is that the warning message we've
 chosen can be a bit misleading.  The warning is:
 
 WARNING:  PD_ALL_VISIBLE flag was incorrectly set in relation test page 1
 
 ...which implies that the state of the tuples is correct, and that the
 page-level bit is wrong in comparison.  But I recently saw a case
 where the infomask got clobbered, resulting in this warning.  The page
 level bit was correct, at least relative to the intended page
 contents; it was the a tuple on the page that was screwed up.  It
 might have been better to pick a more neutral phrasing, like page is
 marked all-visible but some tuples are not visible.

Yeesh. Yikes. I hope that this is not the case as we are seeing thousands of
these daily on each of 4 large production hosts. Mostly on catalogs,
especially pg_statistic. However it does occur on some high delete/insert
traffic user tables too.

Question: what would be the consequence of simply patching out the setting
of this flag? Assuming that the incorrect PD_ALL_VISIBLE flag is the only
problem (big assumption perhaps) then simply never setting it would at least
avoid the possibility of returning wrong answers, presumably at some
performance cost. We possibly could live with that until we get a handle
on the real cause and fix.

I had a look and don't really see anything except vacuum_lazy that sets it,
so it seems simple to disable.

Or have I understood this incorrectly?

Anything else I can be doing to try to track this down?

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

Fujii Masao masao.fu...@gmail.com writes:
 On Thu, Mar 3, 2011 at 12:11 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 To achieve the effect Fujii is looking for, we would have to silently drop
 the connection. That would correctly leave the client not knowing whether
 the transaction committed or not.

 Yeah, this seems to make more sense.

It was pointed out that sending an ERROR would not do because it would
likely lead to client code assuming the transaction failed, which might
or might not be the case.  But maybe we could send a WARNING and then
close the connection?  That would give humans a clue what had happened,
but not do anything to the state of automated clients.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Add tab completion support for JOIN