Re: [HACKERS] Sync Rep v17

2011-02-22 Thread Marti Raudsepp
On Tue, Feb 22, 2011 at 07:38, Fujii Masao masao.fu...@gmail.com wrote:
 +       SpinLockAcquire(&WalSndCtl->ctlmutex);
 +       result = WalSndCtl->sync_rep_service_available;
 +       SpinLockRelease(&WalSndCtl->ctlmutex);

 volatile pointer needs to be used to prevent code rearrangement.

I don't think that's necessary. Spinlock functions already prevent
reordering using __asm__ __volatile__

Otherwise, surely they would be utterly broken?

Regards,
Marti



Re: [HACKERS] UNLOGGED tables in psql \d

2011-02-22 Thread Cédric Villemain
2011/2/22 Itagaki Takahiro itagaki.takah...@gmail.com:
 psql \d(+) doesn't show any information about UNLOGGED and TEMP attributes
 for the table. So, we cannot know the table is unlogged or not unless
 we directly select from pg_class.relpersistence.  Is this a TODO item?

 The same issue is in TEMP tables, but we can determine them by their
 schema; they are always created in pg_temp_N schema.

I believe it is in the title of the table presented by \d (Table,
Unlogged table, Temp table)

-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] Sync Rep v17

2011-02-22 Thread Andres Freund
On Tuesday 22 February 2011 09:59:21 Marti Raudsepp wrote:
 On Tue, Feb 22, 2011 at 07:38, Fujii Masao masao.fu...@gmail.com wrote:
  +   SpinLockAcquire(&WalSndCtl->ctlmutex);
  +   result = WalSndCtl->sync_rep_service_available;
  +   SpinLockRelease(&WalSndCtl->ctlmutex);
  
  volatile pointer needs to be used to prevent code rearrangement.
 
 I don't think that's necessary. Spinlock functions already prevent
 reordering using __asm__ __volatile__
 
 Otherwise, surely they would be utterly broken?
It's not the spinlock that's the problem, but that "result" may already be loaded 
into a register. That's not prohibited by __asm__ __volatile__.

Andres



Re: [HACKERS] UNLOGGED tables in psql \d

2011-02-22 Thread Itagaki Takahiro
On Tue, Feb 22, 2011 at 18:11, Cédric Villemain
cedric.villemain.deb...@gmail.com wrote:
 2011/2/22 Itagaki Takahiro itagaki.takah...@gmail.com:
 psql \d(+) doesn't show any information about UNLOGGED and TEMP attributes
 for the table. So, we cannot know the table is unlogged or not unless
 we directly select from pg_class.relpersistence.  Is this a TODO item?

 I believe it is in the title of the table presented by \d (Table,
 Unlogged table, Temp table)

Ah, I see. Thank you.  They are shown as "Unlogged Table" and
"Unlogged Index".

- We don't have "Temp" for temp tables. Should we?
- The first letters of the second words should be lowercase;
  we describe other objects as "Composite type", "Foreign table", and so on.

-- 
Itagaki Takahiro



Re: [HACKERS] Void binary patch

2011-02-22 Thread Robert Haas
On Sun, Feb 20, 2011 at 5:20 AM, Radosław Smogura
rsmog...@softperience.eu wrote:
 Just patch for missing procedures for void send/recv

What problem does this fix?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Snapshot synchronization, again...

2011-02-22 Thread Robert Haas
On Tue, Feb 22, 2011 at 1:59 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 Really? The idea of the hash is to prevent you from importing arbitrary
 snapshots into the system, allowing only copying snapshots that are in use
 by another backend. The purpose of the hash is to make sure the snapshot the
 client passes in is identical to one used by another backend.

If that's the purpose of the hash, it is utterly useless.  The
computation performed by the backend can easily be replicated by an
adversary.

 If you don't use a cryptographically secure hash, it's easy to construct a
 snapshot with the same hash as an existing snapshot, with more or less
 arbitrary contents.

And if you did use a cryptographically secure hash, it would still be
easy, unless there is some plausible way of keeping the key a secret,
which I doubt.

The original reason why we went with this design (send the snapshot to
the client and then have the client send it back to the server) was
because some people thought it could be useful to take a snapshot from
the master and recreate it on a standby, or perhaps the other way
around.  If that's the goal, checking whether the snapshot being
imported is one that's already in use is missing the point.  We have
to rely on our ability to detect, when importing the snapshot, any
anomalies that could lead to system instability; and allow people to
import an arbitrary snapshot that meets those constraints, even one
that the user just created out of whole cloth.

Now, as I said before, I think that's a bad idea.  I think we should
NOT be designing the system to allow publication of arbitrary
snapshots; instead, the backend that wishes to export its snapshot
should so request, getting back a token (like 1) that other backends
can then pass to START TRANSACTION (SNAPSHOT '1'), or whatever syntax
we end up using.  In that design, there's no need to validate
snapshots because they never leave the server's memory space, which
IMHO is as it should be.  If someone can develop a convincing argument
that transmitting snapshots between the master and standby is a good
idea, we can still accommodate that: the master-standby connection is
now full-duplex.  In fact, we may want to do that anyway for other
reasons (see Kevin Grittner's musings on this point vs. true
serializability).

Another reason I don't like this approach is because it's possible
that the representation of snapshots might change in the future.  Tom
mused a while back about using an LSN as a snapshot (snapshots sees
all transactions committed prior to that LSN) and there are other
plausible changes, too.  If we expose the internal details of the
snapshot mechanism to users, then (1) it becomes a user-visible
feature with backward compatibility implications if we choose to break
it down the road and (2) all future snapshot representations must be
able to be fully sanity checked on import.  Right now it's fairly easy
to verify that the snapshot's xmin isn't too old and that the
remaining XIDs are in a reasonable range and that's probably all we
really need to be certain of, but it's not clear that some future
representation would be equally easy to validate.
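
For concreteness, a rough sketch of the kind of import-time check described
above; the function, its limits, and the use of RecentGlobalXmin here are
illustrative only, not an actual patch:

#include "postgres.h"
#include "access/transam.h"

/* maintained by the backend's snapshot machinery */
extern TransactionId RecentGlobalXmin;

static void
check_imported_snapshot(TransactionId xmin, TransactionId xmax,
						int nxip, const TransactionId *xip)
{
	int			i;

	/* xmin must not be older than what the system still protects */
	if (TransactionIdPrecedes(xmin, RecentGlobalXmin))
		elog(ERROR, "imported snapshot xmin is too old");

	/* xmax must not precede xmin */
	if (TransactionIdPrecedes(xmax, xmin))
		elog(ERROR, "imported snapshot xmax precedes xmin");

	/* every in-progress XID must lie in [xmin, xmax) */
	for (i = 0; i < nxip; i++)
	{
		if (TransactionIdPrecedes(xip[i], xmin) ||
			!TransactionIdPrecedes(xip[i], xmax))
			elog(ERROR, "imported snapshot contains out-of-range xid %u",
				 xip[i]);
	}
}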

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] SR standby hangs

2011-02-22 Thread Robert Haas
On Sun, Feb 20, 2011 at 12:39 PM, Greg Stark gsst...@mit.edu wrote:
 On Fri, Feb 18, 2011 at 6:59 PM, Andrew Dunstan amduns...@nc.rr.com wrote:
 The server is running as a warm standby, and the client's application tries
 to connect to both the master and the slave, accepting whichever lets it
 connect (hence hot standby is not turned on).
...
   #2  0x005de645 in LockBufferForCleanup () at bufmgr.c:2432
   #3  0x00463733 in heap_xlog_clean (lsn=value optimized out,

 Hm, if hot standby isn't turned on why are we bothering with locks at
 all? But that said is there *anything* else happening in the database
 aside from recovery? Are there any errors in the database log?

 But this still shouldn't block. It's either blocked locking the buffer
 or blocked waiting for the buffer to become unpinned. It would be nice
 to get a backtrace from a debugging build which wouldn't have some of
 the functions inlined. It would be really nice to see the pin count on
 the buffer in question -- perhaps it has gotten out of sync or
 underflowed?

I realize now that it would also be useful to see the state of the
LWLock on the buffer.

A little OT, but ISTM that the buffer pin mechanism by its nature is
prone to lock upgrade hazards.  A cleanup lock is essentially an
access exclusive lock on the buffer, while a buffer pin is an access
share lock.  In the middle, we have the usual share and exclusive
(content) locks.  We regularly try to upgrade buffer pins to any of
the higher locking levels, which is quite unlike what we do with
regular (heavyweight) locks, where we take some pains to avoid lock
upgrades.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Snapshot synchronization, again...

2011-02-22 Thread Heikki Linnakangas

On 22.02.2011 14:24, Robert Haas wrote:

On Tue, Feb 22, 2011 at 1:59 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:

If you don't use a cryptographically secure hash, it's easy to construct a
snapshot with the same hash as an existing snapshot, with more or less
arbitrary contents.


And if you did use a cryptographically secure hash, it would still be
easy, unless there is some plausible way of keeping the key a secret,
which I doubt.


This is hashing, not encryption, there is no key. The point is that even 
if the attacker has the hash value and knows the algorithm, he can not 
construct *another* snapshot that has the same hash. A cryptographically 
strong hash function has that property, second preimage resistance, 
whereas for a regular CRC or similar it's easy to create an arbitrary 
input that has any given checksum.
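
A conceptual sketch of the check that property enables; sha256() below is a 
stand-in for whatever cryptographic hash would be used, not an existing server 
function:

#include <stddef.h>
#include <string.h>
#include <stdbool.h>

#define SNAPSHOT_HASH_LEN 32	/* e.g. SHA-256 */

/* hash computed when the snapshot was handed out, kept by the server */
extern unsigned char exported_snapshot_hash[SNAPSHOT_HASH_LEN];

/* hypothetical helper; not a real PostgreSQL function */
extern void sha256(const void *data, size_t len,
				   unsigned char out[SNAPSHOT_HASH_LEN]);

static bool
snapshot_matches_exported(const char *snapshot_text, size_t len)
{
	unsigned char hash[SNAPSHOT_HASH_LEN];

	sha256(snapshot_text, len, hash);

	/*
	 * With a second-preimage-resistant hash an attacker cannot craft a
	 * *different* snapshot text that passes this comparison; with a CRC
	 * or similar checksum that is easy.
	 */
	return memcmp(hash, exported_snapshot_hash, SNAPSHOT_HASH_LEN) == 0;
}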



The original reason why we went with this design (send the snapshot to
the client and then have the client send it back to the server) was
because some people thought it could be useful to take a snapshot from
the master and recreate it on a standby, or perhaps the other way
around.  If that's the goal, checking whether the snapshot being
imported is one that's already in use is missing the point.  We have
to rely on our ability to detect, when importing the snapshot, any
anomalies that could lead to system instability; and allow people to
import an arbitrary snapshot that meets those constraints, even one
that the user just created out of whole cloth.


Yes. It would be good to perform those sanity checks anyway.


Now, as I said before, I think that's a bad idea.  I think we should
NOT be designing the system to allow publication of arbitrary
snapshots; instead, the backend that wishes to export its snapshot
should so request, getting back a token (like 1) that other backends
can then pass to START TRANSACTION (SNAPSHOT '1'), or whatever syntax
we end up using.


Well, I'm not sure if we should allow importing arbitrary (as long as 
they're sane) snapshots or not; the arguments for and against that are 
both compelling.


But even if we don't allow it, there's no harm in sending the whole 
snapshot to the client, anyway. Ie. instead of 1 as the identifier, 
use the snapshot itself. That leaves the door open for allowing it in 
the future, should we choose to do so.



 In that design, there's no need to validate
snapshots because they never leave the server's memory space, which
IMHO is as it should be.


I agree that if we don't allow importing arbitrary snapshots, we should 
use some out-of-band communication to get the snapshot to the backend. 
IOW not rely on a hash.



Another reason I don't like this approach is because it's possible
that the representation of snapshots might change in the future.  Tom
mused a while back about using an LSN as a snapshot (snapshots sees
all transactions committed prior to that LSN) and there are other
plausible changes, too.  If we expose the internal details of the
snapshot mechanism to users, then (1) it becomes a user-visible
feature with backward compatibility implications if we choose to break
it down the road


Clients should treat the snapshot as an opaque block.


and (2) all future snapshot representations must be
able to be fully sanity checked on import.  Right now it's fairly easy
to verify that the snapshot's xmin isn't too old and that the
remaining XIDs are in a reasonable range and that's probably all we
really need to be certain of, but it's not clear that some future
representation would be equally easy to validate.


Perhaps, although I can't imagine a representation that wouldn't be 
equally easy to validate.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] UNLOGGED tables in psql \d

2011-02-22 Thread Cédric Villemain
2011/2/22 Itagaki Takahiro itagaki.takah...@gmail.com:
 On Tue, Feb 22, 2011 at 18:11, Cédric Villemain
 cedric.villemain.deb...@gmail.com wrote:
 2011/2/22 Itagaki Takahiro itagaki.takah...@gmail.com:
 psql \d(+) doesn't show any information about UNLOGGED and TEMP attributes
 for the table. So, we cannot know the table is unlogged or not unless
 we directly select from pg_class.relpersistence.  Is this a TODO item?

 I believe it is in the title of the table presented by \d (Table,
 Unlogged table, Temp table)

 Ah, I see. Thank you.  They are shown as "Unlogged Table" and
 "Unlogged Index".

 - We don't have "Temp" for temp tables. Should we?

Hmm. At first I would say yes, to be more consistent with the unlogged
case, but on the other hand we must refrain from outputting too much
information where it is not relevant.

The fact that you didn't see it might be enough to reconsider the way
we display the unlogged state (and temp state) of a relation.

Maybe a "Durability: normal, temp, unlogged" line at the bottom of
the \d output?

 - The first letters of the second words should be lowercase;
   we describe other objects as "Composite type", "Foreign table", and so on.

 --
 Itagaki Takahiro




-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] Sync Rep v17

2011-02-22 Thread Greg Smith

Daniel Farina wrote:

As it will be somewhat hard to prove the durability guarantees of
commit without special heroics, unless someone can suggest a
mechanism.


Could you introduce a hack creating deterministic server side crashes in 
order to test this out?  The simplest thing that comes to mind is a rule 
like kick shared memory in the teeth to force a crash after every 100 
commits, then see if #100 shows up as expected.  Pick two different 
small numbers for the interval and you could probably put that on both 
sides to simulate all sorts of badness.
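
A crude sketch of what such a test-only hack might look like (nothing here is 
real server code, just the shape of the idea):

#include <stdlib.h>

#define CRASH_EVERY_N_COMMITS 100

static int	commits_since_crash = 0;

/* call this right after a commit has been acknowledged to the client */
static void
maybe_crash_after_commit(void)
{
	if (++commits_since_crash >= CRASH_EVERY_N_COMMITS)
	{
		/*
		 * abort() kills the backend without any cleanup, so the postmaster
		 * has to go through crash recovery; afterwards one can check
		 * whether commit #100 survived as promised.
		 */
		abort();
	}
}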


--
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books




Re: [HACKERS] CopyReadLineText optimization revisited

2011-02-22 Thread Robert Haas
On Fri, Feb 18, 2011 at 9:35 PM, Bruce Momjian br...@momjian.us wrote:
 Was this implemented?  Is it a TODO?

It's not entirely clear to me that there's a meaningful win here.
Speeding up COPY is already on the TODO list, with this link:

http://archives.postgresql.org/pgsql-hackers/2008-02/msg00954.php

...and it may be that some of the ideas proposed there are more
worthwhile than this one.

But maybe that same TODO item could also get a link to the start of this thread.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] UNLOGGED tables in psql \d

2011-02-22 Thread Robert Haas
On Tue, Feb 22, 2011 at 8:03 AM, Cédric Villemain
cedric.villemain.deb...@gmail.com wrote:
 The fact that you didn't see it might be enough to reconsider the way
 we display the unlogged state (and temp state) of a relation.

 Maybe a "Durability: normal, temp, unlogged" line at the bottom of
 the \d output?

The term we use internally is persistence, but I'm not really sure
it's worth making the \d output longer.  In fact I'd much rather find
some way of going the other direction.  Note also that the temp-ness
of a table can already be inferred from the schema.

Another thought is that we might eventually have global temporary
tables, where the schema is globally visible (like a permanent or
unlogged table) but each backend sees only its own contents.  So
whatever we come up with here should generalize to that case as well.

Still another point is that temp can apply to any object type, but
unlogged can only apply to tables, indexes, and sequences.  (We
don't currently implement it for sequences, though.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Void binary patch

2011-02-22 Thread rsmogura

On Tue, 22 Feb 2011 07:01:02 -0500, Robert Haas wrote:

On Sun, Feb 20, 2011 at 5:20 AM, Radosław Smogura
rsmog...@softperience.eu wrote:

Just patch for missing procedures for void send/recv


What problem does this fix?


Cannot execute stored procedures in JDBC without out arguments; I think 
functions returning void as well, and some other minor cases. Of course, 
with binary mode.


Regards,
Radek



Re: [HACKERS] Snapshot synchronization, again...

2011-02-22 Thread Robert Haas
On Tue, Feb 22, 2011 at 8:01 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 This is hashing, not encryption, there is no key. The point is that even if
 the attacker has the hash value and knows the algorithm, he can not
 construct *another* snapshot that has the same hash.

What good does that do us?

 Yes. It would be good to perform those sanity checks anyway.

I don't think it's good; I think it's absolutely necessary.  Otherwise
someone can generate arbitrary garbage, hash it, and feed it to us.
No?

 But even if we don't allow it, there's no harm in sending the whole snapshot
 to the client, anyway. Ie. instead of 1 as the identifier, use the
 snapshot itself. That leaves the door open for allowing it in the future,
 should we choose to do so.

The door is open either way, AFAICS: we could eventually allow:

BEGIN TRANSACTION (SNAPSHOT '1');
and also
BEGIN TRANSACTION (SNAPSHOT '{xmin 123 xmax 456 xids 128 149 201}');

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Snapshot synchronization, again...

2011-02-22 Thread Heikki Linnakangas

On 22.02.2011 15:52, Robert Haas wrote:

On Tue, Feb 22, 2011 at 8:01 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:

Yes. It would be good to perform those sanity checks anyway.


I don't think it's good; I think it's absolutely necessary.  Otherwise
someone can generate arbitrary garbage, hash it, and feed it to us.
No?


No, the hash is stored in shared memory. The hash of the garbage has to 
match.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Void binary patch

2011-02-22 Thread Merlin Moncure
On Tue, Feb 22, 2011 at 6:01 AM, Robert Haas robertmh...@gmail.com wrote:
 On Sun, Feb 20, 2011 at 5:20 AM, Radosław Smogura
 rsmog...@softperience.eu wrote:
 Just patch for missing procedures for void send/recv

 What problem does this fix?

void returning functions may not be called when binary protocol is
requested currently.  this is annoying: some drivers that wrap libpq
or the protocol directly use the binary mode exclusively and this
causes headaches for them.  put another way, 'void' is the only POD
type missing send/recv.
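
For reference, the missing functions are tiny; a sketch of what they might 
look like, modelled on the other *_send/*_recv functions (the committed 
version may differ in detail):

#include "postgres.h"
#include "fmgr.h"
#include "libpq/pqformat.h"

Datum
void_recv(PG_FUNCTION_ARGS)
{
	/* nothing to decode for void; ignore whatever was sent */
	PG_RETURN_VOID();
}

Datum
void_send(PG_FUNCTION_ARGS)
{
	StringInfoData buf;

	/* send an empty datum: void has no content */
	pq_begintypsend(&buf);
	PG_RETURN_BYTEA_P(pq_endtypsend(&buf));
}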

merlin



Re: [HACKERS] CopyReadLineText optimization revisited

2011-02-22 Thread Bruce Momjian
Robert Haas wrote:
 On Fri, Feb 18, 2011 at 9:35 PM, Bruce Momjian br...@momjian.us wrote:
  Was this implemented?  Is it a TODO?
 
 It's not entirely clear to me that there's a meaningful win here.
 Speeding up COPY is already on the TODO list, with this link:
 
 http://archives.postgresql.org/pgsql-hackers/2008-02/msg00954.php
 
 ...and it may be that some of the ideas proposed there are more
 worthwhile than this one.
 
 But maybe that same TODO item could also get a link to the start of
 this thread.

OK, I added a link to this thread to the existing COPY TODO performance
item.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Void binary patch

2011-02-22 Thread rsmogura

On Tue, 22 Feb 2011 08:12:23 -0600, Merlin Moncure wrote:
On Tue, Feb 22, 2011 at 6:01 AM, Robert Haas robertmh...@gmail.com 
wrote:

On Sun, Feb 20, 2011 at 5:20 AM, Radosław Smogura
rsmog...@softperience.eu wrote:

Just patch for missing procedures for void send/recv


What problem does this fix?


void returning functions may not be called when binary protocol is
requested currently.  this is annoying: some drivers that wrap libpq
or the protocol directly use the binary mode exclusively and this
causes headaches for them.  put another way, 'void' is the only POD
type missing send/recv.

merlin


Just curious, what does POD mean?



Re: [HACKERS] Snapshot synchronization, again...

2011-02-22 Thread Robert Haas
On Tue, Feb 22, 2011 at 8:58 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 22.02.2011 15:52, Robert Haas wrote:

 On Tue, Feb 22, 2011 at 8:01 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com  wrote:

 Yes. It would be good to perform those sanity checks anyway.

 I don't think it's good; I think it's absolutely necessary.  Otherwise
 someone can generate arbitrary garbage, hash it, and feed it to us.
 No?

 No, the hash is stored in shared memory. The hash of the garbage has to
 match.

Oh.  Well that's really silly.  At that point you might as well just
store the snapshot and an integer identifier in shared memory, right?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Snapshot synchronization, again...

2011-02-22 Thread Heikki Linnakangas

On 22.02.2011 16:29, Robert Haas wrote:

On Tue, Feb 22, 2011 at 8:58 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:

On 22.02.2011 15:52, Robert Haas wrote:


On Tue, Feb 22, 2011 at 8:01 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.comwrote:


Yes. It would be good to perform those sanity checks anyway.


I don't think it's good; I think it's absolutely necessary.  Otherwise
someone can generate arbitrary garbage, hash it, and feed it to us.
No?


No, the hash is stored in shared memory. The hash of the garbage has to
match.


Oh.  Well that's really silly.  At that point you might as well just
store the snapshot and an integer identifier in shared memory, right?


Yes, that's the point I was trying to make. I believe the idea of a hash 
was that it takes less memory than storing the whole snapshot (and more 
importantly, a fixed amount of memory per snapshot). But I'm not 
convinced either that dealing with a hash is any less troublesome.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Snapshot synchronization, again...

2011-02-22 Thread Robert Haas
On Tue, Feb 22, 2011 at 9:34 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 Oh.  Well that's really silly.  At that point you might as well just
 store the snapshot and an integer identifier in shared memory, right?

 Yes, that's the point I was trying to make. I believe the idea of a hash was
 that it takes less memory than storing the whole snapshot (and more
 importantly, a fixed amount of memory per snapshot). But I'm not convinced
 either that dealing with a hash is any less troublesome.

OK, sorry for taking a while to get the point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] OUTER keyword

2011-02-22 Thread Bruce Momjian
Heikki Linnakangas wrote:
 On 04.10.2010 18:23, Tom Lane wrote:
  I wrote:
  Heikki Linnakangasheikki.linnakan...@enterprisedb.com  writes:
  Why is OUTER a type_func_name_keyword? The grammar doesn't require that,
  it could as well be unreserved.
 
  Hm, you sure?  All the JOIN-related keywords used to need to be at least
  that to avoid conflicts, IIRC.
 
 Yes. OUTER is just an optional noise word in LEFT/RIGHT OUTER JOIN.
 
  Actually, on reflection, it's possible that only JOIN itself really
  needs that treatment (because it can be followed by a left paren).
  We might have made the JOIN modifier words the same level for
  consistency or something.  If we can back off both INNER and OUTER
  to unreserved, it might be worth doing.  I'd be a little more worried
  about reducing LEFT/RIGHT/FULL, even if it works at the moment.
 
 No, can't change INNER, that creates conflicts.
 
 SELECT * FROM pg_class inner JOIN pg_namespace nsp ON nsp.oid = 
 relnamespace;
 
 is ambiguous, inner could be either an alias name for pg_class or part 
 of INNER JOIN.
 
 I bumped into the OUTER case because we had a test case in the 
 EnterpriseDB test suite using OUTER as a PL/pgSQL variable name. It used 
 to work, at least in simple cases where you don't try to use LEFT OUTER 
 JOIN, in 8.4 when PL/pgSQL replaced it with $1 in any SQL statements 
 before passing them to the backend. But not anymore in 9.0.

Is this a TODO?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] OUTER keyword

2011-02-22 Thread Heikki Linnakangas

On 22.02.2011 16:58, Bruce Momjian wrote:

Heikki Linnakangas wrote:

On 04.10.2010 18:23, Tom Lane wrote:

I wrote:

Heikki Linnakangasheikki.linnakan...@enterprisedb.com   writes:

Why is OUTER a type_func_name_keyword? The grammar doesn't require that,
it could as well be unreserved.



Hm, you sure?  All the JOIN-related keywords used to need to be at least
that to avoid conflicts, IIRC.


Yes. OUTER is just an optional noise word in LEFT/RIGHT OUTER JOIN.


Actually, on reflection, it's possible that only JOIN itself really
needs that treatment (because it can be followed by a left paren).
We might have made the JOIN modifier words the same level for
consistency or something.  If we can back off both INNER and OUTER
to unreserved, it might be worth doing.  I'd be a little more worried
about reducing LEFT/RIGHT/FULL, even if it works at the moment.


No, can't change INNER, that creates conflicts.

SELECT * FROM pg_class inner JOIN pg_namespace nsp ON nsp.oid =
relnamespace;

is ambiguous, inner could be either an alias name for pg_class or part
of INNER JOIN.

I bumped into the OUTER case because we had a test case in the
EnterpriseDB test suite using OUTER as a PL/pgSQL variable name. It used
to work, at least in simple cases where you don't try to use LEFT OUTER
JOIN, in 8.4 when PL/pgSQL replaced it with $1 in any SQL statements
before passing them to the backend. But not anymore in 9.0.


Is this a TODO?


If we want to change OUTER, we should just do it now. If not, I don't 
see a TODO here.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] OUTER keyword

2011-02-22 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 On 22.02.2011 16:58, Bruce Momjian wrote:
  Is this a TODO?

 If we want to change OUTER, we should just do it now. If not, I don't 
 see a TODO here.

I don't see a good reason to change it.  The SQL standard is perfectly
clear that OUTER is a fully reserved word.

regards, tom lane



Re: [HACKERS] OUTER keyword

2011-02-22 Thread Robert Haas
On Tue, Feb 22, 2011 at 10:10 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 On 22.02.2011 16:58, Bruce Momjian wrote:
  Is this a TODO?

 If we want to change OUTER, we should just do it now. If not, I don't
 see a TODO here.

 I don't see a good reason to change it.  The SQL standard is perfectly
 clear that OUTER is a fully reserved word.

My vote would be to change it.  We don't normally reserve keywords
unnecessarily.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Void binary patch

2011-02-22 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes:
 On Tue, Feb 22, 2011 at 6:01 AM, Robert Haas robertmh...@gmail.com wrote:
 What problem does this fix?

 void returning functions may not be called when binary protocol is
 requested currently.  this is annoying: some drivers that wrap libpq
 or the protocol directly use the binary mode exclusively and this
 causes headaches for them.  put another way, 'void' is the only POD
 type missing send/recv.

Yeah, this has been discussed before.

Even though this patch is far past the CF deadline, I'm a bit tempted to
push it into 9.1 anyway, just so we can check off that problem.

regards, tom lane



Re: [HACKERS] OUTER keyword

2011-02-22 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Feb 22, 2011 at 10:10 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I don't see a good reason to change it.  The SQL standard is perfectly
 clear that OUTER is a fully reserved word.

 My vote would be to change it.  We don't normally reserve keywords
 unnecessarily.

Well, we don't like *upgrading* keywords without real good reason,
but OUTER has had its current classification since forever.  The
argument for trying to downgrade it was to avoid breaking a plpgsql
function that used to work, but I don't have a lot of sympathy for
that.  There are any number of cases where you used to be able
to get away with using reserved words as plpgsql variable names and
now cannot, and most of them are not going to be fixable like this.

The scenario that concerns me is that some future SQL spec addition
uses OUTER in such a way that it has to be reserved again, which
isn't going to bother the committee any since they already think
it's reserved.  Then we have to upgrade it, and all we've accomplished
is to encourage people to write unportable, non-future-proof code.

regards, tom lane



Re: [HACKERS] SR standby hangs

2011-02-22 Thread Greg Stark
On Tue, Feb 22, 2011 at 12:55 PM, Robert Haas robertmh...@gmail.com wrote:
 A little OT, but ISTM that the buffer pin mechanism by its nature is
 prone to lock upgrade hazards.  A cleanup lock is essentially an
 access exclusive lock on the buffer, while a buffer pin is an access
 share lock.  In the middle, we have the usual share and exclusive
 (content) locks.  We regularly try to upgrade buffer pins to any of
 the higher locking levels, which is quite unlike what we do with
 regular (heavyweight) locks, where we take some pains to avoid lock
 upgrades.

Except that pins don't block exclusive locks so there's no deadlock risk.

The oddity here is vacuum's "super-exclusive" lock, which is the real
equivalent of an exclusive lock. However, there's the added bonus
that there can only be one vacuum on a table at a time. That makes it
safe.

It might be interesting to have autovacuum skip a block it finds
pinned for too long. Since it's willing to abort entirely if someone
gets a conflicting lock skipping a block doesn't seem like a big deal.
It won't let us advance the frozenxid but it'll speed up a subsequent
vacuum.
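
A sketch of the shape that could take inside the vacuum scan; the conditional
cleanup-lock primitive used here (ConditionalLockBufferForCleanup) should be
treated as hypothetical:

#include "postgres.h"
#include "storage/bufmgr.h"
#include "utils/rel.h"

/* hypothetical: try to get the cleanup lock without waiting */
extern bool ConditionalLockBufferForCleanup(Buffer buffer);

static void
vacuum_heap_skipping_pinned(Relation onerel)
{
	BlockNumber nblocks = RelationGetNumberOfBlocks(onerel);
	BlockNumber blkno;
	BlockNumber skipped = 0;

	for (blkno = 0; blkno < nblocks; blkno++)
	{
		Buffer		buf = ReadBuffer(onerel, blkno);

		if (!ConditionalLockBufferForCleanup(buf))
		{
			/*
			 * Someone holds a pin (perhaps an idle cursor).  Rather than
			 * waiting indefinitely, skip the block; a later vacuum can
			 * clean it, but relfrozenxid must not be advanced past
			 * anything left behind here.
			 */
			ReleaseBuffer(buf);
			skipped++;
			continue;
		}

		/* ... prune/defragment the page as vacuum normally would ... */

		UnlockReleaseBuffer(buf);
	}

	elog(DEBUG1, "skipped %u pinned blocks", skipped);
}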

At some point I wonder if we'll find ourselves wanting to keep track
of frozenxid per block and keep a kind of priority queue of blocks to
vacuum based on their age. I continue to wonder whether we're
reinventing standard garbage collection processes.

Incidentally, even if we allowed multiple vacuum processes per table I
think it could be coded to be safe as long as each vacuum only needs
to acquire the super exclusive lock on a single block at a time and
doesn't try to acquire other locks while holding it.



-- 
greg



Re: [HACKERS] Sync Rep v17

2011-02-22 Thread Tom Lane
Marti Raudsepp ma...@juffo.org writes:
 On Tue, Feb 22, 2011 at 07:38, Fujii Masao masao.fu...@gmail.com wrote:
 +   SpinLockAcquire(&WalSndCtl->ctlmutex);
 +   result = WalSndCtl->sync_rep_service_available;
 +   SpinLockRelease(&WalSndCtl->ctlmutex);

 volatile pointer needs to be used to prevent code rearrangement.

 I don't think that's necessary. Spinlock functions already prevent
 reordering using __asm__ __volatile__

You're mistaken.  We started using that volatile-pointer convention
after noting that some compilers would misoptimize the code otherwise.

It's not a problem with LWLock-protected stuff because the LWLock calls
are actual out-of-line function calls, and the compiler knows it doesn't
know what those functions might do.  But gcc is a lot more willing to
reorder stuff around asm operations, so you can't assume that
SpinLockAcquire/SpinLockRelease are equally safe.  The way to prevent
optimization bugs is to make sure that the fetches/stores protected by a
spinlock are done through volatile pointers.
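
For illustration, a minimal sketch of that convention, using struct and field
names that just mirror the snippet quoted upthread (not necessarily the actual
patch):

#include "postgres.h"
#include "storage/spin.h"

/* illustrative shared struct, not the actual sync-rep definitions */
typedef struct WalSndCtlData
{
	slock_t		ctlmutex;
	bool		sync_rep_service_available;
} WalSndCtlData;

extern WalSndCtlData *WalSndCtl;

static bool
SyncRepServiceAvailable(void)
{
	/*
	 * Fetch through a volatile-qualified pointer so the compiler cannot
	 * keep the field cached in a register or move the load across the
	 * SpinLockAcquire/SpinLockRelease calls.
	 */
	volatile WalSndCtlData *walsndctl = WalSndCtl;
	bool		result;

	SpinLockAcquire(&walsndctl->ctlmutex);
	result = walsndctl->sync_rep_service_available;
	SpinLockRelease(&walsndctl->ctlmutex);

	return result;
}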

regards, tom lane



Re: [HACKERS] Void binary patch

2011-02-22 Thread Robert Haas
On Tue, Feb 22, 2011 at 10:15 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 On Tue, Feb 22, 2011 at 6:01 AM, Robert Haas robertmh...@gmail.com wrote:
 What problem does this fix?

 void returning functions may not be called when binary protocol is
 requested currently.  this is annoying: some drivers that wrap libpq
 or the protocol directly use the binary mode exclusively and this
 causes headaches for them.  put another way, 'void' is the only POD
 type missing send/recv.

 Yeah, this has been discussed before.

 Even though this patch is far past the CF deadline, I'm a bit tempted to
 push it into 9.1 anyway, just so we can check off that problem.

+1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] PostgreSQL FDW update

2011-02-22 Thread Robert Haas
On Mon, Feb 21, 2011 at 10:31 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 I needed something to test the FDW API patch with, and didn't want to
 get involved in the COPY API changes, and also wanted to have something
 that needs real connection management and can push down quals. So I
 updated the postgresql_fdw patch to work with the latest FDW patch.

 Here. It's a bit of a mess, but it works for simple queries..

 I'm going to set this CF item back to Waiting on Author, because (a)
 there doesn't seem to be an actual concrete patch presented at the
 moment, only a stack of hard-to-follow deltas, and (b) the patch
 certainly needs updates for the extension mechanism and committed
 FDW API anyway.

Is anyone actually working on a new version of this patch sufficiently
rapidly that we can expect a new version in the next day or two?

If not, I think we mark this one Returned with Feedback and revisit it for 9.2.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] SR standby hangs

2011-02-22 Thread Tom Lane
Greg Stark gsst...@mit.edu writes:
 On Tue, Feb 22, 2011 at 12:55 PM, Robert Haas robertmh...@gmail.com wrote:
 A little OT, but ISTM that the buffer pin mechanism by its nature is
 prone to lock upgrade hazards.

 Except that pins don't block exclusive locks so there's no deadlock risk.

 The oddity here is on Vacuums super-exclusive lock which is the real
 equivalent of an exclusive lock. However there's the added bonus
 that there can only be one vacuum on a table at a time. That makes it
 safe

We have seen deadlocks arising from this type of scenario:

autovac has vacuum lock on table X
autovac blocks waiting for cleanup lock on buffer B in X
process P has pin on B due to a suspended query (eg cursor)
P tries to get exclusive lock on X, is blocked by autovac's lock

The heavyweight-lock manager fails to recognize deadlock because it
doesn't know about the buffer-level LWLock.

 It might be interesting to have autovacuum skip a block it finds
 pinned for too long.

+1, although as somebody pointed out nearby, this will only be legal if
it's not a vacuum-to-prevent-wraparound situation.

 Incidentally, even if we allowed multiple vacuum processes per table I
 think it could be coded to be safe as long as each vacuum only needs
 to acquire the super exclusive lock on a single block at a time and
 doesn't try to acquire other locks while holding it.

IIRC, it's cleaning the indexes that is problematic.

regards, tom lane



Re: [HACKERS] pg_resetxlog display bogosity

2011-02-22 Thread Alvaro Herrera
Excerpts from Bruce Momjian's message of vie feb 18 23:41:18 -0300 2011:
 
 Is this a TODO item?

Only to me, it seems.


-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] SSI bug?

2011-02-22 Thread Kevin Grittner
Dan Ports d...@csail.mit.edu wrote:
 
 It looks like CheckTargetForConflictsIn is making the assumption
 that the backend-local lock table is accurate, which was probably
 even true at the time it was written.
 
I remember we decided that it could only be false in certain ways
which allowed us to use it as a lossy first-cut test in a couple
places.  I doubt that we can count on any of that any longer, and
should drop those heuristics.
 
 the new changes for tuple versions make it more likely that this
 will actually come up.
 
Yeah, as far as a I can recall the only divergence was in *page*
level entries for *indexes* until this latest patch.  We now have
*tuple* level entries for *heap* relations, too.
 
 The solution is only slightly more complicated than just removing
 the assertion.
 
That's certainly true for that one spot, but we need an overall
review of where we might be trying to use LocalPredicateLockHash for
first cut tests as an optimization.
 
 Unless Kevin beats me to it, I'll put together a patch later
 tonight or tomorrow. (I'm at the airport right now.)
 
It would be great if you could get this one.  Thanks.
 
-Kevin



Re: [HACKERS] pg_basebackup and wal streaming

2011-02-22 Thread Heikki Linnakangas

On 18.02.2011 12:02, Magnus Hagander wrote:

Better late than never (or?), here's the final cleanup of
pg_streamrecv for moving into the main distribution, per discussion
back in late dec or early jan. It also includes the "stream logs in
parallel to backup" part that was not completed on pg_basebackup.


Looks reasonable at a quick glance.


+   /* Already know when to stop, compare to the position we got */


That sentence sounds broken.


+ * The background stream will use it's own database connection so we can


s/it's/its/


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] pg_resetxlog display bogosity

2011-02-22 Thread Cédric Villemain
2011/2/22 Alvaro Herrera alvhe...@commandprompt.com:
 Excerpts from Bruce Momjian's message of vie feb 18 23:41:18 -0300 2011:

 Is this a TODO item?

 Only to me, it seems.

Looks like your suggestion got a positive reaction so far :-)

+1 to fix the bogus output now rather than waiting for 9.2 via a TODO.




 --
 Álvaro Herrera alvhe...@commandprompt.com
 The PostgreSQL Company - Command Prompt, Inc.
 PostgreSQL Replication, Consulting, Custom Development, 24x7 support





-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] SSI bug?

2011-02-22 Thread YAMAMOTO Takashi
hi,

 Kevin Grittner wrote:
  
 I'm proceeding on this basis.
  
 Result attached. I found myself passing around the tuple xmin value
 just about everywhere that the predicate lock target tag was being
 passed, so it finally dawned on me that this logically belonged as
 part of the target tag. That simplified the changes, and the net
 result of following Heikki's suggestion here is the reduction of
 total lines of code by 178 while adding coverage for missed corner
 cases and fixing bugs.
  
 Thanks again, Heikki!
  
 I will test this some more tomorrow.  So far I haven't done more than
 ensure it passes the standard regression tests and the isolation
 tests added for SSI.  The latter took awhile because the hash_any
 function was including uninitialized bytes past the length of the tag
 in its calculations.  We should probably either fix that or document
 it.  I had to add another couple bytes to the tag to get it to a four
 byte boundary to fix it.  Easy once you know that's how it works...
  
 The attached patch can also be viewed here:
  
 http://git.postgresql.org/gitweb?p=users/kgrittn/postgres.git;a=commitdiff;h=46fd5ea6728b566c521ec83048bc00a207289dd9
  
 If this stands up to further testing, the only issue I know of for
 the 9.1 release is to update the documentation of shared memory usage
 to include the new structures.
  
 -Kevin

I tested ede45e90dd1992bfd3e1e61ce87bad494b81f54d + ssi-multi-update-1.patch
with my application and got the following assertion failure.

YAMAMOTO Takashi

Core was generated by `postgres'.
Program terminated with signal 6, Aborted.
#0  0xbbba4cc7 in _lwp_kill () from /usr/lib/libc.so.12
(gdb) bt
#0  0xbbba4cc7 in _lwp_kill () from /usr/lib/libc.so.12
#1  0xbbba4c85 in raise (s=6) at /siro/nbsd/src/lib/libc/gen/raise.c:48
#2  0xbbba445a in abort () at /siro/nbsd/src/lib/libc/stdlib/abort.c:74
#3  0x0833d9f2 in ExceptionalCondition (
conditionName=0x8493f6d !(locallock != ((void *)0)), 
errorType=0x8370451 FailedAssertion, fileName=0x8493ec3 predicate.c, 
lineNumber=3657) at assert.c:57
#4  0x0827977e in CheckTargetForConflictsIn (targettag=0xbfbfce78)
at predicate.c:3657
#5  0x0827bd74 in CheckForSerializableConflictIn (relation=0x99b5bad4, 
tuple=0xbfbfcf78, buffer=234) at predicate.c:3772
#6  0x080a5be8 in heap_update (relation=0x99b5bad4, otid=0xbfbfd038, 
newtup=0x99a0aa08, ctid=0xbfbfd03e, update_xmax=0xbfbfd044, cid=1, 
crosscheck=0x0, wait=1 '\001') at heapam.c:2593
#7  0x081c81ca in ExecModifyTable (node=0x99a0566c) at nodeModifyTable.c:624
#8  0x081b2153 in ExecProcNode (node=0x99a0566c) at execProcnode.c:371
#9  0x081b0f82 in standard_ExecutorRun (queryDesc=0x99b378f0, 
direction=ForwardScanDirection, count=0) at execMain.c:1247
#10 0xbbaaf352 in pgss_ExecutorRun (queryDesc=0x99b378f0, 
direction=ForwardScanDirection, count=0) at pg_stat_statements.c:541
#11 0xbbab5ee5 in explain_ExecutorRun (queryDesc=0x99b378f0, 
direction=ForwardScanDirection, count=0) at auto_explain.c:201
#12 0x08288ae3 in ProcessQuery (plan=0x99bb404c, 
sourceText=0x99b3781c UPDATE file SET ctime = $1 WHERE fileid = $2, 
params=value optimized out, dest=0x84d1f38, completionTag=0xbfbfd2f0 )
at pquery.c:197
#13 0x08288d0a in PortalRunMulti (portal=0x99b3f01c, 
isTopLevel=value optimized out, dest=dwarf2_read_address: Corrupted DWARF 
expression.
) at pquery.c:1268
#14 0x0828984a in PortalRun (portal=0x99b3f01c, count=2147483647, 
isTopLevel=1 '\001', dest=0x99b07428, altdest=0x99b07428, 
completionTag=0xbfbfd2f0 ) at pquery.c:822
#15 0x08286c22 in PostgresMain (argc=2, argv=0xbb9196a4, 
username=0xbb9195f8 takashi) at postgres.c:2004
#16 0x082413f6 in ServerLoop () at postmaster.c:3590
#17 0x082421a8 in PostmasterMain (argc=3, argv=0xbfbfe594) at postmaster.c:1110
#18 0x081e0d09 in main (argc=3, argv=0xbfbfe594) at main.c:199
(gdb) fr 4
#4  0x0827977e in CheckTargetForConflictsIn (targettag=0xbfbfce78)
at predicate.c:3657
3657Assert(locallock != NULL);
(gdb) list
3652
3653			locallock = (LOCALPREDICATELOCK *)
3654				hash_search_with_hash_value(LocalPredicateLockHash,
3655											targettag, targettaghash,
3656											HASH_FIND, NULL);
3657			Assert(locallock != NULL);
3658			Assert(locallock->held);
3659			locallock->held = false;
3660
3661			if (locallock->childLocks == 0)
(gdb) 


Re: [HACKERS] Void binary patch

2011-02-22 Thread Merlin Moncure
On Tue, Feb 22, 2011 at 8:22 AM, rsmogura rsmog...@softperience.eu wrote:
 On Tue, 22 Feb 2011 08:12:23 -0600, Merlin Moncure wrote:

 On Tue, Feb 22, 2011 at 6:01 AM, Robert Haas robertmh...@gmail.com
 wrote:

 On Sun, Feb 20, 2011 at 5:20 AM, Radosław Smogura
 rsmog...@softperience.eu wrote:

 Just patch for missing procedures for void send/recv

 What problem does this fix?

 void returning functions may not be called when binary protocol is
 requested currently.  this is annoying: some drivers that wrap libpq
 or the protocol directly use the binary mode exclusively and this
 causes headaches for them.  put another way, 'void' is the only POD
 type missing send/recv.

 merlin

 Just curious, what does POD mean?

POD = 'Plain Old Data' -- one of the core types.  :-).

merlin



Re: [HACKERS] Void binary patch

2011-02-22 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Feb 22, 2011 at 10:15 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Yeah, this has been discussed before.
 
 Even though this patch is far past the CF deadline, I'm a bit tempted to
 push it into 9.1 anyway, just so we can check off that problem.

 +1.

Done.

regards, tom lane



Re: [HACKERS] Re: [COMMITTERS] pgsql: Make a hard state change from catchup to streaming mode.

2011-02-22 Thread Robert Haas
On Fri, Feb 18, 2011 at 11:50 AM, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Feb 18, 2011 at 10:41 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Fri, Feb 18, 2011 at 10:09 AM, Simon Riggs si...@2ndquadrant.com wrote:
 Make a hard state change from catchup to streaming mode.
 More useful state change for monitoring purposes, plus a
 required change for synchronous replication patch.

 As far as I can see, this patch was not posted or discussed before
 commit, and I'm not sure it's the behavior everyone wants.  It has the
 effect of preventing the system from ever going backwards from
 streaming to catchup.  Is that what we want?

 That seems like a very bad idea from here.  Being able to go back to
 catchup after loss of the streaming connection is essential for
 robustness.  If we now have to restart the slave for that to happen,
 it's not an improvement.

 No, that's not the case where it matters.  The state would get reset
 on reconnect.  The problem is when, say, the master server is
 generating WAL at a rate which exceeds the network bandwidth of the
 link between the master and the standby.  The previous coding will
 make us flip back into the catchup state when that happens.

 Actually, the old code is awfully sensitive, and knowing that you are
 not caught up is not really enough information: you need to know how
 far behind you are, and how long you've been behind for.  I'm guessing
 that Simon intended this patch to deal with that problem, but it's not
 the right fix.

So am I the only one who thinks this needs to be reverted?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Sync Rep v17

2011-02-22 Thread Jaime Casanova
On Sat, Feb 19, 2011 at 11:26 PM, Robert Haas robertmh...@gmail.com wrote:

 DEBUG:  write 0/3027BC8 flush 0/3014690 apply 0/3014690
 DEBUG:  released 0 procs up to 0/3014690
 DEBUG:  write 0/3027BC8 flush 0/3027BC8 apply 0/3014690
 DEBUG:  released 2 procs up to 0/3027BC8
 WARNING:  could not locate ourselves on wait queue
 server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
 The connection to the server was lost. Attempting reset: DEBUG:

You can make this happen more easily: I just ran "pgbench -n -c10 -j10
test" and got that warning, and sometimes a segmentation fault and
sometimes a failed assertion.

The problematic code starts at
src/backend/replication/syncrep.c:277; here are my suggestions on that
code.
I still get a failed assertion because of the second Assert (I think
we should just remove that one).

*** SyncRepRemoveFromQueue(void)
*** 288,299 

  			if (proc->lwWaitLink == NULL)
  				elog(WARNING, "could not locate ourselves on wait queue");
! 			proc = proc->lwWaitLink;
  		}

  		if (proc->lwWaitLink == NULL)	/* At tail */
  		{
! 			Assert(proc == MyProc);
  			/* Remove ourselves from tail of queue */
  			Assert(queue->tail == MyProc);
  			queue->tail = proc;
--- 288,300 

  			if (proc->lwWaitLink == NULL)
  				elog(WARNING, "could not locate ourselves on wait queue");
! 			else
! 				proc = proc->lwWaitLink;
  		}

  		if (proc->lwWaitLink == NULL)	/* At tail */
  		{
! 			Assert(proc != MyProc);
  			/* Remove ourselves from tail of queue */
  			Assert(queue->tail == MyProc);
  			queue->tail = proc;

-- 
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte y capacitación de PostgreSQL



Re: [HACKERS] Sync Rep v17

2011-02-22 Thread Daniel Farina
On Tue, Feb 22, 2011 at 5:04 AM, Greg Smith g...@2ndquadrant.com wrote:
 Daniel Farina wrote:

 As it will be somewhat hard to prove the durability guarantees of
 commit without special heroics, unless someone can suggest a
 mechanism.

 Could you introduce a hack creating deterministic server side crashes in
 order to test this out?  The simplest thing that comes to mind is a rule
 like kick shared memory in the teeth to force a crash after every 100
 commits, then see if #100 shows up as expected.  Pick two different small
 numbers for the interval and you could probably put that on both sides to
 simulate all sorts of badness.

I probably could via a function; would a kill -9 also be of interest to you?

-- 
fdr



[HACKERS] WIP: cross column correlation ...

2011-02-22 Thread PostgreSQL - Hans-Jürgen Schönig
hello everbody,

we have spent some time finally attacking cross column correlation, as this
is an issue which has kept bugging us in a couple of applications for some years.
this is a WIP patch which can do:

special cross column correlation specific syntax:

CREATE CROSS COLUMN STATISTICS ON tablename (field, ...);
DROP CROSS COLUMN STATISTICS ON tablename (field, ...);

we use specific syntax because we simply cannot keep track of all possible
correlations in the DB, so the admin can take care of things explicitly. some
distant day somebody might want to write a mechanism to derive the desired
stats automatically, but this is beyond the scope of our project for now.

as far as the patch is concerned:
it hooks nicely into clauselist_selectivity(), but has some rough edges: even
when a cross-col stat is found, the single-col selectivities are still
counted (= lowering the selectivity even more); this is a TODO.
this patch adds the grammar and the start of planner integration, with a static
selectivity value for now; the previous discussion about cross-column
statistics can be continued and perhaps comes to fruition soon.

how does it work? we try to find suitable statistics for an arbitrary-length 
list of conditions so that the planner can use them directly rather than 
multiplying all the individual selectivities. this should make estimates a lot 
more precise. 
the current approach can be extended to work with expressions as well as 
straight conditions.
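
To make the independence-assumption point concrete, here is a rough standalone
sketch (the names are illustrative and are not taken from the posted patch) of
the difference between multiplying per-clause selectivities and using a stored
joint estimate:

    #include <stdbool.h>

    /* Hypothetical container for a stored cross-column estimate. */
    typedef struct
    {
        double      joint_selectivity;
        bool        valid;
    } CrossColStats;

    /* Prefer the joint estimate when one matches; otherwise fall back to
     * the usual independence assumption and multiply. */
    static double
    clauselist_selectivity_sketch(const double *per_clause_sel, int nclauses,
                                  const CrossColStats *ccstats)
    {
        double      sel = 1.0;
        int         i;

        if (ccstats != NULL && ccstats->valid)
            return ccstats->joint_selectivity;

        for (i = 0; i < nclauses; i++)
            sel *= per_clause_sel[i];
        return sel;
    }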

goal: to make cross column correlation work for 9.2 ...

the purpose of this mail is mostly to get the race for a patch going and to see 
if the approach as such is reasonable / feasible.

many thanks,

hans



cross-column-v5.patch
Description: Binary data






--
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Binary in/out for aclitem

2011-02-22 Thread Radosław Smogura
Hi,

Actually one more POD is left: it's aclitem :). In Java, for example, it is used to 
obtain column privileges; I assume some folks may want to use it too.

I tested only recv :-(

Actually I don't know if the idea of such a format is OK, but my intention was to 
send role names, so the driver doesn't need to ask for the name (like for regproc) and 
to keep the human-based approach (text) and the binary approach together.

Kind regards,
Radek
diff --git a/.gitignore b/.gitignore
index 1be11e8..0d594f9 100644
--- a/.gitignore
+++ b/.gitignore
@@ -17,3 +17,5 @@ objfiles.txt
 /GNUmakefile
 /config.log
 /config.status
+/nbproject/private/
+/nbproject
diff --git a/src/backend/utils/adt/acl.c b/src/backend/utils/adt/acl.c
index 691ba3b..33877cb 100644
--- a/src/backend/utils/adt/acl.c
+++ b/src/backend/utils/adt/acl.c
@@ -33,6 +33,7 @@
 #include utils/lsyscache.h
 #include utils/memutils.h
 #include utils/syscache.h
+#include libpq/pqformat.h
 
 
 typedef struct
@@ -78,6 +79,10 @@ static void putid(char *p, const char *s);
 static Acl *allocacl(int n);
 static void check_acl(const Acl *acl);
 static const char *aclparse(const char *s, AclItem *aip);
+
+/** Assigns default grantor and send warning. */
+static void aclitem_assign_default_grantor(AclItem *aip);
+
 static bool aclitem_match(const AclItem *a1, const AclItem *a2);
 static int	aclitemComparator(const void *arg1, const void *arg2);
 static void check_circularity(const Acl *old_acl, const AclItem *mod_aip,
@@ -209,6 +214,14 @@ putid(char *p, const char *s)
 	*p = '\0';
 }
 
+/** Assigns default grantor and send warning. */
+void aclitem_assign_default_grantor(AclItem *aip) {
+    aip->ai_grantor = BOOTSTRAP_SUPERUSERID;
+    ereport(WARNING,
+            (errcode(ERRCODE_INVALID_GRANTOR),
+             errmsg("defaulting grantor to user ID %u",
+                    BOOTSTRAP_SUPERUSERID)));
+}
 /*
  * aclparse
  *		Consumes and parses an ACL specification of the form:
@@ -343,11 +356,7 @@ aclparse(const char *s, AclItem *aip)
 	}
 	else
 	{
-		aip-ai_grantor = BOOTSTRAP_SUPERUSERID;
-		ereport(WARNING,
-(errcode(ERRCODE_INVALID_GRANTOR),
- errmsg(defaulting grantor to user ID %u,
-		BOOTSTRAP_SUPERUSERID)));
+aclitem_assign_default_grantor(aip);
 	}
 
 	ACLITEM_SET_PRIVS_GOPTIONS(*aip, privs, goption);
@@ -643,6 +652,143 @@ aclitemout(PG_FUNCTION_ARGS)
 
 	PG_RETURN_CSTRING(out);
 }
+/** Do binary read of aclitem. Input format is same as {@link aclitem_recv}, but a
+ * special algorithm is used to determine the grantee's and grantor's OID. The reason
+ * is to keep backward information compatibility with text mode - a typical
+ * client (which gets instructions from the user)
+ * may be much more interested in sending the grantee's and grantor's names than
+ * OIDs. The detailed rule is as follows:<br/>
+ * If the message contains <b>only</b> OIDs and privs (no names and name lengths) then
+ * use the passed OIDs (the message may be truncated, we accept this,
+ * but both of the two last fields must be absent).<br/>
+ * If the grantee's name length or the grantor's name length is {@code -1} then use the
+ * respective OIDs.<br/>
+ * If the name length is not {@code -1} then find the OID for the given part, and
+ * ensure that the respective OID is {@code 0} or is equal to the found OID.<br/>
+ * If the grantor's OID is {@code 0} and the grantor's name length is {@code -1} or
+ * truncated then assign the superuser as grantor.
+ */
+Datum
+aclitemrecv(PG_FUNCTION_ARGS) {
+    StringInfo  buf = (StringInfo) PG_GETARG_POINTER(0);
+    AclItem    *aip;
+    int         gRawLen;
+    char       *gVal = NULL;
+    int         gValLen;
+    Oid         gOid;
+
+    aip = (AclItem *) palloc(sizeof(AclItem));
+    aip->ai_grantee = pq_getmsgint(buf, 4);
+    aip->ai_grantor = pq_getmsgint(buf, 4);
+    aip->ai_privs = pq_getmsgint(buf, 4);
+
+    /* Client has passed names. */
+    if (buf->cursor < buf->len) {
+        /* Read grantee. */
+        gRawLen = pq_getmsgint(buf, 4);
+        if (gRawLen != -1) {
+            gVal = pq_getmsgtext(buf, gRawLen, &gValLen);
+            gOid = get_role_oid(gVal, false);
+            if (aip->ai_grantee != 0 && aip->ai_grantee != gOid) {
+                ereport(ERROR,
+                        (errcode(ERRCODE_INVALID_BINARY_REPRESENTATION),
+                         errmsg("grantee's OID is not 0 and passed grantee's name and grantee's OID doesn't match")));
+
+            }
+        }
+
+        /* Read grantor. */
+        gRawLen = pq_getmsgint(buf, 4);
+        if (gRawLen != -1) {
+            gVal = pq_getmsgtext(buf, gRawLen, &gValLen);
+            gOid = get_role_oid(gVal, false);
+            if (aip->ai_grantor != 0 && aip->ai_grantor != gOid) {
+                ereport(ERROR,
+                        (errcode(ERRCODE_INVALID_BINARY_REPRESENTATION),
+                         errmsg("grantor's OID is not 0 and passed grantor's name and grantor's OID doesn't match")));
+            }
+        } else {
+            gVal = NULL;
+        }
+    }
+
+if 

[HACKERS] PgEast talks and trainings up

2011-02-22 Thread Joshua D. Drake
Hello,

Per the customary URL:

https://www.postgresqlconference.org/

JD

-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pl/python quoting functions

2011-02-22 Thread Peter Eisentraut
Committed this, with two changes:  Changed some things around with the
way const char * is propagated.  Just casting it away is not nice.  Also
dropped the error tests in the _quote.sql regression test.  This
generates three different wordings of error messages from Python with
2.6, 3.1, and 3.2, which I don't care to maintain.  Maybe one day we'll
have a better solution for this.





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Binary in/out for aclitem

2011-02-22 Thread Tom Lane
Radosław Smogura rsmog...@softperience.eu writes:
 Actually one more POD is left: it's aclitem :). In Java, for example, it is used to 
 obtain column privileges; I assume some folks may want to use it too.

I think this one has got far less use-case than the other, and I don't
want to expose the internal representation of ACLITEM to the world.
So, nope, not excited about it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Binary in/out for aclitem

2011-02-22 Thread Andrew Dunstan



On 02/22/2011 05:04 PM, Tom Lane wrote:

Radosław Smogura rsmog...@softperience.eu  writes:

Actually one more POD is left: it's aclitem :). In Java, for example, it is used to
obtain column privileges; I assume some folks may want to use it too.

I think this one has got far less use-case than the other, and I don't
want to expose the internal representation of ACLITEM to the world.
So, nope, not excited about it.



The sendv for enums sends the label, and ISTR there are some others that 
send the text representation also. Would that be better?


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Binary in/out for aclitem

2011-02-22 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 02/22/2011 05:04 PM, Tom Lane wrote:
 I think this one has got far less use-case than the other, and I don't
 want to expose the internal representation of ACLITEM to the world.

 The sendv for enums sends the label, and ISTR there are some others that 
 send the text representation also. Would that be better?

It'd be more future-proof than this patch, but I'm still unconvinced
about the use-case.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [GENERAL] How to extract a value from a record using attnum or attname?

2011-02-22 Thread Kevin Grittner
[moving to -hackers with BC to -general]
 
Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
 
 PL/pgSQL seems tantalizingly close to being useful for developing
 a generalized trigger function for notifying the client of
 changes. I don't know whether I'm missing something or whether
 we're missing a potentially useful feature here.  Does anyone see
 how to fill in where the commented question is, or do I need to
 write this function in C?
 
 See those:
 
 http://tapoueh.org/articles/blog/_Dynamic_Triggers_in_PLpgSQL.html

http://www.pgsql.cz/index.php/PL_toolbox_%28en%29#Record.27s_functions
 
   for i in array_lower(keycols, 1)..array_upper(keycols, 1) loop
 select quote_ident(attname) from pg_catalog.pg_attribute
   where attrelid = tg_relid and attnum = keycols[i]::oid
 
 Beware of attisdropped, which I've not fixed in the published URL
 before (the tapoueh.org one).
 
Thanks.
 
In the absence of an earlier response, though, I went ahead and
wrote the attached, which has passed some initial programmer testing
and is scheduled to start business analyst testing tomorrow with the
application software for production deployment in a couple months.
We probably won't go back to PL/pgSQL for this now.
 
I'm assuming that while I have an AccessShareLock on the index
relation for the primary key, any attributes it tells me are used by
that relation will not have the attisdropped flag set?
 
What this trigger function does is to issue a NOTIFY to the channel
specified as a parameter to the function in CREATE TRIGGER (with
'tcn' as the default), and a payload consisting of the table name, a
code for the operation (Insert, Update, or Delete), and the primary
key values.  So, an update to a Party record for us might generate
this NOTIFY payload:
 
Party,U,countyNo='71',caseNo='2011CF001234',partyNo='1'
 
This is one of those things which our shop needs, but I was planning
to post it for the first 9.2 CF fest to see if anyone else was
interested.  It struck me while typing this post that for general
use the schema would probably need to be in there, but I'll worry
about that later, if anyone else *is* interested.  If anyone wants
it I can provide Java code to tear apart the NOTIFY payloads using
the Pattern and Matches classes.
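
Purely for illustration (the post above offers Java; this is a rough C
equivalent), a sketch of splitting such a payload might look like the
following. It naively assumes key values never contain a comma, which the
quoting does not actually guarantee, so a real parser should honour the quotes:

    #include <stdio.h>
    #include <string.h>

    /* Split "table,operation,col='val',..." into its pieces. */
    static void
    parse_tcn_payload(const char *payload)
    {
        char        buf[256];
        char       *save = NULL;
        char       *tok;

        strncpy(buf, payload, sizeof(buf) - 1);
        buf[sizeof(buf) - 1] = '\0';

        tok = strtok_r(buf, ",", &save);
        printf("table: %s\n", tok ? tok : "?");
        tok = strtok_r(NULL, ",", &save);
        printf("operation: %s\n", tok ? tok : "?");
        while ((tok = strtok_r(NULL, ",", &save)) != NULL)
            printf("key column: %s\n", tok);
    }

    /* parse_tcn_payload("Party,U,countyNo='71',caseNo='2011CF001234',partyNo='1'"); */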
 
I'll add to the first 9.2 CF referencing this post.
 
-Kevin

*** a/src/backend/utils/adt/trigfuncs.c
--- b/src/backend/utils/adt/trigfuncs.c
***
*** 13,21 
   */
  #include postgres.h
  
! #include access/htup.h
  #include commands/trigger.h
  #include utils/builtins.h
  
  
  /*
--- 13,25 
   */
  #include postgres.h
  
! #include executor/spi.h
! #include catalog/indexing.h
! #include commands/async.h
  #include commands/trigger.h
  #include utils/builtins.h
+ #include utils/fmgroids.h
+ #include utils/tqual.h
  
  
  /*
***
*** 93,95  suppress_redundant_updates_trigger(PG_FUNCTION_ARGS)
--- 97,261 
  
return PointerGetDatum(rettuple);
  }
+ 
+ 
+ /*
+  * Copy from s (for source) to r (for result), wrapping with q (quote)
+  * characters and doubling any quote characters found.
+  */
+ static char *
+ strcpy_quoted(char *r, const char *s, const char q)
+ {
+   *r++ = q;
+   while (*s)
+   {
+   if (*s == q)
+   *r++ = q;
+   *r++ = *s;
+   s++;
+   }
+   *r++ = q;
+   return r;
+ }
+ 
+ /*
+  * triggered_change_notification
+  *
+  * This trigger function will send a notification of data modification with
+  * primary key values.The channel will be tcn unless the trigger is
+  * created with a parameter, in which case that parameter will be used.
+  */
+ Datum
+ triggered_change_notification(PG_FUNCTION_ARGS)
+ {
+   TriggerData *trigdata = (TriggerData *) fcinfo-context;
+   Trigger*trigger;
+   int nargs;
+   HeapTuple   trigtuple,
+   newtuple;
+   HeapTupleHeader trigheader,
+   newheader;
+   Relationrel;
+   TupleDesc   tupdesc;
+   RelationindexRelation;
+   ScanKeyData skey;
+   SysScanDesc scan;
+   HeapTuple   indexTuple;
+   char   *channel;
+   charoperation;
+   charpayload[200];
+   char   *p;
+   boolfoundPK;
+ 
+   /* make sure it's called as a trigger */
+   if (!CALLED_AS_TRIGGER(fcinfo))
+   ereport(ERROR,
+   
(errcode(ERRCODE_E_R_I_E_TRIGGER_PROTOCOL_VIOLATED),
+   errmsg(triggered_change_notification: must be called as 
trigger)));
+ 
+   /* and that it's called after the change */
+   if (!TRIGGER_FIRED_AFTER(trigdata-tg_event))
+   ereport(ERROR,
+   
(errcode(ERRCODE_E_R_I_E_TRIGGER_PROTOCOL_VIOLATED),
+

Re: [HACKERS] Re: [GENERAL] How to extract a value from a record using attnum or attname?

2011-02-22 Thread Andrew Dunstan



On 02/22/2011 05:32 PM, Kevin Grittner wrote:

[moving to -hackers with BC to -general]

Dimitri Fontainedimi...@2ndquadrant.fr  wrote:

Kevin Grittnerkevin.gritt...@wicourts.gov  writes:


PL/pgSQL seems tantalizingly close to being useful for developing
a generalized trigger function for notifying the client of
changes. I don't know whether I'm missing something or whether
we're missing a potentially useful feature here.  Does anyone see
how to fill in where the commented question is, or do I need to
write this function in C?

See those:

http://tapoueh.org/articles/blog/_Dynamic_Triggers_in_PLpgSQL.html


http://www.pgsql.cz/index.php/PL_toolbox_%28en%29#Record.27s_functions

   for i in array_lower(keycols, 1)..array_upper(keycols, 1) loop
 select quote_ident(attname) from pg_catalog.pg_attribute
   where attrelid = tg_relid and attnum = keycols[i]::oid

Beware of attisdropped, which I've not fixed in the published URL
before (the tapoueh.org one).


Thanks.

In the absence of an earlier response, though, I went ahead and
wrote the attached, which has passed some initial programmer testing
and is scheduled to start business analyst testing tomorrow with the
application software for production deployment in a couple months.
We probably won't go back to PL/pgSQL for this now.

I'm assuming that while I have an AccessShareLock on the index
relation for the primary key, any attributes it tells me are used by
that relation will not have the attisdropped flag set?

What this trigger function does is to issue a NOTIFY to the channel
specified as a parameter to the function in CREATE TRIGGER (with
'tcn' as the default), and a payload consisting of the table name, a
code for the operation (Insert, Update, or Delete), and the primary
key values.  So, an update to a Party record for us might generate
this NOTIFY payload:

Party,U,countyNo='71',caseNo='2011CF001234',partyNo='1'

This is one of those things which our shop needs, but I was planning
to post it for the first 9.2 CF fest to see if anyone else was
interested.  It struck me while typing this post that for general
use the schema would probably need to be in there, but I'll worry
about that later, if anyone else *is* interested.  If anyone wants
it I can provide Java code to tear apart the NOTIFY payloads using
the Pattern and Matches classes.

I'll add to the first 9.2 CF referencing this post.




Have you performance tested it? Scanning pg_index for index columns for 
each row strikes me as likely to be unpleasant.


Also, the error messages seem to need a bit of work (no wonder they 
seemed familiar to me :) )


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI bug?

2011-02-22 Thread Dan Ports
On Tue, Feb 22, 2011 at 10:51:05AM -0600, Kevin Grittner wrote:
 Dan Ports d...@csail.mit.edu wrote:
  
  It looks like CheckTargetForConflictsIn is making the assumption
  that the backend-local lock table is accurate, which was probably
  even true at the time it was written.
  
 I remember we decided that it could only be false in certain ways
 which allowed us to use it as a lossy first-cut test in a couple
 places.  I doubt that we can count on any of that any longer, and
 should drop those heuristics.

Hmm. The theory was before that the local lock table would only have
false negatives, i.e. if it says we hold a lock then we really do. That
makes it a useful heuristic because we can bail out quickly if we're
trying to re-acquire a lock we already hold. It seems worthwhile to try
to preserve that.

This property holds as long as the only time one backend edits
another's lock list is to *add* a new lock, not remove an existing one.
Fortunately, this is true most of the time (at a glance, it seems to be
the case for the new tuple update code).  

There are two exceptions; I think they're both OK but we need to be
careful here.
 - if we're forced to promote an index page lock's granularity to avoid
   running out of shared memory, we remove the fine-grained one. This
   is fine as long as we relax our expectations to if the local
   lock table says we hold a lock, then we hold that lock or one that
   covers it, which is acceptable for current users of the table.
 - if we combine two index pages, we remove the lock entry for the page
   being deleted. I think that's OK because the page is being removed
   so we will not make any efforts to lock it.


 Yeah, as far as a I can recall the only divergence was in *page*
 level entries for *indexes* until this latest patch.  We now have
 *tuple* level entries for *heap* relations, too.

*nod*. I'm slightly concerned about the impact of that on granularity
promotion -- the new locks created by heap updates won't get counted
toward the lock promotion thresholds. That's not a fatal problem of
anything, but it could make granularity promotion less effective at
conserving lock entries.

  The solution is only slightly more complicated than just removing
  the assertion.
  
 That's certainly true for that one spot, but we need an overall
 review of where we might be trying to use LocalPredicateLockHash for
 first cut tests as an optimization.

Yes, I'd planned to go through the references to LocalPredicateLockHash
to make sure none of them were making any unwarranted assumptions about
the results. Fortunately, there are not too many of them...

Dan

-- 
Dan R. K. Ports  MIT CSAILhttp://drkp.net/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How to extract a value from a record using attnum or attname?

2011-02-22 Thread Kevin Grittner
Andrew Dunstan and...@dunslane.net wrote:
 
 Have you performance tested it? Scanning pg_index for index
 columns for each row strikes me as likely to be unpleasant.
 
I haven't, yet.  I had rather assumed that the index info for a
relation would have a high probability of being cached during
execution of an AFTER trigger for that relation, so I think we're
talking RAM access here.  It didn't seem sane to try to create an
HTAB for this and worry about invalidation of it, etc.  If there's a
faster way to get to the info without going to such extremes, I'd be
happy to hear them.  (At least I avoided building and parsing a
query to get at it.)
 
 Also, the error messages seem to need a bit of work (no wonder
 they seemed familiar to me :) )
 
[blush]  I was just trying to write code which fits in with
surrounding code, as recommended.  You mean I should change the
function name in the message from the name of the function I copied
*from* to the name of the function I copied *to*?  Well, I *guess*
it still fits in  ;-)
 
Seriously, thanks for pointing that out.  Will fix.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI bug?

2011-02-22 Thread Kevin Grittner
Dan Ports d...@csail.mit.edu wrote:
 On Tue, Feb 22, 2011 at 10:51:05AM -0600, Kevin Grittner wrote:
 
 The theory was before that the local lock table would only have
 false negatives, i.e. if it says we hold a lock then we really do.
 That makes it a useful heuristic because we can bail out quickly
 if we're trying to re-acquire a lock we already hold. It seems
 worthwhile to try to preserve that.
 
 This property holds as long as the only time one backend edits
 another's lock list is to *add* a new lock, not remove an existing
 one.  Fortunately, this is true most of the time (at a glance, it
 seems to be the case for the new tuple update code).  
 
 There are two exceptions; I think they're both OK but we need to
 be careful here.
  - if we're forced to promote an index page lock's granularity to
avoid running out of shared memory, we remove the fine-grained
one. This is fine as long as we relax our expectations to if
the local lock table says we hold a lock, then we hold that
lock or one that covers it, which is acceptable for current
users of the table.
 
That makes sense.  This one sounds OK.
 
  - if we combine two index pages, we remove the lock entry for the
page being deleted. I think that's OK because the page is being
removed so we will not make any efforts to lock it.
 
I'm not sure it's safe to assume that the index page won't get
reused before the local lock information is cleared.  In the absence
of a clear proof that it is safe, or some enforcement mechanism to
ensure that it is, I don't think we should make this assumption. 
Off-hand I can't think of a clever way to make this safe which would
cost less than taking out the LW lock and checking the definitive
shared memory HTAB, but that might be for lack of creative thinking
at the moment..
 
 Yeah, as far as a I can recall the only divergence was in *page*
 level entries for *indexes* until this latest patch.  We now have
 *tuple* level entries for *heap* relations, too.
 
 *nod*. I'm slightly concerned about the impact of that on
 granularity promotion -- the new locks created by heap updates
 won't get counted toward the lock promotion thresholds. That's not
 a fatal problem of anything, but it could make granularity
 promotion less effective at conserving lock entries.
 
This pattern doesn't seem to come up very often in most workloads. 
Since it's feeding into a heuristic which already triggers pretty
quickly I think we should be OK with this.  It makes me less tempted
to tinker with the threshold for promoting tuple locks to page
locks, though.
 
The only alternative I see would be to use some form of asynchronous
notification of the new locks so that the local table can be
maintained.  That seems overkill without some clear evidence that it
is needed.  I *really* wouldn't want to go back to needing LW locks
to maintain this info; as you know (and stated only for the benefit
of the list), it was a pretty serious contention point in early
profiling and adding the local table was a big part of getting an
early benchmark down from a 14+% performance hit for SSI to a 1.8%
performance hit.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-22 Thread Tom Lane
I wrote:
 I did a bit of poking around here, and came to the following
 conclusions:

 1. We don't want to add another RTEKind.  RTE_RELATION basically has the
 semantics of anything with a pg_class OID, so it ought to include
 foreign tables.  Therefore the fix ought to be to add relkind to
 RangeTblEntry.  (BTW, so far as I can tell, RTE_SPECIAL is obsolete:
 there are assorted switch cases that handle it, but no place that can
 generate the value.  I'm inclined to delete it while we are messing
 with this.)

 2. In the current code layout, to make sense of relkind you need to
 #include catalog/pg_class.h where the values for relkind are #defined.
 I dislike the idea of that being true for a field of such a widely-known
 struct as RangeTblEntry.  Accordingly, I suggest that we move those
 values into parsenodes.h.  (Perhaps we could convert them to an enum,
 too, though still keeping the same ASCII values.)

 3. We can have the rewriter update an RTE's stored value of relkind
 during AcquireRewriteLocks: it opens the rel for each RTE_RELATION entry
 anyway, so copying over the relkind is essentially free.  While it's not
 instantly obvious that that is soon enough, I think that it is, since
 up to the point of acquiring a lock there we can't assume that the rel
 isn't being changed or dropped undeneath us --- that is, any earlier
 test on an RTE's relkind might be testing just-obsoleted state anyway.

 4. I had hoped that we might be able to get rid of some pre-existing
 syscache lookups, but at least so far as the parse/plan/execute chain
 is concerned, there don't seem to be any other places that need to
 fetch a relkind based on just an RTE entry.

 So point #4 is a bit discouraging, but other that, this seems like
 a fairly straightforward exercise.  I'm inclined to go ahead and do it,
 unless there are objections.

Applied, except I ended up not moving the RELKIND #defines as suggested
in point #2.  Those #defines are used by pg_dump, and having pg_dump.c
#include parsenodes.h seemed like a bad idea.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Snapshot synchronization, again...

2011-02-22 Thread Joachim Wieland
On Tue, Feb 22, 2011 at 3:34 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 22.02.2011 16:29, Robert Haas wrote:
 On Tue, Feb 22, 2011 at 8:58 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com  wrote:
 No, the hash is stored in shared memory. The hash of the garbage has to
 match.

 Oh.  Well that's really silly.  At that point you might as well just
 store the snapshot and an integer identifier in shared memory, right?

 Yes, that's the point I was trying to make. I believe the idea of a hash was
 that it takes less memory than storing the whole snapshot (and more
 importantly, a fixed amount of memory per snapshot). But I'm not convinced
 either that dealing with a hash is any less troublesome.

Both Tom and Robert voted quite explicitly against the
store-in-shared-memory idea. Others don't want to allow people to request
arbitrary snapshots, and again others wanted to pass the snapshot
through the client so that in the future we could also request
snapshots from standby servers. The hash idea seemed to at least make
nobody unhappy.

For easier review, here are a few links to the previous discusion:

http://archives.postgresql.org/pgsql-hackers/2010-12/msg00361.php
http://archives.postgresql.org/pgsql-hackers/2010-12/msg00383.php
http://archives.postgresql.org/pgsql-hackers/2010-12/msg00481.php
http://archives.postgresql.org/pgsql-hackers/2010-12/msg02454.php

Why exactly, Heikki do you think the hash is more troublesome? And how
could we validate/invalidate snapshots without the checksum (assuming
the through-the-client approach instead of storing the whole snapshot
in shared memory)?


Joachim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Binary in/out for aclitem

2011-02-22 Thread Robert Haas
On Tue, Feb 22, 2011 at 5:24 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Andrew Dunstan and...@dunslane.net writes:
 On 02/22/2011 05:04 PM, Tom Lane wrote:
 I think this one has got far less use-case than the other, and I don't
 want to expose the internal representation of ACLITEM to the world.

 The sendv for enums sends the label, and ISTR there are some others that
 send the text representation also. Would that be better?

 It'd be more future-proof than this patch, but I'm still unconvinced
 about the use-case.

Do we want to intentionally make binary format a second-class citizen?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Snapshot synchronization, again...

2011-02-22 Thread Robert Haas
On Tue, Feb 22, 2011 at 8:00 PM, Joachim Wieland j...@mcknight.de wrote:
 Both Tom and Robert voted quite explicitly against the
 store-in-shared-memory idea.

No, I voted *for* that approach.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Binary in/out for aclitem

2011-02-22 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Feb 22, 2011 at 5:24 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 It'd be more future-proof than this patch, but I'm still unconvinced
 about the use-case.

 Do we want to intentionally make binary format a second-class citizen?

Well, it's not exactly a first-class citizen; compare for instance the
amount of verbiage in the docs about text I/O formats versus the amount
about binary formats.  But my question isn't about that; it's about why
aclitem should be considered a first-class citizen.  It makes me
uncomfortable that client apps are looking at it at all, because any
that do are bound to get broken in the future, even assuming that they
get the right answers today.  I wonder how many such clients are up to
speed for per-column privileges and non-constant default privileges for
instance.  And sepgsql is going to cut them off at the knees.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] psql tab-completion for CREATE UNLOGGED TABLE

2011-02-22 Thread Itagaki Takahiro
Here is a patch to support CREATE UNLOGGED TABLE in psql tab-completion.
It also fixes a bug where DROP is unexpectedly completed with TEMP and UNIQUE,
and cleans up code for DROP OWNED BY in drop_command_generator().

-- 
Itagaki Takahiro


psql_tab_completion_for_unlogged-20110223.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Binary in/out for aclitem

2011-02-22 Thread Robert Haas
[ removing Radoslaw from the CC list, as his email is bouncing every time ]

On Tue, Feb 22, 2011 at 8:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Tue, Feb 22, 2011 at 5:24 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 It'd be more future-proof than this patch, but I'm still unconvinced
 about the use-case.

 Do we want to intentionally make binary format a second-class citizen?

 Well, it's not exactly a first-class citizen; compare for instance the
 amount of verbiage in the docs about text I/O formats versus the amount
 about binary formats.  But my question isn't about that; it's about why
 aclitem should be considered a first-class citizen.  It makes me
 uncomfortable that client apps are looking at it at all, because any
 that do are bound to get broken in the future, even assuming that they
 get the right answers today.  I wonder how many such clients are up to
 speed for per-column privileges and non-constant default privileges for
 instance.  And sepgsql is going to cut them off at the knees.

Well, unfortunately, there's an awful lot of information that can only
be obtained in a reasonable way by introspection of the system
catalogs.  If you want to know whether user A can select from table B,
there's really no sensible way of obtaining that without parsing the
aclitem entries in some fashion, and unfortunately that's just the tip
of the iceberg.  One of the first applications I wrote for PG included
a tool to upgrade the production schema to match the dev schema, which
promptly got broken when (I'm dating myself here[1]) PG added support
for dropping columns (7.3) and recording the grantor on aclitems
(7.4).  I'm not going to claim that there aren't better ways of trying
to solve the problems that I was trying to solve that day, but at the
time it seemed like the best solution, and if I had a dollar for every
other person who has written a similar application, I am pretty sure I
could afford at least a pizza, if not dinner at Fogo de Chao.

Now, if you were to propose adding a well-designed set of DCL commands
to expose this kind of information to clients in a more structured
way, I would be the first to applaud.  LIST TABLES?  SHOW GRANTS TO?
Sign me up!  (I got a request for the latter just today.)  But until
then, if you need this information, you don't have much choice but to
pull it out of the system catalogs; and if JDBC would rather use
binary format to talk to the server, I don't particularly see any
reason to say no.  If we prefer to expose the text format rather
than anything else, that's OK with me, but I do think it would make
sense to expose something.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] Insert obligatory joke about how no one else would.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI bug?

2011-02-22 Thread Dan Ports
On Tue, Feb 22, 2011 at 05:54:49PM -0600, Kevin Grittner wrote:
 I'm not sure it's safe to assume that the index page won't get
 reused before the local lock information is cleared.  In the absence
 of a clear proof that it is safe, or some enforcement mechanism to
 ensure that it is, I don't think we should make this assumption. 
 Off-hand I can't think of a clever way to make this safe which would
 cost less than taking out the LW lock and checking the definitive
 shared memory HTAB, but that might be for lack of creative thinking
 at the moment..

Hmm. Yeah, I wasn't sure about that one, and having now thought about
it some more I think it isn't safe -- consider adding a lock on an
index page concurrently with another backend merging that page into
another one.

The obvious solution to me is to just keep the lock on both the old and
new page. The downside is that because this requires allocating a new
lock and is in a context where we're not allowed to fail, we'll need to
fall back on acquiring the relation lock just as we do for page splits.
I was going to bemoan the extra complexity this would add -- but
actually, couldn't we just replace PredicateLockPageCombine with a call
to PredicateLockPageSplit since they'd now do the same thing?

 The only alternative I see would be to use some form of asynchronous
 notification of the new locks so that the local table can be
 maintained.  That seems overkill without some clear evidence that it
 is needed. 

I agree. It is certainly weird and undesirable that the backend-local
lock table is not always accurate, but I don't see a good way to keep
it up to date without the cure being worse than the disease.

 I *really* wouldn't want to go back to needing LW locks
 to maintain this info; as you know (and stated only for the benefit
 of the list), it was a pretty serious contention point in early
 profiling and adding the local table was a big part of getting an
 early benchmark down from a 14+% performance hit for SSI to a 1.8%
 performance hit.

Yes, it's definitely important for a backend to be able to check
whether it's already holding a lock (even if that's just a hint)
without having to take locks.

Let me add one more piece of info for the benefit of the list: a
backend's local lock table contains not just locks held by the backend,
but also an entry and refcount for every parent of a lock it holds.
This is used to determine when to promote to one of the coarser-grained
parent locks. It's both unnecessary and undesirable for that info to be
in shared memory.
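
As a rough illustration of that last point (the names below are made up, not
the actual predicate.c structures), the backend-local bookkeeping amounts to a
per-parent counter that triggers promotion once enough covered children are
held:

    #include <stdbool.h>

    #define PROMOTION_THRESHOLD 3       /* illustrative only */

    typedef struct
    {
        int     child_locks_held;   /* finer-grained locks covered by this parent */
        bool    promoted;           /* already holding the coarser lock? */
    } LocalParentEntry;

    /* Returns true when the caller should acquire the coarser parent lock. */
    static bool
    note_child_lock(LocalParentEntry *parent)
    {
        parent->child_locks_held++;
        if (!parent->promoted && parent->child_locks_held > PROMOTION_THRESHOLD)
        {
            parent->promoted = true;
            return true;
        }
        return false;
    }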

Dan

-- 
Dan R. K. Ports  MIT CSAILhttp://drkp.net/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Snapshot synchronization, again...

2011-02-22 Thread Alvaro Herrera

The current discussion seems to have reached the same conclusion as the
last one: the snapshot info shouldn't leave the server, but be
transmitted internally -- the idea of the temp file seems prevalent.
Here's an attempt at a detailed spec:

By invoking pg_export_snapshot(), a transaction can export its current
transaction snapshot.

To export a snapshot, the transaction creates a temp file containing all
information about the snapshot, including the BackendId and
TransactionId of the creating transaction.  The file name is determined
internally by the exporting transaction, and returned by the function
that creates the snapshot.  The act of exporting a snapshot causes a
TransactionId to be acquired, if there is none.

Care is taken that no existing snapshot file is ever overwritten, to
prevent security problems.  A transaction may export any number of
snapshots.

The cookie to be passed around, then, is a temp file identifier.  This
cookie gets passed as START TRANSACTION (snapshot='foo'), which sets the
transaction's snapshot to the one determined by the identifier.  The
importing transaction must have isolation level serializable or
repeatable read declared on the same START TRANSACTION command; an
attempt to import a snapshot into a read committed transaction or one
with unspecified isolation level causes the transaction to get into
aborted state (so you still need to say ROLLBACK to get out of it.  This
is necessary to avoid the session to proceed involuntarily with the
wrong snapshot.)

Similarly, if the snapshot is not available, an error is raised and the
transaction block remains in aborted state.  This can happen if the
originating transaction ended after you opened the file, for example.
The BackendId and TransactionId of the exporting transaction let the
importing backend determine the validity of the snapshot beyond the mere
existence of the file.

The snapshot temp file is to be marked for deletion on transaction end.
All snapshot temp files are also deleted on crash recovery, to prevent
dangling files (I need some help here to determine how this plays with
hot standby.)

Privileges: Any role can export or import a snapshot.  The rationale
here is that exporting a snapshot doesn't cause any more harm than
holding a transaction open, so if you let your users do that, then
there's no extra harm.  Similarly, there's no extra privilege given to a
role that imports a snapshot that cannot be obtained by sending a START
TRANSACTION command at the right time.


Note: this proposal doesn't mention restrictions on having
subtransactions in the exporting transaction.  This is because I haven't
figured them out, not because I think there shouldn't be any.
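
For concreteness, here is a libpq sketch of how a client could drive the
proposed interface, assuming pg_export_snapshot() and a START TRANSACTION
(snapshot = '...') option roughly as specified above. None of this exists yet,
the exact way of combining it with the isolation level is a guess, and error
handling is omitted:

    #include <stdio.h>
    #include <libpq-fe.h>

    int
    main(void)
    {
        PGconn     *exporter = PQconnectdb("dbname=test");
        PGconn     *importer = PQconnectdb("dbname=test");
        PGresult   *res;
        char        sql[512];

        PQclear(PQexec(exporter, "BEGIN ISOLATION LEVEL REPEATABLE READ"));
        res = PQexec(exporter, "SELECT pg_export_snapshot()");
        snprintf(sql, sizeof(sql),
                 "START TRANSACTION ISOLATION LEVEL REPEATABLE READ (snapshot = '%s')",
                 PQgetvalue(res, 0, 0));
        PQclear(res);

        /* The importing session now runs with the exporter's snapshot. */
        PQclear(PQexec(importer, sql));

        PQfinish(exporter);
        PQfinish(importer);
        return 0;
    }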

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: cross column correlation ...

2011-02-22 Thread Robert Haas
2011/2/22 PostgreSQL - Hans-Jürgen Schönig postg...@cybertec.at:
 how does it work? we try to find suitable statistics for an arbitrary length 
 list of conditions so that the planner can use it directly rather than 
 multiplying all the selectivities. this should make estimates a lot more 
 precise.
 the current approach can be extended to work with expressions and well as 
 straight conditions.

/me prepares to go down in flames.

Personally, I think the first thing we ought to do is add a real, bona
fide planner hint to override the selectivity calculation manually,
maybe something like this:

WHERE (x < 5 AND y = 1) SELECTIVITY (0.1);

Then, having provided a method for the DBA to extinguish the raging
flames of searing agony which are consuming them while a crocodile
chews off their leg and their boss asks them why they didn't use
Oracle, we can continue bikeshedding about the best way of fixing this
problem in a more user-transparent fashion.

As to the approach you've proposed here, I'm not sure I understand
what this is actually doing.  Selectivity estimates aren't made
directly for predicates; they're made based on MCV and histogram
information for predicates.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Snapshot synchronization, again...

2011-02-22 Thread Robert Haas
On Tue, Feb 22, 2011 at 8:55 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 The current discussion seems to have reached the same conclusion as the
 last one: the snapshot info shouldn't leave the server, but be
 transmitted internally -- the idea of the temp file seems prevalent.
 Here's an attempt at a detailed spec:

 By invoking pg_export_snapshot(), a transaction can export its current
 transaction snapshot.

 To export a snapshot, the transaction creates a temp file containing all
 information about the snapshot, including the BackendId and
 TransactionId of the creating transaction.  The file name is determined
 internally by the exporting transaction, and returned by the function
 that creates the snapshot.  The act of exporting a snapshot causes a
 TransactionId to be acquired, if there is none.

 Care is taken that no existing snapshot file is ever overwritten, to
 prevent security problems.  A transaction may export any number of
 snapshots.

 The cookie to be passed around, then, is a temp file identifier.  This
 cookie gets passed as START TRANSACTION (snapshot='foo'), which sets the
 transaction's snapshot to the one determined by the identifier.  The
 importing transaction must have isolation level serializable or
 repeatable read declared on the same START TRANSACTION command; an
 attempt to import a snapshot into a read committed transaction or one
 with unspecified isolation level causes the transaction to get into
 aborted state (so you still need to say ROLLBACK to get out of it.  This
 is necessary to avoid the session to proceed involuntarily with the
 wrong snapshot.)

 Similarly, if the snapshot is not available, an error is raised and the
 transaction block remains in aborted state.  This can happen if the
 originating transaction ended after you opened the file, for example.
 The BackendId and TransactionId of the exporting transaction let the
 importing backend determine the validity of the snapshot beyond the mere
 existence of the file.

 The snapshot temp file is to be marked for deletion on transaction end.
 All snapshot temp files are also deleted on crash recovery, to prevent
 dangling files (I need some help here to determine how this plays with
 hot standby.)

I think this can be done within StartupXLOG(), right around where we
ResetUnloggedRelations(UNLOGGED_RELATION_CLEANUP).

 Privileges: Any role can export or import a snapshot.  The rationale
 here is that exporting a snapshot doesn't cause any more harm than
 holding a transaction open, so if you let your users do that, then
 there's no extra harm.  Similarly, there's no extra privilege given to a
 role that imports a snapshot that cannot be obtained by sending a START
 TRANSACTION command at the right time.

Agree.

 Note: this proposal doesn't mention restrictions on having
 subtransactions in the exporting transaction.  This is because I haven't
 figured them out, not because I think there shouldn't be any.

I think it's probably sufficient to say that a snapshot can only be
exported from the toplevel transaction.

Overall, this proposal is fine with me.  But then I wasn't the one who
complained about it last time.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Binary in/out for aclitem

2011-02-22 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Feb 22, 2011 at 8:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 ... But my question isn't about that; it's about why
 aclitem should be considered a first-class citizen.  It makes me
 uncomfortable that client apps are looking at it at all, because any
 that do are bound to get broken in the future, even assuming that they
 get the right answers today.  I wonder how many such clients are up to
 speed for per-column privileges and non-constant default privileges for
 instance.  And sepgsql is going to cut them off at the knees.

 Well, unfortunately, there's an awful lot of information that can only
 be obtained in a reasonable way by introspection of the system
 catalogs.  If you want to know whether user A can select from table B,
 there's really no sensible way of obtaining that without parsing the
 aclitem entries in some fashion, and unfortunately that's just the tip
 of the iceberg.

Um, that question is precisely why we invented the has_foo_privilege
class of functions.  I would *much* rather see users applying those
functions than looking at ACLs directly --- and if there's some
reasonable use-case that those don't cover, let's talk about that.
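
As a small illustration of the point, a client can simply ask the server
rather than decoding aclitem itself (names here are examples only; libpq):

    #include <string.h>
    #include <libpq-fe.h>

    /* Returns 1 if "role" may SELECT from "table", 0 otherwise. */
    static int
    can_select(PGconn *conn, const char *role, const char *table)
    {
        const char *params[2] = {role, table};
        PGresult   *res;
        int         ok;

        res = PQexecParams(conn,
                           "SELECT has_table_privilege($1, $2, 'SELECT')",
                           2, NULL, params, NULL, NULL, 0);
        ok = (PQresultStatus(res) == PGRES_TUPLES_OK &&
              strcmp(PQgetvalue(res, 0, 0), "t") == 0);
        PQclear(res);
        return ok;
    }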

 Now, if you were to propose adding a well-designed set of DCL commands
 to expose this kind of information to clients in a more structured
 way, I would be the first to applaud.  LIST TABLES?  SHOW GRANTS TO?
 Sign me up!  (I got a request for the latter just today.)  But until
 then, if you need this information, you don't have much choice but to
 pull it out of the system catalogs; and if JDBC would rather use
 binary format to talk to the server, I don't particularly see any
 reason to say no.  If we prefer to expose the text format rather
 than anything else, that's OK with me, but I do think it would make
 sense to expose something.

Well, to go back to the binary-format issue, if we're going to insist
that all standard types have binary I/O support then we should actually
do that, not accept piecemeal patches that move us incrementally in that
direction without establishing a policy.  To my mind, establishing a
policy would include adding a type_sanity regression test query that
shows there are no missing binary I/O functions.  As of HEAD, we have

postgres=# select typname,typtype from pg_type where typreceive = 0 or typsend 
= 0;
 typname  | typtype 
--+-
 smgr | b
 aclitem  | b
 gtsvector| b
 any  | p
 trigger  | p
 language_handler | p
 internal | p
 opaque   | p
 anyelement   | p
 anynonarray  | p
 anyenum  | p
 fdw_handler  | p
(12 rows)

Possibly we could exclude pseudotypes from the policy, on the grounds
there are never really values of those types anyway.  But other than
that, we have:

smgr: let's just get rid of that useless vestigial type.

aclitem: as per this thread, using the text representation as
the binary representation seems reasonable, or at least it doesn't
make anything any worse.

gtsvector: this is strictly an internal type, so it probably
doesn't need actual I/O support, but we could put in a couple of
dummy functions that just throw ERRCODE_FEATURE_NOT_SUPPORTED.

Maybe the right plan would be to give all the pseudotypes error-throwing
binary I/O functions too.  Then, if anyone lobbies for not throwing an
error (as per what we just did with void), at least it doesn't take
a catversion bump to fix it.
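
For what it's worth, such a dummy function is only a few lines; a rough sketch
in the usual fmgr style (not committed code, and the matching send counterpart
would look the same):

    #include "postgres.h"
    #include "fmgr.h"

    Datum
    gtsvector_recv_unsupported(PG_FUNCTION_ARGS)
    {
        ereport(ERROR,
                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                 errmsg("binary input of type gtsvector is not supported")));
        PG_RETURN_VOID();       /* keep compiler quiet */
    }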

If someone wanted to propose doing all that, I could get behind it.
But I'm not excited about debating this one datatype at a time.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: cross column correlation ...

2011-02-22 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 /me prepares to go down in flames.

 Personally, I think the first thing we ought to do is add a real, bona
 fide planner hint to override the selectivity calculation manually,
 maybe something like this:

 WHERE (x < 5 AND y = 1) SELECTIVITY (0.1);

One of the criteria we've always had for a suitable hint-or-whatever-
you-call-it design is that it *not* involve decorating the queries.
There are a number of reasons for that, some of the killer ones being

(1) People frequently *can't* adjust their queries that way, because
they're coming out of some broken query generator or other.  (Crappy
query generators are of course one of the prime reasons for
poor-performing queries in the first place, so you can't write this off
as not being a key use case.)

(2) Anything we do like that, we'd be locked into supporting forever,
even after we think of better solutions.

(3) People don't like decorating their queries with nonstandard stuff;
it smells of vendor lock-in.  Especially if it's actually SQL syntax
and not comments.  Once you put something into the DML it's just too
hard to fix applications to get rid of it (the inverse case of point
#1).

I haven't looked at Hans' patch in any detail, and don't intend to
do so while the CF is still running; but at least he got this point
right.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: cross column correlation ...

2011-02-22 Thread Robert Haas
On Tue, Feb 22, 2011 at 9:43 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 /me prepares to go down in flames.

 Personally, I think the first thing we ought to do is add a real, bona
 fide planner hint to override the selectivity calculation manually,
 maybe something like this:

 WHERE (x < 5 AND y = 1) SELECTIVITY (0.1);

 One of the criteria we've always had for a suitable hint-or-whatever-
 you-call-it design is that it *not* involve decorating the queries.
 There are a number of reasons for that, some of the killer ones being

 (1) People frequently *can't* adjust their queries that way, because
 they're coming out of some broken query generator or other.  (Crappy
 query generators are of course one of the prime reasons for
 poor-performing queries in the first place, so you can't write this off
 as not being a key use case.)

 (2) Anything we do like that, we'd be locked into supporting forever,
 even after we think of better solutions.

 (3) People don't like decorating their queries with nonstandard stuff;
 it smells of vendor lock-in.  Especially if it's actually SQL syntax
 and not comments.  Once you put something into the DML it's just too
 hard to fix applications to get rid of it (the inverse case of point
 #1).

Those are real problems, but I still want it.  The last time I hit
this problem I spent two days redesigning my schema and adding
triggers all over the place to make things work.  If I had been
dealing with a 30TB database instead of a 300MB database I would have
been royally up a creek.

To put that another way, it's true that some people can't adjust their
queries, but also some people can.  It's true that nonstandard stuff
sucks, but queries that don't work suck, too.  And as for better
solutions, how many major release cycles do we expect people to wait
for them?  Even one major release cycle is an eternity when you're
trying to get the application working before your company runs out of
money, and this particular problem has had a lot of cycles expended on
it without producing anything very tangible (proposed patch, which
like you I can't spare a lot of cycles to look at just now, possibly
excepted).

I agree that if we can get something that actually works that doesn't
involve decorating the queries, that is better.  But I would surely
rather decorate the queries than rewrite the entire application around
the problem.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: cross column correlation ...

2011-02-22 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Feb 22, 2011 at 9:43 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 One of the criteria we've always had for a suitable hint-or-whatever-
 you-call-it design is that it *not* involve decorating the queries.

 [ snip ]
 To put that another way, it's true that some people can't adjust their
 queries, but also some people can.  It's true that nonstandard stuff
 sucks, but queries that don't work suck, too.  And as for better
 solutions, how many major release cycles do we expect people to wait
 for them?

Well, a decorating-the-queries solution that isn't utter crap is not
going to be a small amount of work, either.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TODO: You can alter it, but you can't view it

2011-02-22 Thread Josh Berkus

 Right now pg_options_to_table() is not documented.  Should it be?

Yes, I think so.


-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Correctly producing array literals for prepared statements

2011-02-22 Thread Peter Geoghegan
I'm investigating the possibility of developing a utility function for
our C++ client library, libpqxx, that produces array literals that can
be used in prepared statements. This problem appears to be a bit of a
tar pit, so I'm hoping that someone can help me out. My goal is to
produce a template function that accepts arbitrarily nested standard
library containers, that contain at the most nested level
constants/literals of some type that can be fed into a stream, such as
an int or a std::string.

I'm aware that I cannot assume that types are delimited by a single
quote, even for built-in types. I thought that I would put the onus on
the client to specify the correct delimiter, by checking pg_type
themselves if necessary, but default to ',' . Is this a reasonable
approach?

Escaping/quoting individual elements seems tricky. I have produced a
generic and superficially well behaved implementation by using double
quotes for constants. However, I have now opened the door to malicious
parties injecting multiple array elements where only one is allowed,
or causing malformed array literal errors by simply including a double
quote of their own. It's not clear where the responsibility should
rest for escaping constants/ensuring that constants don't contain
double quotes. Can someone suggest a better approach? I can't very
well use single quotes, because they are escaped/doubled up when we
pass the array literal to something similar to PQexecPrepared(), and
they shouldn't be - strings end up looking like this: 'has errant
single quotes on either side'.
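
For what it's worth, a minimal C sketch of the element-quoting rule (double-
quote each element and backslash-escape embedded double quotes and
backslashes) looks like the following; buffer handling is simplified and it is
not tied to any libpqxx API:

    #include <stdbool.h>
    #include <string.h>

    /* Append one element to an array literal being built in "out" (size outlen). */
    static void
    append_array_element(char *out, size_t outlen, const char *elem, bool first)
    {
        size_t  used = strlen(out);

        if (!first && used + 2 < outlen)
            out[used++] = ',';
        if (used + 2 < outlen)
            out[used++] = '"';
        for (; *elem && used + 3 < outlen; elem++)
        {
            if (*elem == '"' || *elem == '\\')
                out[used++] = '\\';     /* escape the metacharacter */
            out[used++] = *elem;
        }
        if (used + 1 < outlen)
            out[used++] = '"';
        out[used] = '\0';
    }

    /* Usage: start with out = "{", append each element, then append "}". */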

Since Postgres only supports encodings that are ASCII supersets, I
don't believe that I have to consider encoding - only my clients do.

Can someone please point me in the direction of an established client
library/driver where all corner cases are covered, or at least enough
of them to produce a net gain in usefulness? There may well be
additional subtleties that have not occurred to me.

-- 
Regards,
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-22 Thread Greg Stark
On Wed, Feb 23, 2011 at 4:16 AM, Peter Geoghegan
peter.geoghega...@gmail.com wrote:
 Since Postgres only supports encodings that are ASCII supersets, I
 don't believe that I have to consider encoding - only my clients do.


This is only true for server encodings. In a client library I think
you lose on this and do have to deal with it. I'm not sure which
client encodings we support that aren't ASCII supersets, though; it's
possible none of them can generate quote characters this way.

I'm a bit surprised libpqxx isn't using binary mode internally,
though. That would at least avoid the encoding problems. However, I'm
not sure things like the binary array format are stable and portable
enough to use from a client library. Some datatypes might depend on
the server ABI (floats -- I'm looking at you), so that might make it
difficult or impossible.

-- 
greg



Re: [HACKERS] SSI bug?

2011-02-22 Thread Kevin Grittner
Dan Ports  wrote:
 
 The obvious solution to me is to just keep the lock on both the old
 and new page.
 
That's the creative thinking I was failing to do.  Keeping the old
lock will generate some false positives, but they will be rare, and
they don't compromise correctness -- they just carry the cost of
starting the transaction over.  In exchange you buy a savings in
predicate lock acquisition.  In particular, by dodging LWLock
acquisition for a large percentage of calls, you help scalability.
It can't cause false negatives, so correctness is OK -- it's all
about net costs.
 
 I was going to bemoan the extra complexity this would add -- but
 actually, couldn't we just replace PredicateLockPageCombine with a
 call to PredicateLockPageSplit since they'd now do the same thing?
 
I'd be inclined to leave the external interface alone, in case we
conceive of an even better implementation.  We can just remove the
removeOld bool from the parameter list of the static
TransferPredicateLocksToNewTarget function, and keep the behavior
from the false case.
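 
In toy terms (the real code works on lock target tags in a shared hash
table, nothing like the sketch below), the behavior we want on a page
split is simply:

// Toy model only, to illustrate the idea discussed above: a page split
// copies the old page's predicate locks to the new page and deliberately
// leaves the old entries in place.
#include <map>
#include <set>
#include <string>

typedef int PageNumber;
typedef std::string XactName;             // stand-in for a real transaction id
typedef std::set<XactName> LockSet;

std::map<PageNumber, LockSet> page_locks; // predicate locks, per page

void transfer_locks_on_split(PageNumber oldpage, PageNumber newpage)
{
    const LockSet &old_locks = page_locks[oldpage];
    page_locks[newpage].insert(old_locks.begin(), old_locks.end());
    // The old page's locks are kept: that can only yield false positives
    // (extra serialization failures), never false negatives, so
    // correctness is preserved while the removal work is skipped.
}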
 
-Kevin



Re: [HACKERS] WIP: cross column correlation ...

2011-02-22 Thread Nathan Boley
 Personally, I think the first thing we ought to do is add a real, bona
 fide planner hint to override the selectivity calculation manually,
 maybe something like this:

 WHERE (x  5 AND y = 1) SELECTIVITY (0.1);



If you're going to go that far, why not just collect statistics on
that specific predicate?

ie,  ANALYZE SELECTIVITY ON tablename (x, y) WHERE (x  5 AND y = 1);

Then it won't fall subject to all of the pitfalls that Tom outlines below.

Selectivities are easy to estimate if we know the predicate. They only
become hard when they have to work for every possible predicate.

Best,
Nathan



Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-22 Thread Tatsuo Ishii
 This is only true for server encodings. In a client library I think
 you lose on this and do have to deal with it. I'm not sure what client
 encodings we do support that aren't ascii-supersets though, it's
 possible none of them generate quote characters this way.

We have a clear definition of which encodings are for client use
only (mb/pg_wchar.h):

/* followings are for client encoding only */
PG_SJIS,                /* Shift JIS (Winindows-932) */
PG_BIG5,                /* Big5 (Windows-950) */
PG_GBK,                 /* GBK (Windows-936) */
PG_UHC,                 /* UHC (Windows-949) */
PG_GB18030,             /* GB18030 */
PG_JOHAB,               /* EUC for Korean JOHAB */
PG_SHIFT_JIS_2004,      /* Shift-JIS-2004 */
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
