date:20111116

Re: [HACKERS] FlexLocks

2011-11-16 Thread Simon Riggs

On Tue, Nov 15, 2011 at 8:33 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:

  I'm not really enthused by the idea of completely rewriting lwlocks
  for this. Seems like specialised code is likely to be best, as well as
  having less collateral damage.

 Well, the problem that I have with that is that we're going to end up
 with a lot of specialized code, particularly around error recovery.
 This code doesn't remove the need for ProcArrayLock to be taken in
 exclusive mode, and I don't think there IS any easy way to remove the
 need for that to happen sometimes.  So we have to deal with the
 possibility that an ERROR might occur while we hold the lock, which
 means we have to release the lock and clean up our state.  That means
 every place that has a call to LWLockReleaseAll() will now also need
 to cleanup ProperlyCleanUpProcArrayLockStuff().  And the next time we
 need to add some kind of specialized lock, we'll need to do the same
 thing again.  It seems to me that that rapidly gets unmanageable, not
 to mention *slow*.  We need some kind of common infrastructure for
 releasing locks, and this is an attempt to create such a thing.  I'm
 not opposed to doing it some other way, but I think doing each one as
 a one-off isn't going to work out very well.

 I agree.  In fact, I would think that we should look into rewriting the
 sync rep locking (and group commit) on top of flexlocks, not the other
 way around.  As Kevin says nearby it's likely that we could find some
 way to rewrite the SLRU (clog and such) locking protocol using these new
 things too.

I see the commonality between ProcArray locking and Sync Rep/ Group
Commit locking. It's basically the same design, so although it wasn't
my first thought, I agree.

I did originally write that using spinlocks, but that was objected to.
Presumably the same objection would hold here also, but if it doesn't
that's good.

Mixing the above 3 things together is enough for me; I just don't see
the reason to do a global search and replace on the lwlock name in
order to do that. This is 2 patches at same time, 1 we clearly need, 1
I'm not sure about. Perhaps some more explanation about the flexlocks
structure and design will smooth that unease.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Unremovable tuple monitoring

2011-11-16 Thread Yeb Havinga


On 2011-11-15 22:04, Tom Lane wrote:

Robert Haasrobertmh...@gmail.com  writes:

Oh.  I was thinking dead meant no longer visible to anyone.   But
it sounds what we call unremovable here is what we elsewhere call
recently dead.

Would have to look at the code to be sure, but I think that
nonremovable is meant to count both live tuples and
dead-but-still-visible-to-somebody tuples.

The question that I think needs to be asked is why it would be useful
to track this using the pgstats mechanisms.  By definition, the
difference between this and the live-tuple count is going to be
extremely unstable --- I don't say small, necessarily, but short-lived.
So it's debatable whether it's worth memorializing the count obtained
by the last VACUUM at all.  And doing it through pgstats is an expensive
thing.  We've already had push-back about the size of the stats table
on large (lots-o-tables) databases.  Adding another counter will impose
a performance overhead on everybody, whether they care about this number
or not.

What's more, to the extent that I can think of use-cases for knowing
this number, I think I would want a historical trace of it --- that is,
not only the last VACUUM's result but those of previous VACUUM cycles.
So pgstats seems like it's both expensive and useless for the purpose.



Before reviewing this patch I didn't even know these kind of dead rows 
could exist. Now I know it, I expect that if I wanted to know the 
current number, I would start looking at table statistics: pg_stat* or 
perhaps contrib/pgstattuple.


Looking at how that looks with transaction a the old version:

t=# begin TRANSACTION ISOLATION LEVEL repeatable read;
BEGIN
t=# select * from t;
 i  | b
+---
 10 | 2
(1 row)

in transaction b the new version:
t=# select * from t;
 i  | b
+---
 10 | 4
(1 row)

after a vacuum of t:

stat_user_table counts:
n_tup_ins | 1
n_tup_upd | 6
n_tup_del | 0
n_tup_hot_upd | 6
n_live_tup| 2
n_dead_tup| 0
n_unremovable_tup | 1

t=# select * from pgstattuple('t');
-[ RECORD 1 ]--+--
table_len  | 8192
tuple_count| 1
tuple_len  | 32
tuple_percent  | 0.39
dead_tuple_count   | 1
dead_tuple_len | 32
dead_tuple_percent | 0.39
free_space | 8080
free_percent   | 98.63

Apparently pg_stat* counts the recently_dead tuple under n_live_tup, 
else 2 is a wrong number, where pgstattuple counts recently_dead under 
dead_tuple_count. This could be a source of confusion. If there is any 
serious work considered here, IMHO at least the numbers of the two 
different sources of tuple counters should match in terminology and 
actual values. Maybe also if pgstattuple were to include the distinction 
unremovable dead tuples vs dead tuples, a log line by vacuum 
encountering unremovable dead tuples could refer to pgstattuple for 
statistics.


regards,
Yeb Havinga


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Group Commit

2011-11-16 Thread Simon Riggs

On Wed, Nov 16, 2011 at 1:17 AM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Nov 15, 2011 at 7:18 PM, Jeff Janes jeff.ja...@gmail.com wrote:
 When I apply this to HEAD and run make check, it freezes at:
 test tablespace               ...
 [...]
  Does anyone else see this?

 It hangs for me, too, just slightly later:


OK, thanks for checking. I'll investigate.


-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Adding Node support in outfuncs.c and readfuncs.c

2011-11-16 Thread Dimitri Fontaine

Dimitri Fontaine dimi...@2ndquadrant.fr writes:
 Dimitri Fontaine dimi...@2ndquadrant.fr writes:
 What about adding that into src/tools/editors/pgsrc.el?

Should I add an item for that in the commit fest?

-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Core Extensions relocation

2011-11-16 Thread Dimitri Fontaine

Thom Brown t...@linux.com writes:
 None of those others perform such a role.  Instead they add additional
 functionality intended to be utilised as part of general data usage,
 adding new types, operators, query functions etc.  Maybe the term
 core is inappropriate.  Instead we might wish to refer to them as
 utility extensions or something like that, although that may be just
 as vague.

The term “core” here intends to show off that those extensions are
maintained by the core PostgreSQL developer team. If needs be, those
extensions will get updated in minor releases (crash, bugs, security,
etc).

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Syntax for partitioning

2011-11-16 Thread Dimitri Fontaine

Martijn van Oosterhout klep...@svana.org writes:
 That said, I still don't see how you can enforce a unique index over
 multiple segments over something other than the partition key while
 still allowing quick dropping of segments.  If you can fix that you can
 make it work for the current inheritence-style partitioning.

Well the Primary Key and the Physical Map Index do not need to be on the
same set of columns.

 If you happen to drop a part of the data that fits in one or more
 segments (and with a decent fillfactor you need less than 1GB to get
 there), then you can unlink() whole files at a time.  That would be the
 goal here.

 I feel uncomfortable with the happen to. You can add the magic too,
 but for scripting purposes I'd feel better if it could be done via DDL
 also. That way typos don't end up being 5 day queries all of a sudden.

If the data fills less than a segment then you can't unlink() the file,
you have to mark the tuples / pages as free space.  If you have a
partial index matching the whole portion of data you're removing, you
can still drop it before hand — or maybe the system can be instructed to
do so?

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Core Extensions relocation

2011-11-16 Thread Robert Haas

On Tue, Nov 15, 2011 at 5:50 AM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 Thom Brown t...@linux.com writes:
 None of those others perform such a role.  Instead they add additional
 functionality intended to be utilised as part of general data usage,
 adding new types, operators, query functions etc.  Maybe the term
 core is inappropriate.  Instead we might wish to refer to them as
 utility extensions or something like that, although that may be just
 as vague.

 The term “core” here intends to show off that those extensions are
 maintained by the core PostgreSQL developer team. If needs be, those
 extensions will get updated in minor releases (crash, bugs, security,
 etc).

Everything in contrib meets that definition, more or less.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Unremovable tuple monitoring

2011-11-16 Thread Robert Haas

On Wed, Nov 16, 2011 at 3:31 AM, Yeb Havinga yebhavi...@gmail.com wrote:
 Apparently pg_stat* counts the recently_dead tuple under n_live_tup, else 2
 is a wrong number, where pgstattuple counts recently_dead under
 dead_tuple_count. This could be a source of confusion. If there is any
 serious work considered here, IMHO at least the numbers of the two different
 sources of tuple counters should match in terminology and actual values.

+1.

 Maybe also if pgstattuple were to include the distinction unremovable dead
 tuples vs dead tuples, a log line by vacuum encountering unremovable dead
 tuples could refer to pgstattuple for statistics.

Not sure about the log line, but allowing pgstattuple to distinguish
between recently-dead and quite-thoroughly-dead seems useful.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Unremovable tuple monitoring

2011-11-16 Thread Robert Haas

On Tue, Nov 15, 2011 at 10:02 PM, Royce Ausburn royce...@inomial.com wrote:
 Fair enough -- someone knowledgable could set that up if they wanted.  My 
 goal was mostly to have something helpful in the logs.  If that's not 
 something postgres wants/needs Ill drop it =)

IMHO, it's generally not desirable to provided hard-coded
functionality that could easily be duplicated in user-space.  We can't
really know whether the user wants this warning at all, and if so what
the cut-off ought to be for a too old transaction, and how often the
warning should be emitted.  It's far more flexible for the user to set
it up themselves.  Log clutter is a major problem for some users, and
we shouldn't add to it without a fairly compelling reason.

Just to take an example of something that *couldn't* easily be done in
userspace, suppose VACUUM emitted a NOTICE when the number of
recently-dead tuples was more than a certain (configurable?)
percentage of the table.  That's something that VACUUM could calculate
(or at least estimate) while scanning the table, but it wouldn't be so
easy for the user to get that same data, at least not cheaply.  Now
I'm not sure we need that particular thing; it's just an example.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Adding Node support in outfuncs.c and readfuncs.c

2011-11-16 Thread Robert Haas

On Tue, Nov 15, 2011 at 5:41 AM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 Dimitri Fontaine dimi...@2ndquadrant.fr writes:
 Dimitri Fontaine dimi...@2ndquadrant.fr writes:
 What about adding that into src/tools/editors/pgsrc.el?

 Should I add an item for that in the commit fest?

Sounds like a good idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_restore --no-post-data and --post-data-only

2011-11-16 Thread Robert Haas

On Tue, Nov 15, 2011 at 8:19 PM, Joshua Berkus j...@agliodbs.com wrote:
  Here is a patch for that for pg_dump. The sections provided for are
  pre-data, data and post-data, as discussed elsewhere. I still feel that
  anything finer grained should be handled via pg_restore's --use-list
  functionality. I'll provide a patch to do the same switch for pg_restore
  shortly.
 
  Adding to the commitfest.

 Updated version with pg_restore included is attached.

 Functionality review:

 I have tested the backported version of this patch using a 500GB production 
 database with over 200 objects and it worked as specified.

 This functionality is extremely useful for the a variety of selective copying 
 of databases, including creating shrunken test instances, ad-hoc parallel 
 dump, differently indexed copies, and sanitizing copies of sensitive data, 
 and even bringing the database up for usage while the indexes are still 
 building.

 Note that this feature has the odd effect that some constraints are loaded at 
 the same time as the tables and some are loaded with the post-data.  This is 
 consistent with how text-mode pg_dump has always worked, but will seem odd to 
 the user.  This also raises the possibility of a future pg_dump/pg_restore 
 optimization.

That does seem odd.  Why do we do it that way?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Adding Node support in outfuncs.c and readfuncs.c

2011-11-16 Thread Dimitri Fontaine

Robert Haas robertmh...@gmail.com writes:
 Should I add an item for that in the commit fest?

 Sounds like a good idea.

Done: https://commitfest.postgresql.org/action/patch_view?id=707

Note: I might also add support for equalfuncs and copyfuncs while at,
been doing that again and I guess I would just M-x it next time.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Core Extensions relocation

2011-11-16 Thread Dimitri Fontaine

Robert Haas robertmh...@gmail.com writes:
 The term “core” here intends to show off that those extensions are
 maintained by the core PostgreSQL developer team. If needs be, those
 extensions will get updated in minor releases (crash, bugs, security,
 etc).

 Everything in contrib meets that definition, more or less.

Yeah? It would only mean that Josh Berkus complaint about the naming is
to be followed.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] includeifexists in configuration file

2011-11-16 Thread Euler Taveira de Oliveira

On 16-11-2011 02:28, Greg Smith wrote:
 By recent popular request in the ongoing discussion saga around merging the
 recovery.conf, I've added an includeifexists directive to the
 postgresql.conf in the attached patch.
 
I'm not following the merging recovery.conf thread but isn't it worth emitting
at least an WARNING message when the file does not exist?

Something like

WARNING:  could not open configuration file /foo/missing.conf, skipping

Let's suppose a DBA is using this new feature to include some general company
recommendations. If (s)he mistyped the name of the file, the general
recommendations will not be applied and the DBA won't be even warned. That's
not what a DBA would expect.


-- 
   Euler Taveira de Oliveira - Timbira   http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minor optimisation of XLogInsert()

2011-11-16 Thread Robert Haas

On Tue, Nov 15, 2011 at 6:24 AM, Simon Riggs si...@2ndquadrant.com wrote:
 Patch adds cacheline padding around parts of XLogCtl.

 Idea from way back, but never got performance tested in a situation
 where WALInsertLock was the bottleneck.

 So this is speculative, in need of testing to investigate value.

I kicked off a round of benchmarking around this.  I'll post results
in a few hours when the run finishes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Unremovable tuple monitoring

2011-11-16 Thread Tom Lane

Robert Haas robertmh...@gmail.com writes:
 Not sure about the log line, but allowing pgstattuple to distinguish
 between recently-dead and quite-thoroughly-dead seems useful.

The dividing line is enormously unstable though.  pgstattuple's idea of
RecentGlobalXmin could even be significantly different from that of a
concurrently running VACUUM.  I can see the point of having VACUUM log
what it did, but opinions from the peanut gallery aren't worth much.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] how to get tuple

2011-11-16 Thread Rudyar


Hello,

I'm new in postgreSQL programming. I try to print a tuple from 
tupleTableSlot structure..

for example..

outerTupleSlot = ExecHashJoinOuterGetTuple(outerNode,
   node,
hashvalue);

how to extract a tuple value?

there are any kind of documentation about that?

Regards.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] strict aliasing

2011-11-16 Thread Kevin Grittner

Ants Aasma ants.aa...@eesti.ee wrote:
 
 Concurrency 8 results should probably be ignored - variance was
 huge, definitely more than the differences.
 
I'm not so sure it should be ignored -- one thing I noticed in
looking at the raw numbers from my benchmarks was that the -O2 code
was much more consistent from run to run than the -O3 code.  I doubt
that the more aggressive optimizations were developed under NUMA
architecture, and I suspect that the aggressively optimized code may
be more sensitive to whether memory is directly accessed by the core
running the process or routed though the memory controller on
another core.  (I hit on this idea this morning when I remembered
seeing similar variations in run times of STREAM against our new
servers with NUMA.)
 
This suggests that in the long term, it might be worth investigating
whether we can arrange for a connection's process to have some
degree of core affinity and encourage each process to allocate local
memory from RAM controlled by that core.  To some extent I would
expect the process-based architecture of PostgreSQL to help with
that, as you would expect a NUMA-aware OS to try to arrange that to
some degree.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] includeifexists in configuration file

2011-11-16 Thread Tom Lane

Euler Taveira de Oliveira eu...@timbira.com writes:
 On 16-11-2011 02:28, Greg Smith wrote:
 By recent popular request in the ongoing discussion saga around merging the
 recovery.conf, I've added an includeifexists directive to the
 postgresql.conf in the attached patch.

 I'm not following the merging recovery.conf thread but isn't it worth emitting
 at least an WARNING message when the file does not exist?

 Something like

 WARNING:  could not open configuration file /foo/missing.conf, skipping

Minor note here: people keep thinking that WARNING  LOG with respect to
messages that can only go to the server log.  This is not correct ...
LOG would be the right elevel to use.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] ToDo: pg_backup - using a conditional DROP

2011-11-16 Thread Peter Eisentraut

On tis, 2011-11-15 at 22:27 -0500, Tom Lane wrote:
 Now, --clean using DROP IF EXISTS would be targeted at, um, what case?
 I guess the idea is to be able to load into a database that sort of
 mostly shares the same schema as the one you dumped from, only it's
 not the same (if it were the same, you'd not need IF EXISTS). 

What about a schema that is nominally the same, but some object is
missing, because some earlier attempt to clean things up, or because
it's an earlier version of the schema, or because of some batch jobs
make things come and go.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Unremovable tuple monitoring

2011-11-16 Thread Yeb Havinga


On 2011-11-16 15:34, Tom Lane wrote:

Robert Haasrobertmh...@gmail.com  writes:

Not sure about the log line, but allowing pgstattuple to distinguish
between recently-dead and quite-thoroughly-dead seems useful.

The dividing line is enormously unstable though.  pgstattuple's idea of
RecentGlobalXmin could even be significantly different from that of a
concurrently running VACUUM.  I can see the point of having VACUUM log
what it did, but opinions from the peanut gallery aren't worth much.


I don't understand your the last remark so I want to get it clear: I 
looked up peanut gallery on the wiki. Is 'opinion from the peanut 
gallery' meant to describe my comments as patch reviewer? I'd appreciate 
brutal honesty on this point.


thanks
Yeb


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Unremovable tuple monitoring

2011-11-16 Thread Robert Haas

On Wed, Nov 16, 2011 at 9:34 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Not sure about the log line, but allowing pgstattuple to distinguish
 between recently-dead and quite-thoroughly-dead seems useful.

 The dividing line is enormously unstable though.  pgstattuple's idea of
 RecentGlobalXmin could even be significantly different from that of a
 concurrently running VACUUM.  I can see the point of having VACUUM log
 what it did, but opinions from the peanut gallery aren't worth much.

Hmm, you have a point.

But as Yeb points out, it seems like we should at least try to be more
consistent about the terminology.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Unremovable tuple monitoring

2011-11-16 Thread Tom Lane

Yeb Havinga yebhavi...@gmail.com writes:
 On 2011-11-16 15:34, Tom Lane wrote:
 The dividing line is enormously unstable though.  pgstattuple's idea of
 RecentGlobalXmin could even be significantly different from that of a
 concurrently running VACUUM.  I can see the point of having VACUUM log
 what it did, but opinions from the peanut gallery aren't worth much.

 I don't understand your the last remark so I want to get it clear: I 
 looked up peanut gallery on the wiki. Is 'opinion from the peanut 
 gallery' meant to describe my comments as patch reviewer? I'd appreciate 
 brutal honesty on this point.

No no no, sorry if you read that as a personal attack.  I was trying to
point out that a process running pgstattuple does not have a value of
RecentGlobalXmin that should be considered authoritative --- it is only
a bystander, not the process that might do actual cleanup work.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] strict aliasing

2011-11-16 Thread Robert Haas

On Wed, Nov 16, 2011 at 9:47 AM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 This suggests that in the long term, it might be worth investigating
 whether we can arrange for a connection's process to have some
 degree of core affinity and encourage each process to allocate local
 memory from RAM controlled by that core.  To some extent I would
 expect the process-based architecture of PostgreSQL to help with
 that, as you would expect a NUMA-aware OS to try to arrange that to
 some degree.

I've done some testing on HP/UX-Itanium and have not been able to
demonstrate any significant performance benefit from overriding the
operating system's default policies regarding processor affinity.  For
example, I hacked the code to request that the shared memory be
allocated as cell-local memory, then used mpsched with the FILL_TREE
policy to bind everything to a single cell, and sure enough it all ran
in that cell, but it wasn't any better than 4 clients running on
different cells with the shared memory segment allocated interleaved.
This result didn't really make much sense to me, because it seemed
like it SHOULD have helped.  So it's possible I did something wrong.
But if so, I couldn't find it.  The other possibility is that the OS
is smart enough about moving things around to get good locality that
sticking locality hints on top doesn't really make any difference.
Certainly, I would expect any OS to be smart enough to allocate
backend-local memory on the same processor where the task is running,
and to avoid moving processes between cells more than necessary.

Regarding results instability, on some patch sets I've tried, I've
seen very unstable performance.  I've also noticed that a very short
run sometimes gives much higher performance than a longer run.  My
working theory is that this is the result of spinlock contention.
Suppose you have a spinlock that is getting passed around very quickly
between, say, 32 processes.  Since the data protected by the spinlock
is on the same cache line as the lock, what ideally happens is that
the process gets the lock and then finishes its work and releases the
lock before anyone else tries to pull the cache line away.  And at the
beginning of the run, that's what does actually happen.  But then for
some reason a process gets context-switched out while it holds the
lock, or maybe it's just that somebody gets unlucky enough to have the
cache line stolen before they can dump the spinlock and can't quite
get it back fast enough.  Now people start to pile up trying to get
that spinlock, and that means trouble, because now it's much harder
for any given process to get the cache line and finish its work before
the cache line gets stolen away.  So my theory is that now the
performance goes down more or less permanently, unless or until
there's some momentary break in the action that lets the queue of
people waiting for that spinlock drain out.  This is just a wild-ass
guess, and I might be totally wrong...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] includeifexists in configuration file

2011-11-16 Thread Robert Haas

On Wed, Nov 16, 2011 at 12:28 AM, Greg Smith g...@2ndquadrant.com wrote:
 By recent popular request in the ongoing discussion saga around merging the
 recovery.conf, I've added an includeifexists directive to the
 postgresql.conf in the attached patch.

I haven't read the code yet, but just to get the bikeshedding started,
I think it might be better to call this include_if_exists rather than
running it together as one word.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] CommitFest 2011-11 Started

2011-11-16 Thread Greg Smith

The November CommitFest is now closed for new entries.  We have 30 
patches in the queue that are still looking for a reviewer at this 
point, out of a total of 53.  If you'd like to review a patch but are 
looking for a suggestion as to which to choose, e-mail the 
pgsql-rrreviewers list saying so.


4 patches have already received some early review and are waiting for 
updates from the author:


Allow toast tables to be moved to a different tablespace
Online base backup from the hot-standby
Separate pg_stat_activity into current_query into state and query columns
Allow substitute allocator for PGresult.

And we have 5 earlier submissions that have been flagged ready for a 
committer to look at them; names here are the expected committer when 
one has been mentioned already:


Non-inheritable check constraints (Greg Stark)
pg_last_xact_insert_timestamp (Simon)
Add Support for building with Visual Studio 2010 (Magnus)
plperl verify utf8 strings
xsubpp from cpan

--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Core Extensions relocation

2011-11-16 Thread Robert Haas

On Wed, Nov 16, 2011 at 9:20 AM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 Robert Haas robertmh...@gmail.com writes:
 The term “core” here intends to show off that those extensions are
 maintained by the core PostgreSQL developer team. If needs be, those
 extensions will get updated in minor releases (crash, bugs, security,
 etc).

 Everything in contrib meets that definition, more or less.

 Yeah? It would only mean that Josh Berkus complaint about the naming is
 to be followed.

I am not sure I'm quite following you, but I'm unaware that there are
some contrib modules that we maintain more than others.  Bugs and
security vulnerabilities in any of them are typically fixed when
reported.  Now, sometimes we might not be able to fix a bug because of
some architectural deficiency, but that also happens in the server -
consider, for example, the recent discussion of creating a table in a
schema that is concurrently being dropped, which is likely to require
far more invasive fixing than we are probably willing to do anywhere
other than master.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] FlexLocks

2011-11-16 Thread Kevin Grittner

Simon Riggs si...@2ndquadrant.com wrote:
 
 I just don't see the reason to do a global search and replace on
 the lwlock name
 
I was going to review further before commenting on that, but since
it has now come up -- it seems odd that a source file which uses
only LW locks needs to change so much for the FlexLock
implementation.  I'm not sure source code which uses the next layer
up from FlexLock should need to be aware of it quite so much.  I
assume that this was done to avoid adding a layer to some code where
it could cause an unnecessary performance hit, but maybe it would be
worth just wrapping with a macro to isolate the levels where that's
all it takes.
 
For example, if these two macros were defined, predicate.c wouldn't
have needed any modifications, and I suspect that is true of many
other files (although possibly needing a few other macros):
 
#define LWLockId FlexLockId
#define LWLockHeldByMe(lock) FlexLockHeldByMe(lock)
 
Particularly with the function call it seems like it's a mistake to
assume that test will always be the same between LW locks and flex
locks.  There may be a better way to do it than the above, but I
think a worthy goal would be to impose zero source code changes on
code which continues to use traditional lightweight locks.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] FlexLocks

2011-11-16 Thread Robert Haas

On Wed, Nov 16, 2011 at 10:26 AM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 For example, if these two macros were defined, predicate.c wouldn't
 have needed any modifications, and I suspect that is true of many
 other files (although possibly needing a few other macros):

 #define LWLockId FlexLockId
 #define LWLockHeldByMe(lock) FlexLockHeldByMe(lock)

 Particularly with the function call it seems like it's a mistake to
 assume that test will always be the same between LW locks and flex
 locks.  There may be a better way to do it than the above, but I
 think a worthy goal would be to impose zero source code changes on
 code which continues to use traditional lightweight locks.

Well, it would certainly be easy enough to add those macros, and I'm
not necessarily opposed to it, but I fear it could end up being a bit
confusing in the long run.  If we adopt this infrastructure, then I
expect knowledge of different types of FlexLocks to gradually
propagate through the system.  Now, you're always going to use
LWLockAcquire() and LWLockRelease() to acquire and release an LWLock,
but a FlexLockId isn't guaranteed to be an LWLockId - any given value
might also refer to a FlexLock of some other type.  If we let everyone
continue to refer to those things as LWLockIds, then it seems like
only a matter of time before someone has a variable that's declared as
LWLockId but actually doesn't refer to an LWLock at all.  I think it's
better to bite the bullet and do the renaming up front, rather than
having to think about it every time you modify some code that uses
LWLockId or LWLockHeldByMe and say to yourself, oh, wait a minute, is
this really guaranteed to be an LWLock?

For LWLockHeldByMe, a sensible compromise might be to add a function
that asserts that the FlexLockId passed as an argument is in fact
pointing to an LWLock, and then calls FlexLockHeldByMe() and returns
the result.  That way you'd presumably noticed if you used the more
specific function when you needed the more general one (because,
hopefully, the regression tests would fail).  But I'm not seeing any
obvious way of providing a similar degree of insulation against
abusing LWLockId.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] FlexLocks

2011-11-16 Thread Tom Lane

Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Simon Riggs si...@2ndquadrant.com wrote:
 I just don't see the reason to do a global search and replace on
 the lwlock name
 
 I was going to review further before commenting on that, but since
 it has now come up -- it seems odd that a source file which uses
 only LW locks needs to change so much for the FlexLock
 implementation.

Yeah, -1 on wideranging source changes for me too.  There is no reason
that the current LWLock API need change.  (I'm not saying that it has
to be same ABI though --- macro wrappers would be fine.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] FlexLocks

2011-11-16 Thread Tom Lane

Robert Haas robertmh...@gmail.com writes:
 Well, it would certainly be easy enough to add those macros, and I'm
 not necessarily opposed to it, but I fear it could end up being a bit
 confusing in the long run.  If we adopt this infrastructure, then I
 expect knowledge of different types of FlexLocks to gradually
 propagate through the system.  Now, you're always going to use
 LWLockAcquire() and LWLockRelease() to acquire and release an LWLock,
 but a FlexLockId isn't guaranteed to be an LWLockId - any given value
 might also refer to a FlexLock of some other type.  If we let everyone
 continue to refer to those things as LWLockIds, then it seems like
 only a matter of time before someone has a variable that's declared as
 LWLockId but actually doesn't refer to an LWLock at all.  I think it's
 better to bite the bullet and do the renaming up front, rather than
 having to think about it every time you modify some code that uses
 LWLockId or LWLockHeldByMe and say to yourself, oh, wait a minute, is
 this really guaranteed to be an LWLock?

In that case, I think you've chosen an unfortunate naming convention
and should rethink it.  There is not any benefit to be gained from a
global search and replace here, and as somebody who spends quite enough
time dealing with cross-branch coding differences already, I'm going to
put my foot down about introducing a useless one.

Perhaps it would be better to think of this as they're all lightweight
locks, but some have different locking policies.  Or we're taking a
different type of lock on this particular lock --- that would match up
rather better with the way we think about heavyweight locks.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] FlexLocks

2011-11-16 Thread Robert Haas

On Wed, Nov 16, 2011 at 10:51 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Well, it would certainly be easy enough to add those macros, and I'm
 not necessarily opposed to it, but I fear it could end up being a bit
 confusing in the long run.  If we adopt this infrastructure, then I
 expect knowledge of different types of FlexLocks to gradually
 propagate through the system.  Now, you're always going to use
 LWLockAcquire() and LWLockRelease() to acquire and release an LWLock,
 but a FlexLockId isn't guaranteed to be an LWLockId - any given value
 might also refer to a FlexLock of some other type.  If we let everyone
 continue to refer to those things as LWLockIds, then it seems like
 only a matter of time before someone has a variable that's declared as
 LWLockId but actually doesn't refer to an LWLock at all.  I think it's
 better to bite the bullet and do the renaming up front, rather than
 having to think about it every time you modify some code that uses
 LWLockId or LWLockHeldByMe and say to yourself, oh, wait a minute, is
 this really guaranteed to be an LWLock?

 In that case, I think you've chosen an unfortunate naming convention
 and should rethink it.  There is not any benefit to be gained from a
 global search and replace here, and as somebody who spends quite enough
 time dealing with cross-branch coding differences already, I'm going to
 put my foot down about introducing a useless one.

 Perhaps it would be better to think of this as they're all lightweight
 locks, but some have different locking policies.  Or we're taking a
 different type of lock on this particular lock --- that would match up
 rather better with the way we think about heavyweight locks.

I struggled a lot with how to name this, and I'm not going to pretend
that what I came up with is necessarily ideal.  But the basic idea
here is that all FlexLocks share the following properties in common:

- they are identified by a FlexLockId
- they are released by FlexLockReleaseAll
- they make use of the lwlock-related fields (renamed in the patch) in
PGPROC for sleep and wakeup handling
- they have a type indicator, a mutex, a retry flag, and a wait queue

But the following things are different per-type:

- acquire, conditional acquire (if any), and release functions
- available lock modes
- additional data fields that are part of the lock

Now, it seemed to me that if I was going to split the LWLock facililty
into two layers, either the upper layer could be LWLocks, or the lower
layer could be LWLocks, but they couldn't both be LWLocks.  Since we
use LWLockAcquire() and LWLockRelease() all over the place but only
make reference to LWLockId in comparatively few places, it seemed to
me to be by far the less invasive renaming to make the upper layer be
LWLocks and the lower layer be something else.

Now maybe there is some better way to do this, but at the moment, I'm
not seeing it.  If we call them all LWLocks, but only some of them
support LWLockAcquire(), then that's going to be pretty weird.  The
situation is not really analagous to heavy-weight locks, where every
lock supports every lock mode, but in practice only table locks make
use of them all.  In this particular case, we do not want to clutter
up the vanilla LWLock implementation with a series of special cases
that are only useful for a minority of locks in the system.  That will
cause them to stop being lightweight, which misses the point; and it
will be ugly as hell, because the exact frammishes needed will
doubtless vary from one lock to another, and having just one lock type
that supports every single one of those frammishes is certainly a bad
idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] FlexLocks

2011-11-16 Thread Greg Stark

On Wed, Nov 16, 2011 at 3:41 PM, Robert Haas robertmh...@gmail.com wrote:
  Now, you're always going to use
 LWLockAcquire() and LWLockRelease() to acquire and release an LWLock,
 but a FlexLockId isn't guaranteed to be an LWLockId - any given value
 might also refer to a FlexLock of some other type.  If we let everyone
 continue to refer to those things as LWLockIds, then it seems like
 only a matter of time before someone has a variable that's declared as
 LWLockId but actually doesn't refer to an LWLock at all.  I think it's
 better to bite the bullet and do the renaming up front, rather than
 having to think about it every time you modify some code that uses
 LWLockId or LWLockHeldByMe and say to yourself, oh, wait a minute, is
 this really guaranteed to be an LWLock?

But that's an advantage to having a distinct API layer for LWLock
instead of having callers directly call FlexLock methods. The LWLock
macros can AssertMacro that the lockid they were passed are actually
LWLocks and not some other type of lock. That would require assigning
FlexLockIds that are recognizably LWLocks but that's not implausible
is it?

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] FlexLocks

2011-11-16 Thread Kevin Grittner

Robert Haas robertmh...@gmail.com wrote:
 
 Now maybe there is some better way to do this, but at the moment,
 I'm not seeing it.  If we call them all LWLocks, but only some of
 them support LWLockAcquire(), then that's going to be pretty
 weird. 
 
Is there any way to typedef our way out of it, such that a LWLock
*is a* FlexLock, but a FlexLock isn't a LWLock?  If we could do
that, you couldn't use just a plain old FlexLock in LWLockAcquire(),
but you could do the cleanups, etc., that you want.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] strict aliasing

2011-11-16 Thread Kevin Grittner

Robert Haas robertmh...@gmail.com wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 
 This suggests that in the long term, it might be worth [...]
 
 The other possibility is that the OS is smart enough about moving
 things around to get good locality that sticking locality hints on
 top doesn't really make any difference.  Certainly, I would expect
 any OS to be smart enough to allocate backend-local memory on the
 same processor where the task is running, and to avoid moving
 processes between cells more than necessary.
 
Right.  I'm not sure that it will make any more sense to do this
than to do raw access to a disk partition.  I don't think it's a
given that we can do a better job of this than the OS does.
 
 Regarding results instability [...] My working theory is that this
 is the result of spinlock contention.
 
 So my theory is that now the performance goes down more or less
 permanently, unless or until there's some momentary break in the
 action that lets the queue of people waiting for that spinlock
 drain out.  This is just a wild-ass guess, and I might be totally
 wrong...
 
Well, I suspect that you're basing that guess on enough evidence
that it's more likely to be right than the wild-assed guesses I've
been throwing out there.  :-)  I can't say it's inconsistent with
anything I've seen.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] FlexLocks

2011-11-16 Thread Robert Haas

On Wed, Nov 16, 2011 at 11:14 AM, Greg Stark st...@mit.edu wrote:
 On Wed, Nov 16, 2011 at 3:41 PM, Robert Haas robertmh...@gmail.com wrote:
  Now, you're always going to use
 LWLockAcquire() and LWLockRelease() to acquire and release an LWLock,
 but a FlexLockId isn't guaranteed to be an LWLockId - any given value
 might also refer to a FlexLock of some other type.  If we let everyone
 continue to refer to those things as LWLockIds, then it seems like
 only a matter of time before someone has a variable that's declared as
 LWLockId but actually doesn't refer to an LWLock at all.  I think it's
 better to bite the bullet and do the renaming up front, rather than
 having to think about it every time you modify some code that uses
 LWLockId or LWLockHeldByMe and say to yourself, oh, wait a minute, is
 this really guaranteed to be an LWLock?

 But that's an advantage to having a distinct API layer for LWLock
 instead of having callers directly call FlexLock methods. The LWLock
 macros can AssertMacro that the lockid they were passed are actually
 LWLocks and not some other type of lock. That would require assigning
 FlexLockIds that are recognizably LWLocks but that's not implausible
 is it?

Well, that works for the most part.  You still need a few generic
functions, like FlexLockReleaseAll(), which releases all FlexLocks of
all types, not just those of some particular type.  And it doesn't
solve the problem with FlexLockId, which can potentially refer to a
FlexLock of any type, not just a LWLock.

I think we might be getting slightly more excited about this problem
than it actually deserves.  Excluding lwlock.{c,h}, the new files
added by this patch, and the documentation changes, this patch adds
103 lines and removes 101.  We can uncontroversially reduce each
numbers by 14 by adding a separate LWLockHeldByMe() function that does
the same thing as FlexLockHeldByMe() but also asserts the lock type.
That would leave us adding 89 lines of code and removing 87.

If we (against my better judgement) take the position that we must
continue to use LWLockId rather than FlexLockId as the type name in
any place that only treats with LWLocks we could reduce each of those
numbers by an additional 34, giving new totals of 55 and 53 lines of
changes outside the lwlock/flexlock code itself rather than 89 and 87.
 I humbly submit that this is not really enough to get excited about.
We've make far more sweeping notational changes than that more than
once - even, I think, with some regularity.

This may seem invasive because it's touching LWLocks, and we use those
everywhere, but in practice the code footprint is very small because
typical usage is just LWLockAcquire(BlahLock) and then
LWLockRelease(BlahLock).  And I'm not proposing to change that usage
in any way; avoiding any change in that area was, in fact, one of my
main design goals.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] FlexLocks

2011-11-16 Thread Robert Haas

On Wed, Nov 16, 2011 at 11:17 AM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Robert Haas robertmh...@gmail.com wrote:

 Now maybe there is some better way to do this, but at the moment,
 I'm not seeing it.  If we call them all LWLocks, but only some of
 them support LWLockAcquire(), then that's going to be pretty
 weird.

 Is there any way to typedef our way out of it, such that a LWLock
 *is a* FlexLock, but a FlexLock isn't a LWLock?  If we could do
 that, you couldn't use just a plain old FlexLock in LWLockAcquire(),
 but you could do the cleanups, etc., that you want.

Well, if we just say:

typedef FlexLockId LWLockId;

...that's about equivalent to the #define from the compiler's point of
view.  We could alternatively change one or the other of them to be a
struct with one member, but I think the cure might be worse than the
disease.  By my count, we are talking about saving perhaps as many as
34 lines of code changes here, and that's only if complicating the
type handling doesn't require any changes to places that are untouched
at present, which I suspect it would.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] declarations of range-vs-element @ and @

2011-11-16 Thread Tom Lane

Why do these use anynonarray rather than anyelement?  Given that we
support ranges of arrays (there's even a regression test), this seems
a bogus limitation.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] FlexLocks

2011-11-16 Thread Kevin Grittner

Robert Haas robertmh...@gmail.com wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 
 Is there any way to typedef our way out of it [?]
 
 Well, if we just say:
 
 typedef FlexLockId LWLockId;
 
 ...that's about equivalent to the #define from the compiler's
 point of view.
 
Bummer -- I was hoping there was some equivalent to subclassing
that I just didn't know about.  :-(
 
 We could alternatively change one or the other of them to be a
 struct with one member, but I think the cure might be worse than
 the disease.  By my count, we are talking about saving perhaps as
 many as 34 lines of code changes here, and that's only if
 complicating the type handling doesn't require any changes to
 places that are untouched at present, which I suspect it would.
 
So I stepped through all the changes of this type, and I notice that
most of them are in areas where we've talked about likely benefits
of creating new FlexLock variants instead of staying with LWLocks;
if any of that is done (as seems likely), it further reduces the
impact from 34 lines.  If we take care of LWLockHeldByMe() as you
describe, I'll concede the FlexLockId changes.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] ISN was: Core Extensions relocation

2011-11-16 Thread Josh Berkus

On 11/15/11 7:40 PM, Tom Lane wrote:
 Peter Geoghegan pe...@2ndquadrant.com writes:
 If we can't put isn out of its misery, a sensible compromise would be
 to rip out the prefix enforcement feature so that, for example, ISBN13
 behaved exactly the same as EAN13.
 
 That might be a reasonable compromise.  Certainly the check-digit
 calculation is much more useful for validity checking than the prefix
 checking.

Sounds good to me.  FWIW, I know that ISBN is being used for some
library software, so a backwards-compatible fix would be very desirable.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Adding Node support in outfuncs.c and readfuncs.c

2011-11-16 Thread Peter Eisentraut

On ons, 2011-11-09 at 20:14 +0100, Dimitri Fontaine wrote:
 The task in $subject is something I will have to do repeatedly for
 completing the Command Trigger patch.  I've been doing some of them
 manually, covering initdb.  Then I've been scripting away the editing.
 
 The script takes a Node number as input (because that's what you're
 given in ERROR messages) and as an output will edit outfuncs.c and
 readfuncs.c for you.
 
 That's only intended as a developer friendly help, not a part of the
 build process or the like, and I've been writing that in Emacs Lisp --
 that's what make most sense to me (I don't do perl). Now, I intend to
 be
 maintaining the script if needs be, and it could be useful for others
 too.
 
 What about adding that into src/tools/editors/pgsrc.el?

This is a massive amount of code that very few people in our community
will use, and very few be able to maintain it, too.  If you want to make
a lasting contribution on this area, write a program that generates the
node handling functionality automatically as part of the build process.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Adding Node support in outfuncs.c and readfuncs.c

2011-11-16 Thread Dimitri Fontaine

Peter Eisentraut pete...@gmx.net writes:
 This is a massive amount of code that very few people in our community
 will use, and very few be able to maintain it, too.  If you want to make
 a lasting contribution on this area, write a program that generates the
 node handling functionality automatically as part of the build process.

Can emacs --batch be used there?  If not, apart from C and perl, what
can I use?

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Patch to allow users to kill their own queries

2011-11-16 Thread Edward Muller

Looking for comments ...

https://gist.github.com/be937d3a7a5323c73b6e

We'd like to get this, or something like it, into 9.2

PS Subscribing to psql-hack...@postgresql.org just spins for me.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch to allow users to kill their own queries

2011-11-16 Thread Kevin Grittner

Edward Muller edw...@heroku.com wrote:
 
 Looking for comments ...
 
 https://gist.github.com/be937d3a7a5323c73b6e
 
 We'd like to get this, or something like it, into 9.2
 
If you want it to be seriously considered, you should post the patch
to this list, which makes it part of the permanent archives and
indicates your willingness to place the code under the PostgreSQL
license.
 
http://wiki.postgresql.org/wiki/Submitting_a_Patch#Patch_submission
 
Unfortunately, you're a day late in terms of getting it reviewed in
the just-started CommitFest, but another will start in two months.
 
https://commitfest.postgresql.org/action/commitfest_view/open
 
Of course, you're free to talk about it until then, and perhaps
someone will take a look at it before the next CF; but a lot of
attention will be focused on the patches submitted by the start of
the current CF.
 
Any chance you could help review patches in the CF in progress, to
help get others' time freed up sooner?:
 
https://commitfest.postgresql.org/action/commitfest_view/inprogress
 
It's a good way to become familiar with the process, so that you can
better move your own patch along.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minor optimisation of XLogInsert()

2011-11-16 Thread Robert Haas

On Wed, Nov 16, 2011 at 9:33 AM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Nov 15, 2011 at 6:24 AM, Simon Riggs si...@2ndquadrant.com wrote:
 Patch adds cacheline padding around parts of XLogCtl.

 Idea from way back, but never got performance tested in a situation
 where WALInsertLock was the bottleneck.

 So this is speculative, in need of testing to investigate value.

 I kicked off a round of benchmarking around this.  I'll post results
 in a few hours when the run finishes.

I thought that the best way to see any possible benefit of this patch
would be to use permanent tables (since unlogged tables won't
generated enough XLogInsert traffic to throttle the system) and lots
of clients.  So I ran tests at 32 clients and at 80 clients.  There
appeared to be a very small speedup.

resultswp.master.32:tps = 10275.238580 (including connections establishing)
resultswp.master.32:tps = 10231.634195 (including connections establishing)
resultswp.master.32:tps = 10220.907852 (including connections establishing)
resultswp.master.32:tps = 10115.534399 (including connections establishing)
resultswp.master.32:tps = 10048.268430 (including connections establishing)
resultswp.xloginsert-padding.32:tps = 10310.046371 (including
connections establishing)
resultswp.xloginsert-padding.32:tps = 10269.839056 (including
connections establishing)
resultswp.xloginsert-padding.32:tps = 10259.268030 (including
connections establishing)
resultswp.xloginsert-padding.32:tps = 10242.861923 (including
connections establishing)
resultswp.xloginsert-padding.32:tps = 10167.947496 (including
connections establishing)

Taking the median of those five results, the patch seems to have sped
things up by 0.3%.  At 80 clients:

resultswp.master.80:tps = 7540.388586 (including connections establishing)
resultswp.master.80:tps = 7502.803369 (including connections establishing)
resultswp.master.80:tps = 7469.338925 (including connections establishing)
resultswp.master.80:tps = 7505.377256 (including connections establishing)
resultswp.master.80:tps = 7509.402704 (including connections establishing)
resultswp.xloginsert-padding.80:tps = 7541.305973 (including
connections establishing)
resultswp.xloginsert-padding.80:tps = 7568.942936 (including
connections establishing)
resultswp.xloginsert-padding.80:tps = 7533.128618 (including
connections establishing)
resultswp.xloginsert-padding.80:tps = 7558.258839 (including
connections establishing)
resultswp.xloginsert-padding.80:tps = 7562.585635 (including
connections establishing)

Again taking the median of the five results, that's about an 0.7% speedup.

I also tried this over top of my flexlock patch.  I figured that the
benefit ought to be greater there, because with ProcArrayLock
contention reduced, WALInsertLock contention should be the whole
banana.  But:

resultswp.flexlock.32:tps = 11628.279679 (including connections establishing)
resultswp.flexlock.32:tps = 11556.254524 (including connections establishing)
resultswp.flexlock.32:tps = 11606.840931 (including connections establishing)
resultswp.flexlock-xloginsert-padding.32:tps = 11357.904459 (including
connections establishing)
resultswp.flexlock-xloginsert-padding.32:tps = 11226.538104 (including
connections establishing)
resultswp.flexlock-xloginsert-padding.32:tps = 11187.932415 (including
connections establishing)

That's about a 3.3% slowdown.

resultswp.flexlock.80:tps = 11560.508772 (including connections establishing)
resultswp.flexlock.80:tps = 11673.255674 (including connections establishing)
resultswp.flexlock.80:tps = 11597.229252 (including connections establishing)
resultswp.flexlock-xloginsert-padding.80:tps = 11092.549726 (including
connections establishing)
resultswp.flexlock-xloginsert-padding.80:tps = 11179.927207 (including
connections establishing)
resultswp.flexlock-xloginsert-padding.80:tps = 2.771332 (including
connections establishing)

That's even a little worse.

One possible explanation is that I don't believe that what you've done
here is actually sufficient to guarantee alignment on cache-line
boundary - ShmemAlloc() only guarantees MAXALIGN alignment unless the
allocation is at least BLCKSZ bytes.  So depending on exactly how much
shared memory gets allocated before XLogCtlData gets allocated, it's
possible that the change could actually end up splitting all of the
things you're trying to put on their own cache line across two cache
lines.  Might be worth fixing that and running this through its paces
again.

(I wonder if we shouldn't just align every shared memory allocation to
64 or 128 bytes.  It wouldn't waste much memory and it would make us
much more resistant to performance changes caused by minor
modifications to the shared memory layout.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] When do we lose column names?

2011-11-16 Thread Andrew Dunstan


Consider the following query:

select (x).key as k, (x).value as v
from (select each(hstore(q)) as x
  from (select oid, c.*
from pg_class c
where oid = 'parent'::regclass) q) t;

This gives results like this:

 k  |   v
-+
 f1  | 16410
 f2  | parent
 f3  | 2200
 f4  | 16412
 f5  | 0
 f6  | 10


Now add a limit clause to the innermost query:

select (x).key as k, (x).value as v
from (select each(hstore(q)) as x
  from (select oid, c.*
from pg_class c
where oid = 'parent'::regclass
limit ) q) t;

Now the result looks like this:

   k|   v
+
 oid| 16410
 relam  | 0
 relacl |
 relkind| r
 relname| parent
 reltype| 16412


What I'm having difficulty understanding is why the limit clause should
make any difference.

Is this a bug? If not, is it documented.

cheers

andrew


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] IDLE in transaction introspection

2011-11-16 Thread Scott Mead

On Tue, Nov 15, 2011 at 1:18 PM, Robert Treat r...@xzilla.net wrote:

 On Tue, Nov 15, 2011 at 12:00 PM, Greg Smith g...@2ndquadrant.com wrote:
  On 11/15/2011 09:44 AM, Scott Mead wrote:
 
  Fell off the map last week, here's v2 that:
   * RUNNING = active
   * all states from CAPS to lower case
 
 
  This looks like what I was hoping someone would add here now.  Patch
 looks
  good, only issue I noticed was a spaces instead of a tab goof where you
 set
  beentry_st_state at line 2419 in src/backend/postmaster/pgstat.c
 
  Missing pieces:
 
  -There is one regression test that uses pg_stat_activity that is broken
 now.
  -The documentation should list the new field and all of the states it
 might
  include.  That's a serious doc update from the minimal info available
 right
  now.
 
  I know this issue has been beat up already some, but let me summarize and
  extend that thinking a moment.  I see two equally valid schools of
 thought
  on how exactly to deal with introducing this change:
 
  -Add the new state field just as you've done it, but keep updating the
 query
  text anyway.  Do not rename current_query.  Declare the overloading of
  current_query as both a state and the query text to be deprecated in 9.3.
   This would keep existing tools working fine against 9.2 and give a clean
  transition period.
 
  -Forget about backward compatibility and just put all the breaking stuff
  we've been meaning to do in here.  If we're going to rename
 current_query to
  query--what Scott's patch does here--that will force all code using
  pg_stat_activity to be rewritten.  This seems like the perfect time to
 also
  change procpid to pid, finally blow away that wart.
 

 +1

+1 for me as well.

 Anybody else?



  I'll happily update all of the tools and samples I deal with to support
 this
  change.  Most of the ones I can think of will be simplified; they're
 already
  parsing query_text and extracting the implicit state.  Just operating on
 an
  explicit one instead will be simpler and more robust.
 

 lmk if you need help, we'll be doing this with some of our tools /
 projects anyway, it probably wouldn't hurt to coordinate.


 Robert Treat
 conjecture: xzilla.net
 consulting: omniti.com

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch to allow users to kill their own queries

2011-11-16 Thread Edward Muller

On Wed, Nov 16, 2011 at 12:06 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Edward Muller edw...@heroku.com wrote:

 Looking for comments ...

 https://gist.github.com/be937d3a7a5323c73b6e

 We'd like to get this, or something like it, into 9.2

 If you want it to be seriously considered, you should post the patch
 to this list, which makes it part of the permanent archives and
 indicates your willingness to place the code under the PostgreSQL
 license.

inline 


 http://wiki.postgresql.org/wiki/Submitting_a_Patch#Patch_submission

I think this was done right.



[snip]


 Any chance you could help review patches in the CF in progress, to
 help get others' time freed up sooner?:

 https://commitfest.postgresql.org/action/commitfest_view/inprogress

Well, I'm totally new to the code base, I don't write in *C* very
often (probably obvious from my patch). Maybe over time


 It's a good way to become familiar with the process, so that you can
 better move your own patch along.

 -Kevin



user_signal.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch to allow users to kill their own queries

2011-11-16 Thread Daniel Farina

On Wed, Nov 16, 2011 at 12:06 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Edward Muller edw...@heroku.com wrote:

 Looking for comments ...

 https://gist.github.com/be937d3a7a5323c73b6e

 We'd like to get this, or something like it, into 9.2

On Wed, Nov 16, 2011 at 12:06 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Edward Muller edw...@heroku.com wrote:

 Looking for comments ...

 https://gist.github.com/be937d3a7a5323c73b6e

 We'd like to get this, or something like it, into 9.2

As it would turn out, a patch for this has already been submitted:

http://archives.postgresql.org/pgsql-hackers/2011-10/msg1.php

There was some wrangling on whether it needs to be extended to be
useful, but for our purposes the formulation already posted already
captures vital value for us, and in that form appears to be fairly
uncontentious. I have moved it to the current commitfest, with a
comment linking to the 'please revive this patch' thread whereby a
second glance at what to do about this was conducted.  The link
follows:

https://commitfest.postgresql.org/action/patch_view?id=541

 If you want it to be seriously considered, you should post the patch
 to this list, which makes it part of the permanent archives and
 indicates your willingness to place the code under the PostgreSQL
 license.

Although technical mailing lists are not primarily a place of
reflection and sensitivity, I do think that wording addressed to a new
participant could have been kinder.  Perhaps, Unfortunately we cannot
accept or even read your patch because of licensing concerns, would
you please follow the following patch submission guidelines? link.

-- 
fdr

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] declarations of range-vs-element @ and @

2011-11-16 Thread Tom Lane

I wrote:
 Why do these use anynonarray rather than anyelement?  Given that we
 support ranges of arrays (there's even a regression test), this seems
 a bogus limitation.

After experimenting with changing that, I see why you did it: some of
the regression tests fail, eg,

  SELECT * FROM array_index_op_test WHERE i @ '{38,34,32,89}' ORDER BY seqno;
  ERROR:  operator is not unique: integer[] @ unknown

That is, if we have both anyarray @ anyarray and anyelement @ anyrange
operators, the parser is unable to decide which one is a better match to
integer[] @ unknown.  However, restricting @ to not work for ranges
over arrays is a pretty horrid fix for that, because there is simply not
any access to the lost functionality.  It'd be better IMO to fail here
and require the unknown literal to be cast explicitly than to do this.

But what surprises me about this example is that I'd have expected the
heuristic assume the unknown is of the same type as the other input
to resolve it.  Looking more closely, I see that we apply that heuristic
in such a way that it works only for exact operator matches, not for
matches requiring coercion (including polymorphic-type matches).  This
seems a bit weird.  I propose adding a step to func_select_candidate
that tries to resolve things that way, ie, if all the known-type inputs
have the same type, then try assuming that the unknown-type ones are of
that type, and see if that leads to a unique match.  There actually is a
comment in there that claims we do that, but the code it's attached to
is really doing something else that involves preferred types within
type categories...

Thoughts?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] [Review] Include detailed information about a row failing a CHECK constraint into the error message

2011-11-16 Thread Royce Ausburn

The patch applies cleanly to the current git master and is in context diff format.The patch fails the regression tests because it is outputting new DETAIL line which four of tests aren't expecting. The tests will need to be updated.Functionality:The patch works as advertised. An insert or update that fails a check condition indeed logs the row that has failed:test=#  create table test (test(#   a integer,test(#   b integer CHECK (b  2),test(#   c text,test(#   d integertest(#test(#test(#  );CREATE TABLEtest=#test=#  insert into test select 1, 2, 'Test', 4;ERROR: new row for relation "test" violates check constraint "test_b_check"DETAIL: Failing row: (1, 2, Test, 4).One comment I have on the output is that strings are not in quotes. It's a little jarring, but might not be that big a deal. A contrived case that is pretty confusing:test=#  insert into test select 1, 2, '3, 4', 4;ERROR: new row for relation "test" violates check constraint "test_b_check"DETAIL: Failing row: (1, 2, 3, 4, 4).A select inserting 4 columns seemingly results in a 5 column row ;)Another super minor thing, postgres doesn't seem to put periods at the end of log messages, yet this new detail line does.CodeI'm no C or postgres expert, but the code looks okay to me.Attached is a small script I used to test/demo the functionality.--Royce

testCheckConstraints.sql
Description: Binary data
On 11/11/2011, at 2:56 AM, Jan Kundrát wrote:On 11/10/11 16:05, Tom Lane wrote:I agree with Jan that this is probably useful; I'm pretty sure therehave been requests for it before. We just have to make sure that thelength of the message stays in bounds.One tip for keeping the length down: there is no value in repeatinginformation from the primary error message, such as the name of theconstraint.Thanks to your comments and suggestions, I appreciate the time of thereviewers.Attached is a second version of this patch which keeps the size of theoutput at 64 characters per column (which is an arbitrary value definedas a const int, which I hope matches your style). Longer values havetheir last three characters replaced by "...", so there's no way todistinguish them from a legitimate string that ends with just that.There's also no escaping of special-string values, similar to how theBuildIndexValueDescription operates.Cheers,Jan-- Trojita, a fast e-mail client -- http://trojita.flaska.net/context_in_check_constraints-v2.patch

Re: [HACKERS] Minor optimisation of XLogInsert()

2011-11-16 Thread Simon Riggs

On Wed, Nov 16, 2011 at 8:42 PM, Robert Haas robertmh...@gmail.com wrote:

 Taking the median of those five results, the patch seems to have sped
 things up by 0.3%.  At 80 clients:

Thanks for doing that. Even if we fix it as you suggest it seems to
indicate that the WALInsertLock contention is real rather than false
contention. Which lends some clues as to how to proceed.

 One possible explanation is that I don't believe that what you've done
 here is actually sufficient to guarantee alignment on cache-line
 boundary - ShmemAlloc() only guarantees MAXALIGN alignment unless the
 allocation is at least BLCKSZ bytes.  So depending on exactly how much
 shared memory gets allocated before XLogCtlData gets allocated, it's
 possible that the change could actually end up splitting all of the
 things you're trying to put on their own cache line across two cache
 lines.  Might be worth fixing that and running this through its paces
 again.

 (I wonder if we shouldn't just align every shared memory allocation to
 64 or 128 bytes.  It wouldn't waste much memory and it would make us
 much more resistant to performance changes caused by minor
 modifications to the shared memory layout.)

I'll look at that, thanks for the suggestion.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Are range_before and range_after commutator operators?

2011-11-16 Thread Tom Lane

I noticed that  and  are not marked as commutator operators,
though a naive view of their semantics suggests they should be.
However, I realized that there might be edge cases I wasn't thinking
about, so I went looking in the patch to try to confirm this.  And
I found neither a single line of documentation about it, nor a single
comment in that hairy little nest of unobvious tests that calls itself
range_cmp_bounds.  I am of the opinion that that routine not only
requires a comment, but very possibly a comment longer than the routine
itself.  What's more, if it's this complicated to code, surely it would
be a good idea for the user-facing documentation to explain exactly
what we think before/after mean?

In general, the level of commenting in the rangetypes code seems far short
of what I'd consider acceptable for Postgres code.  I plan to fix some
of that myself, but I do not wish to reverse-engineer what the heck
range_cmp_bounds thinks it's doing.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Configuration include directory

2011-11-16 Thread Greg Jaskiewicz


On 16 Nov 2011, at 04:53, Greg Smith wrote:
 
 -Called by specifying includedir directory.  No changes to the shipped 
 postgresql.conf yet.
 -Takes an input directory name
Very useful idea. 

What will happen if I specify:

includedir './'

Ie, what about potential cyclic dependency. 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Regression tests fail once XID counter exceeds 2 billion

2011-11-16 Thread Tom Lane

I wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 We need a function called transactionid_current() so a normal user can write

 select virtualtransaction
 from pg_locks
 where transactionid = transactionid_current()

 and have it just work.

 That would solve that one specific use-case.  The reason I suggested
 txid_from_xid is that it could also be used to compare XIDs seen in
 tuples to members of a txid_snapshot, which is not possible now.

BTW, a pgsql-general question just now made me realize that
txid_from_xid() could have another use-case too.  Right now, there are
no inequality comparisons on XIDs, which is necessary because XIDs in
themselves don't have a total order.  However, you could

ORDER BY txid_from_xid(xmin)

and it would work, ie, give you rows in their XID order.  This could be
useful for finding the latest-modified rows in a table, modulo the fact
that it would be ordering by transaction start time not commit time.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Configuration include directory

2011-11-16 Thread Andrew Dunstan

On Wed, November 16, 2011 6:45 pm, Greg Jaskiewicz wrote:

 On 16 Nov 2011, at 04:53, Greg Smith wrote:

 -Called by specifying includedir directory.  No changes to the
 shipped postgresql.conf yet.
 -Takes an input directory name
 Very useful idea.

 What will happen if I specify:

 includedir './'

 Ie, what about potential cyclic dependency.



I would vote for it only to handle plain files (possibly softlinked) in
the named directory.

cheers

andrew


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] ISN was: Core Extensions relocation

2011-11-16 Thread Tom Lane

Josh Berkus j...@agliodbs.com writes:
 On 11/15/11 7:40 PM, Tom Lane wrote:
 Peter Geoghegan pe...@2ndquadrant.com writes:
 If we can't put isn out of its misery, a sensible compromise would be
 to rip out the prefix enforcement feature so that, for example, ISBN13
 behaved exactly the same as EAN13.

 That might be a reasonable compromise.  Certainly the check-digit
 calculation is much more useful for validity checking than the prefix
 checking.

 Sounds good to me.  FWIW, I know that ISBN is being used for some
 library software, so a backwards-compatible fix would be very desirable.

How backwards-compatible do you mean?  Do you think we need to keep the
existing prefix-check logic as an option, or can we just drop it and be
done?

(I'm a bit concerned about the angle that the code has some smarts
currently about where to put hyphens in output.  If we rip that out
it could definitely break application code, whereas dropping a check
shouldn't.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Adding Node support in outfuncs.c and readfuncs.c

2011-11-16 Thread Tom Lane

Dimitri Fontaine dimi...@2ndquadrant.fr writes:
 Peter Eisentraut pete...@gmx.net writes:
 This is a massive amount of code that very few people in our community
 will use, and very few be able to maintain it, too.

It's not that massive, at least not as it stands, although I agree it
looks distressingly write-only.

 If you want to make
 a lasting contribution on this area, write a program that generates the
 node handling functionality automatically as part of the build process.

 Can emacs --batch be used there?  If not, apart from C and perl, what
 can I use?

You can *not* assume that emacs is available in any random build
environment; and not C either, because it might be a cross-compile.
It'd have to be Perl.

FWIW, even though I use emacs exclusively, I have little or no interest
in this approach myself.  I don't think the node functions are as
boilerplate as you think --- for one thing, how will you deal with
typedef'd field types?  (Assuming any unknown type name is a node type
is not right.)  And even ignoring that, there are always exceptions to
any general rule.

If Peter is seriously suggesting that construction of the backend/nodes
files could be automated entirely, I think he's nuts.  The amount of
work to construct a bulletproof tool (if it's possible at all) would
greatly outweigh the benefit.  What you've got here could be useful
to people who use emacs and understand they've got to hand-check the
results.  I'm not sure how much further it'd be useful to go.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minor optimisation of XLogInsert()

2011-11-16 Thread Robert Haas

On Wed, Nov 16, 2011 at 5:15 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On Wed, Nov 16, 2011 at 8:42 PM, Robert Haas robertmh...@gmail.com wrote:
 Taking the median of those five results, the patch seems to have sped
 things up by 0.3%.  At 80 clients:

 Thanks for doing that. Even if we fix it as you suggest it seems to
 indicate that the WALInsertLock contention is real rather than false
 contention. Which lends some clues as to how to proceed.

Sure thing.

My general impression from playing around with this over the last 6-7
months is that cache line sharing is just not that big a problem for
us.  In a case like WALInsertLock, I'm fairly certain that we're
actually putting processes to sleep on a regular basis - and the
overhead of a system call to go to sleep and another one to wake up
and the associated context switches dwarfs the cost of passing the
cache line around.  It's far more important to get rid of the sleeping
(which, sadly, is far harder than padding out the data structures).

There are some cases where the cache line really does seem to matter -
e.g. Pavan's PGPROC_MINIMAL patch delivers excellent results on
Itanium - but those seem to be fairly rate FWICT.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Configuration include directory

2011-11-16 Thread Tom Lane

Andrew Dunstan and...@dunslane.net writes:
 On Wed, November 16, 2011 6:45 pm, Greg Jaskiewicz wrote:
 What will happen if I specify:
 includedir './'

 I would vote for it only to handle plain files (possibly softlinked) in
 the named directory.

I think Greg's point is that that would lead to again reading
postgresql.conf, and then again processing the includedir directive,
lather rinse repeat till stack overflow.

Now one view of this is that we already expect postgresql.conf to only
be writable by responsible adults, so if a DBA breaks his database this
way he has nobody but himself to blame.  But still, if there's a simple
way to define that risk away, it wouldn't be a bad thing.

(Do we guard against recursive inclusion via plain old include?  If
not, maybe this isn't worth worrying about.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] When do we lose column names?

2011-11-16 Thread Tom Lane

Andrew Dunstan and...@dunslane.net writes:
 What I'm having difficulty understanding is why the limit clause should
 make any difference.

Without the LIMIT, the query gets flattened to something like this:

 Index Scan using pg_class_oid_index on pg_catalog.pg_class c
(cost=0.00..8.27 rows=1 width=202)
   Output: ROW(c.oid, c.relname, c.relnamespace, c.reltype, c.reloftype, 
c.relowner, c.relam, c.relfilenode, c.reltablespace, c.relpages, c.reltuples, 
c.relallvisible, c.reltoastrelid, c.reltoastidxid, c.relhasindex, 
c.relisshared, c.relpersistence, c.relkind, c.relnatts, c.relchecks, 
c.relhasoids, c.relhaspkey, c.relhasrules, c.relhastriggers, c.relhassubclass, 
c.relfrozenxid, c.relacl, c.reloptions)
   Index Cond: (c.oid = 53532::oid)

and the issue seems to be that in execution of a RowExpr, the
executor doesn't pay any attention to preserving the column names
in the generated tupledesc --- see the ExecTypeFromExprList call
in execQual.c.

We could certainly make it do that --- it wouldn't even be terribly
hard, since RowExpr already does store the column names.  The only
downside I can see is that this might lead to more transient rowtypes
being kept around in a backend, since RowExprs with distinct field
names would now lead to different blessed rowtypes.  But that doesn't
seem like a big deal.  It was just never apparent before that we should
care about field names in a tupledesc at execution time.

I'm disinclined to consider this a back-patchable bug fix; it seems
possible that somebody out there is depending on the current behavior.
But we could think about changing it in HEAD.

(wanders off to look at whether the only other caller of
ExecTypeFromExprList could be taught to provide useful field names...)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] ISN was: Core Extensions relocation

2011-11-16 Thread Josh Berkus


 (I'm a bit concerned about the angle that the code has some smarts
 currently about where to put hyphens in output.  If we rip that out
 it could definitely break application code, whereas dropping a check
 shouldn't.)

Right.  Do we need to dump the hyphen logic?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] ISN was: Core Extensions relocation

2011-11-16 Thread Peter Geoghegan

On 17 November 2011 02:32, Josh Berkus j...@agliodbs.com wrote:

 (I'm a bit concerned about the angle that the code has some smarts
 currently about where to put hyphens in output.  If we rip that out
 it could definitely break application code, whereas dropping a check
 shouldn't.)

 Right.  Do we need to dump the hyphen logic?

Only if you think that it's better to have something that covers many
cases but is basically broke. Perhaps it's different for code that is
already committed, but in the case of new submissions it tends to be
better to have something that is limited in a well-understood way
rather than less limited in a way that is unpredictable or difficult
to reason about.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] When do we lose column names?

2011-11-16 Thread Tom Lane

I wrote:
 and the issue seems to be that in execution of a RowExpr, the
 executor doesn't pay any attention to preserving the column names
 in the generated tupledesc --- see the ExecTypeFromExprList call
 in execQual.c. ...
 We could certainly make it do that --- it wouldn't even be terribly
 hard, since RowExpr already does store the column names. ...
 (wanders off to look at whether the only other caller of
 ExecTypeFromExprList could be taught to provide useful field names...)

PFC, a patch that does this.  This seems to fix Andrew's issue with
respect to the RowExpr case.  It's not really ideal with respect to
the ValuesScan case, because what you get seems to always be the
hard-wired columnN names for VALUES columns, even if you try to
override that with an alias:

regression=# select each(hstore(q)) as x
  from (values (1,2,3),(4,5,6) limit 2) as q(x,y,z);
  x  
-
 (column1,1)
 (column2,2)
 (column3,3)
 (column1,4)
 (column2,5)
 (column3,6)
(6 rows)

I think this happens because VALUES in a FROM item is treated like a
sub-select, and the aliases are getting applied at the wrong level.
Don't know if that's worth trying to fix, or how hard it would be.
Curiously, it works just fine if the VALUES can be folded:

regression=# select each(hstore(q)) as x
  from (values (1,2,3),(4,5,6)) as q(x,y,z);
   x   
---
 (x,1)
 (y,2)
 (z,3)
 (x,4)
 (y,5)
 (z,6)
(6 rows)


regards, tom lane

diff --git a/src/backend/executor/execQual.c b/src/backend/executor/execQual.c
index 887e5ce82a0b9011627450fdd1046d5529beb110..8f0b5979ba589a1e3aa10405671d4c0793361c5e 100644
*** a/src/backend/executor/execQual.c
--- b/src/backend/executor/execQual.c
*** ExecInitExpr(Expr *node, PlanState *pare
*** 4627,4633 
  if (rowexpr-row_typeid == RECORDOID)
  {
  	/* generic record, use runtime type assignment */
! 	rstate-tupdesc = ExecTypeFromExprList(rowexpr-args);
  	BlessTupleDesc(rstate-tupdesc);
  	/* we won't need to redo this at runtime */
  }
--- 4627,4634 
  if (rowexpr-row_typeid == RECORDOID)
  {
  	/* generic record, use runtime type assignment */
! 	rstate-tupdesc = ExecTypeFromExprList(rowexpr-args,
! 		   rowexpr-colnames);
  	BlessTupleDesc(rstate-tupdesc);
  	/* we won't need to redo this at runtime */
  }
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index 3f44ef0186ace8434ceaaf2ef37c14ca0fec632d..3ed90da7a52e2d0e982cfe0ef29022ac1ce1a4a5 100644
*** a/src/backend/executor/execTuples.c
--- b/src/backend/executor/execTuples.c
*** ExecTypeFromTLInternal(List *targetList,
*** 954,980 
  /*
   * ExecTypeFromExprList - build a tuple descriptor from a list of Exprs
   *
!  * Here we must make up an arbitrary set of field names.
   */
  TupleDesc
! ExecTypeFromExprList(List *exprList)
  {
  	TupleDesc	typeInfo;
! 	ListCell   *l;
  	int			cur_resno = 1;
! 	char		fldname[NAMEDATALEN];
  
  	typeInfo = CreateTemplateTupleDesc(list_length(exprList), false);
  
! 	foreach(l, exprList)
  	{
! 		Node	   *e = lfirst(l);
! 
! 		sprintf(fldname, f%d, cur_resno);
  
  		TupleDescInitEntry(typeInfo,
  		   cur_resno,
! 		   fldname,
  		   exprType(e),
  		   exprTypmod(e),
  		   0);
--- 954,981 
  /*
   * ExecTypeFromExprList - build a tuple descriptor from a list of Exprs
   *
!  * Caller must also supply a list of field names (String nodes).
   */
  TupleDesc
! ExecTypeFromExprList(List *exprList, List *namesList)
  {
  	TupleDesc	typeInfo;
! 	ListCell   *le;
! 	ListCell   *ln;
  	int			cur_resno = 1;
! 
! 	Assert(list_length(exprList) == list_length(namesList));
  
  	typeInfo = CreateTemplateTupleDesc(list_length(exprList), false);
  
! 	forboth(le, exprList, ln, namesList)
  	{
! 		Node	   *e = lfirst(le);
! 		char	   *n = strVal(lfirst(ln));
  
  		TupleDescInitEntry(typeInfo,
  		   cur_resno,
! 		   n,
  		   exprType(e),
  		   exprTypmod(e),
  		   0);
diff --git a/src/backend/executor/nodeValuesscan.c b/src/backend/executor/nodeValuesscan.c
index d5260e40b32abe5dc5ff437cda8e037f6cdbe12a..0089e501fa2e43c10a1d530110d1fa5f38144cd9 100644
*** a/src/backend/executor/nodeValuesscan.c
--- b/src/backend/executor/nodeValuesscan.c
***
*** 25,30 
--- 25,31 
  
  #include executor/executor.h
  #include executor/nodeValuesscan.h
+ #include parser/parsetree.h
  
  
  static TupleTableSlot *ValuesNext(ValuesScanState *node);
*** ValuesScanState *
*** 188,193 
--- 189,196 
  ExecInitValuesScan(ValuesScan *node, EState *estate, int eflags)
  {
  	ValuesScanState *scanstate;
+ 	RangeTblEntry *rte = rt_fetch(node-scan.scanrelid,
+   estate-es_range_table);
  	TupleDesc	tupdesc;
  	ListCell   *vtl;
  	int			i;
*** ExecInitValuesScan(ValuesScan *node, ESt
*** 239,245 
  	/*
  	 * get info about values list
  	 */
! 	tupdesc =

Re: [HACKERS] When do we lose column names?

2011-11-16 Thread Tom Lane

I wrote:
 PFC, a patch that does this.

Upon further review, this patch would need some more work even for the
RowExpr case, because there are several places that build RowExprs
without bothering to build a valid colnames list.  It's clearly soluble
if anyone cares to put in the work, but I'm not personally excited
enough to pursue it ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] ISN was: Core Extensions relocation

2011-11-16 Thread Tom Lane

Peter Geoghegan pe...@2ndquadrant.com writes:
 On 17 November 2011 02:32, Josh Berkus j...@agliodbs.com wrote:
 Right.  Do we need to dump the hyphen logic?

 Only if you think that it's better to have something that covers many
 cases but is basically broke. Perhaps it's different for code that is
 already committed, but in the case of new submissions it tends to be
 better to have something that is limited in a well-understood way
 rather than less limited in a way that is unpredictable or difficult
 to reason about.

Well, as was stated upthread, we might have bounced this module in toto
if it were submitted today.  But contrib/isn has been there since 2006,
and its predecessor contrib/isbn_issn was there since 1998, and both of
those submissions came from (different) people who needed the
functionality bad enough to write it.  It's not reasonable to suppose
that nobody is using it today.  Ergo, we can't just summarily break
backwards compatibility on the grounds that we don't like the design.
Heck, we don't even have a field bug report that the design limitation
is causing any real problems for real users ... so IMO, the claims that
this is dangerously broken are a bit overblown.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minor optimisation of XLogInsert()

2011-11-16 Thread Tom Lane

Robert Haas robertmh...@gmail.com writes:
 (I wonder if we shouldn't just align every shared memory allocation to
 64 or 128 bytes.  It wouldn't waste much memory and it would make us
 much more resistant to performance changes caused by minor
 modifications to the shared memory layout.)

I could get behind this idea if we had a reasonably clear idea of the
hardware's cache line width, but AFAIK there is no portable way to
identify that.  (This is a pretty fatal objection to Simon's original
patch as well...)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] FlexLocks

2011-11-16 Thread Pavan Deolasee

On Tue, Nov 15, 2011 at 7:20 PM, Robert Haas robertmh...@gmail.com wrote:
 The lower layer I called FlexLocks,
 and it's designed to allow a variety of locking implementations to be
 built on top of it and reuse as much of the basic infrastructure as I
 could figure out how to make reusable without hurting performance too
 much.  LWLocks become the anchor client of the FlexLock system; in
 essence, most of flexlock.c is code that was removed from lwlock.c.
 The second patch, procarraylock.c, uses that infrastructure to define
 a new type of FlexLock specifically for ProcArrayLock.  It basically
 works like a regular LWLock, except that it has a special operation to
 optimize ProcArrayEndTransaction().  In the uncontended case, instead
 of acquiring and releasing the lock, it just grabs the lock, observes
 that there is no contention, clears the critical PGPROC fields (which
 isn't noticeably slower than updating the state of the lock would be)
 and releases the spin lock.

(Robert, we already discussed this a bit privately, so apologies for
duplicating this here)

Another idea is to have some sort of shared work queue mechanism which
might turn out to be more manageable and extendable. What I am
thinking about is having a {Request, Response} kind of structure per
backend in shared memory. An obvious place to hold them is in PGPROC
for every backend. We the have a new API like LWLockExecute(lock,
mode, ReqRes). The caller first initializes the ReqRes structure with
the work it needs get done and then calls LWLockExecute with that.
IOW, the code flow would look like this:

Initialize the Req/Res structure with request type and input data
LWLockExecute(lock, mode, ReqRes)
Consume Response and proceed further

If the lock is available in the desired mode, LWLockExecute() will
internally finish the work and return immediately. If the lock is
contended, the process would sleep. When current holder of the lock
finishes its work and calls LWLockRelease() to release the lock, it
would not only find the processes to wake up, but would also go
through their pending work items and complete them before waking them
up. The Response area will be populated with the result.

I think this general mechanism will be useful for many users of
LWLock, especially those who do very trivial updates/reads from the
shared area, but still need synchronization. One example that Robert
has already found helping a lot if ProcArrayEndTransaction. Also, even
though both shared and exclusive waiters can use this mechanism, it
may make more sense to the exclusive waiters because of the
exclusivity. For sake of simplicity, we can choose to force a
semantics that when LWLockExecute returns, the work is guaranteed to
be done, either by self or some other backend. That will keep the code
simpler for users of this new API.

Thanks,
Pavan
-- 
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minor optimisation of XLogInsert()

2011-11-16 Thread Robert Haas

On Wed, Nov 16, 2011 at 11:11 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 (I wonder if we shouldn't just align every shared memory allocation to
 64 or 128 bytes.  It wouldn't waste much memory and it would make us
 much more resistant to performance changes caused by minor
 modifications to the shared memory layout.)

 I could get behind this idea if we had a reasonably clear idea of the
 hardware's cache line width, but AFAIK there is no portable way to
 identify that.  (This is a pretty fatal objection to Simon's original
 patch as well...)

I don't think it's very important to know the exact size.  As far as I
can tell, x64 is 64 bytes and Itanium and Power are 128 bytes.  If you
optimize for those, you'll also handle any smaller size (that's a
power of two, without which a lot of things we do are wrongheaded)
without wasting much memory.  If you run into hardware with a giant
256-byte or large cache line, you might have some sharing, but you
can't win 'em all.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Support for foreign keys with arrays

2011-11-16 Thread Noah Misch

Hi Gabriele,

On Fri, Nov 04, 2011 at 01:48:02PM +0100, Gabriele Bartolini wrote:
 CREATE TABLE pt (
   id INTEGER PRIMARY KEY,
   ...
 );

 CREATE TABLE ft (
   id SERIAL PRIMARY KEY,
   pids INTEGER[] REFERENCES pt,
   ...
 );

This seems useful.

I'm assuming the SQL spec says nothing about a feature like this?

 This patch is for discussion and has been built against HEAD.
 It compiles and passes all regressions tests (including specific ones -  
 see the src/test/regress/sql/foreign_key.sql file).
 Empty arrays, multi-dimensional arrays, duplicate elements and NULL  
 values are allowed.

With this patch, RI_Initial_Check does not detect a violation in an array that
contains at least one conforming element:

BEGIN;
CREATE TABLE parent (c int PRIMARY KEY);
CREATE TABLE child  (c int[]);
INSERT INTO parent VALUES (1);
INSERT INTO child VALUES ('{3,1,2}');
ALTER TABLE child ADD FOREIGN KEY (c) REFERENCES parent; -- should error
INSERT INTO child VALUES ('{3,1,2}'); -- does error, as expected
ROLLBACK;

The error message DETAIL on constraint violation would benefit from
array-FK-specific language.  Example of current message:

ERROR:  insert or update on table child violates foreign key constraint 
child_c_fkey
DETAIL:  Key (c)=({3,1,2}) is not present in table parent.


The patch is missing a change to the code that does FK=FK checks when a user
updates the FK side:

\set VERBOSITY verbose
BEGIN;
CREATE TABLE parent (c int PRIMARY KEY);
CREATE TABLE child  (c int[] REFERENCES parent);
INSERT INTO parent VALUES (1);
INSERT INTO child VALUES ('{1,1}');
COMMIT;
-- ERROR:  XX000: no conversion function from integer[] to integer
-- LOCATION:  ri_HashCompareOp, ri_triggers.c:4097
UPDATE child SET c = '{1,1}';
DROP TABLE parent, child;
COMMIT;

Please audit each ri_triggers.c entry point for further problems like this.

 We had to enforce some limitations, due to the lack (yet) of a clear and  
 universally accepted behaviour and strategy.
 For example, consider the ON DELETE action on the above tables: in case  
 of delete of a record in the 'pt' table, should we remove the whole row  
 or just the values from the array?
 We hope we can start a discussion from here.

Removing values from the array seems best to me.  There's no doubt about what
ON UPDATE CASCADE should do, and having ON DELETE CASCADE excise individual
array elements is consistent with that.  It's less clear for SET NULL, but I'd
continue with a per-element treatment.  I'd continue to forbid SET DEFAULT.

However, Jeff Davis did expect ON DELETE CASCADE to remove entire rows:
http://archives.postgresql.org/message-id/1288119207.15279.24.camel@jdavis-ux.asterdata.local
So, perhaps the behavior needs to be user-selectable.

 Current limitations:

 * Only arrays of the same type as the primary key in the referenced  
 table are supported

This is understandable for a WIP, but the final patch will need to use our
existing, looser foreign key type match requirement.

 * multi-column foreign keys are not supported (only single column)

Any particular reason for this?

 *** a/doc/src/sgml/ddl.sgml
 --- b/doc/src/sgml/ddl.sgml
 ***
 *** 764,769  CREATE TABLE order_items (
 --- 764,796 
   the last table.
  /para
   
 +para
 + Another option you have with foreign keys is to use a referencing column
 + which is an array of elements with the same type as the referenced 
 column
 + in the related table. This feature, also known as firsttermforeign 
 key arrays/firstterm,
 + is described in the following example:

Please wrap your documentation paragraphs.

 *** a/src/backend/commands/tablecmds.c
 --- b/src/backend/commands/tablecmds.c
 ***
 *** 5705,5710  ATAddForeignKeyConstraint(AlteredTableInfo *tab, Relation 
 rel,
 --- 5705,5735 
   Oid ffeqop;
   int16   eqstrategy;
   
 + /* Check if foreign key is an array of primary key types */
 + const bool  is_foreign_key_array = (fktype == 
 get_array_type (pktype));

We don't declare non-pointer, local variables const.  Also, [not positive on
this one] when an initial assignment requires a comment, declare the variable
with no assignment and no comment.  Then, assign it later with the comment.
This keeps the per-block declarations packed together.


This test wrongly rejects FK types that are domains over the array type:

BEGIN;
CREATE TABLE parent (c int PRIMARY KEY);
CREATE DOMAIN intarrdom AS int[];
CREATE TABLE child  (c intarrdom REFERENCES parent);
ROLLBACK;

 + 
 + /* Enforce foreign key array restrictions */
 + if (is_foreign_key_array)
 + {
 + /*
 +  * Foreign key array must not be part of a multi-column 
 foreign key
 +  */
 + if (is_foreign_key_array  numpks  1)
 + ereport(ERROR,
 +

Re: [HACKERS] FlexLocks

2011-11-16 Thread Robert Haas

On Wed, Nov 16, 2011 at 11:16 PM, Pavan Deolasee
pavan.deola...@gmail.com wrote:
 On Tue, Nov 15, 2011 at 7:20 PM, Robert Haas robertmh...@gmail.com wrote:
 The lower layer I called FlexLocks,
 and it's designed to allow a variety of locking implementations to be
 built on top of it and reuse as much of the basic infrastructure as I
 could figure out how to make reusable without hurting performance too
 much.  LWLocks become the anchor client of the FlexLock system; in
 essence, most of flexlock.c is code that was removed from lwlock.c.
 The second patch, procarraylock.c, uses that infrastructure to define
 a new type of FlexLock specifically for ProcArrayLock.  It basically
 works like a regular LWLock, except that it has a special operation to
 optimize ProcArrayEndTransaction().  In the uncontended case, instead
 of acquiring and releasing the lock, it just grabs the lock, observes
 that there is no contention, clears the critical PGPROC fields (which
 isn't noticeably slower than updating the state of the lock would be)
 and releases the spin lock.

 (Robert, we already discussed this a bit privately, so apologies for
 duplicating this here)

 Another idea is to have some sort of shared work queue mechanism which
 might turn out to be more manageable and extendable. What I am
 thinking about is having a {Request, Response} kind of structure per
 backend in shared memory. An obvious place to hold them is in PGPROC
 for every backend. We the have a new API like LWLockExecute(lock,
 mode, ReqRes). The caller first initializes the ReqRes structure with
 the work it needs get done and then calls LWLockExecute with that.
 IOW, the code flow would look like this:

 Initialize the Req/Res structure with request type and input data
 LWLockExecute(lock, mode, ReqRes)
 Consume Response and proceed further

 If the lock is available in the desired mode, LWLockExecute() will
 internally finish the work and return immediately. If the lock is
 contended, the process would sleep. When current holder of the lock
 finishes its work and calls LWLockRelease() to release the lock, it
 would not only find the processes to wake up, but would also go
 through their pending work items and complete them before waking them
 up. The Response area will be populated with the result.

 I think this general mechanism will be useful for many users of
 LWLock, especially those who do very trivial updates/reads from the
 shared area, but still need synchronization. One example that Robert
 has already found helping a lot if ProcArrayEndTransaction. Also, even
 though both shared and exclusive waiters can use this mechanism, it
 may make more sense to the exclusive waiters because of the
 exclusivity. For sake of simplicity, we can choose to force a
 semantics that when LWLockExecute returns, the work is guaranteed to
 be done, either by self or some other backend. That will keep the code
 simpler for users of this new API.

I am not convinced that that's a better API.  I mean, consider
something like this:

/*
 * OK, let's do it.  First let other backends know I'm in ANALYZE.
 */
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
MyProc-vacuumFlags |= PROC_IN_ANALYZE;
LWLockRelease(ProcArrayLock);

I'm not sure exactly how you'd proposed to rewrite that, but I think
it's almost guaranteed to be more than three lines of code.  Also, you
can't assume that the work can be done equally well by any backend.
In this case it could, because the PGPROC structures are all in shared
memory, but that won't work for something like GetSnapshotData(),
which needs to copy a nontrivial amount of data into backend-local
memory.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Support for foreign keys with arrays

2011-11-16 Thread Pavel Stehule

Hello

2011/11/17 Noah Misch n...@leadboat.com:
 Hi Gabriele,

 On Fri, Nov 04, 2011 at 01:48:02PM +0100, Gabriele Bartolini wrote:
 CREATE TABLE pt (
   id INTEGER PRIMARY KEY,
   ...
 );

 CREATE TABLE ft (
   id SERIAL PRIMARY KEY,
   pids INTEGER[] REFERENCES pt,
   ...
 );

 This seems useful.


will be supported situation

CREATE TABLE main(
  id int[] PRIMARY KEY,
  ...
);

CREATE TABLE child(
  main_id int[] REFERENCES main(id),

??

Regards

Pavel Stehule

 I'm assuming the SQL spec says nothing about a feature like this?

 This patch is for discussion and has been built against HEAD.
 It compiles and passes all regressions tests (including specific ones -
 see the src/test/regress/sql/foreign_key.sql file).
 Empty arrays, multi-dimensional arrays, duplicate elements and NULL
 values are allowed.

 With this patch, RI_Initial_Check does not detect a violation in an array that
 contains at least one conforming element:

 BEGIN;
 CREATE TABLE parent (c int PRIMARY KEY);
 CREATE TABLE child  (c int[]);
 INSERT INTO parent VALUES (1);
 INSERT INTO child VALUES ('{3,1,2}');
 ALTER TABLE child ADD FOREIGN KEY (c) REFERENCES parent; -- should error
 INSERT INTO child VALUES ('{3,1,2}'); -- does error, as expected
 ROLLBACK;

 The error message DETAIL on constraint violation would benefit from
 array-FK-specific language.  Example of current message:

 ERROR:  insert or update on table child violates foreign key constraint 
 child_c_fkey
 DETAIL:  Key (c)=({3,1,2}) is not present in table parent.


 The patch is missing a change to the code that does FK=FK checks when a user
 updates the FK side:

 \set VERBOSITY verbose
 BEGIN;
 CREATE TABLE parent (c int PRIMARY KEY);
 CREATE TABLE child  (c int[] REFERENCES parent);
 INSERT INTO parent VALUES (1);
 INSERT INTO child VALUES ('{1,1}');
 COMMIT;
 -- ERROR:  XX000: no conversion function from integer[] to integer
 -- LOCATION:  ri_HashCompareOp, ri_triggers.c:4097
 UPDATE child SET c = '{1,1}';
 DROP TABLE parent, child;
 COMMIT;

 Please audit each ri_triggers.c entry point for further problems like this.

 We had to enforce some limitations, due to the lack (yet) of a clear and
 universally accepted behaviour and strategy.
 For example, consider the ON DELETE action on the above tables: in case
 of delete of a record in the 'pt' table, should we remove the whole row
 or just the values from the array?
 We hope we can start a discussion from here.

 Removing values from the array seems best to me.  There's no doubt about what
 ON UPDATE CASCADE should do, and having ON DELETE CASCADE excise individual
 array elements is consistent with that.  It's less clear for SET NULL, but I'd
 continue with a per-element treatment.  I'd continue to forbid SET DEFAULT.

 However, Jeff Davis did expect ON DELETE CASCADE to remove entire rows:
 http://archives.postgresql.org/message-id/1288119207.15279.24.camel@jdavis-ux.asterdata.local
 So, perhaps the behavior needs to be user-selectable.

 Current limitations:

 * Only arrays of the same type as the primary key in the referenced
 table are supported

 This is understandable for a WIP, but the final patch will need to use our
 existing, looser foreign key type match requirement.

 * multi-column foreign keys are not supported (only single column)

 Any particular reason for this?

 *** a/doc/src/sgml/ddl.sgml
 --- b/doc/src/sgml/ddl.sgml
 ***
 *** 764,769  CREATE TABLE order_items (
 --- 764,796 
       the last table.
      /para

 +    para
 +     Another option you have with foreign keys is to use a referencing 
 column
 +     which is an array of elements with the same type as the referenced 
 column
 +     in the related table. This feature, also known as firsttermforeign 
 key arrays/firstterm,
 +     is described in the following example:

 Please wrap your documentation paragraphs.

 *** a/src/backend/commands/tablecmds.c
 --- b/src/backend/commands/tablecmds.c
 ***
 *** 5705,5710  ATAddForeignKeyConstraint(AlteredTableInfo *tab, Relation 
 rel,
 --- 5705,5735 
               Oid                     ffeqop;
               int16           eqstrategy;

 +             /* Check if foreign key is an array of primary key types */
 +             const bool              is_foreign_key_array = (fktype == 
 get_array_type (pktype));

 We don't declare non-pointer, local variables const.  Also, [not positive on
 this one] when an initial assignment requires a comment, declare the variable
 with no assignment and no comment.  Then, assign it later with the comment.
 This keeps the per-block declarations packed together.


 This test wrongly rejects FK types that are domains over the array type:

 BEGIN;
 CREATE TABLE parent (c int PRIMARY KEY);
 CREATE DOMAIN intarrdom AS int[];
 CREATE TABLE child  (c intarrdom REFERENCES parent);
 ROLLBACK;

 +
 +             /* Enforce foreign key array restrictions */
 +             if (is_foreign_key_array)
 +             {
 +

Re: [HACKERS] FlexLocks

2011-11-16 Thread Pavan Deolasee

On Thu, Nov 17, 2011 at 10:01 AM, Robert Haas robertmh...@gmail.com wrote:


 I am not convinced that that's a better API.  I mean, consider
 something like this:

    /*
     * OK, let's do it.  First let other backends know I'm in ANALYZE.
     */
    LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
    MyProc-vacuumFlags |= PROC_IN_ANALYZE;
    LWLockRelease(ProcArrayLock);

 I'm not sure exactly how you'd proposed to rewrite that, but I think
 it's almost guaranteed to be more than three lines of code.

I would guess the ReqRes will look something like this where
ReqResRequest/Response would probably be union of all various requests
and responses, one for each type of request:

struct ReqRes {
  ReqResRequestType  reqtype;
  ReqResRequest req;
  ReqResResponse  res;
}

The code above can be rewritten as:

reqRes.reqtype = RR_PROC_SET_VACUUMFLAGS;
reqRes.req.set_vacuumflags.flags =  PROC_IN_ANALYZE;
LWLockExecute(ProcArrayLock, LW_EXCLUSIVE, reqRes);

I mean, I agree it doesn't look exactly elegant and the number of
requests types and their handling may go up a lot, but we need to do
this only for those heavily contended locks. Other callers can
continue with the current code style. But with this general
infrastructure, there will be still be a way to do this.

  Also, you
 can't assume that the work can be done equally well by any backend.
 In this case it could, because the PGPROC structures are all in shared
 memory, but that won't work for something like GetSnapshotData(),
 which needs to copy a nontrivial amount of data into backend-local
 memory.

Yeah, I am not suggesting we should do (even though I think it should
be possible with appropriate input/output data) this everywhere. But
places where this can done, like end-transaction stuff, the
infrastructure might be quite useful.

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] ISN was: Core Extensions relocation

2011-11-16 Thread Josh Berkus

On 11/16/11 7:54 PM, Tom Lane wrote:
 Heck, we don't even have a field bug report that the design limitation
 is causing any real problems for real users ... so IMO, the claims that
 this is dangerously broken are a bit overblown.

This is why I mentioned clients using ISBN in production.  I have yet to
see any actual breakage in the field.  Granted, both clients are
US-only.   I don't believe either of these clients is depending on the
prefix-check, and that's the sort of thing we could announce in the
release notes.

I do get the feeling that Peter got burned by ISN, given his vehemence
in erasing it from the face of the earth.  So that's one bug report.  ;-)

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Support for foreign keys with arrays

2011-11-16 Thread Tom Lane

Noah Misch n...@leadboat.com writes:
 On Fri, Nov 04, 2011 at 01:48:02PM +0100, Gabriele Bartolini wrote:
 CREATE TABLE pt (
 id INTEGER PRIMARY KEY,
 
 CREATE TABLE ft (
 id SERIAL PRIMARY KEY,
 pids INTEGER[] REFERENCES pt,

 I'm assuming the SQL spec says nothing about a feature like this?

I'm pretty certain that the SQL spec flat out forbids this.

The least we could do is invent some non-spec syntax that makes the
intention clear, rather than having the system assume that an error case
was intended to mean something else.  Maybe

pids INTEGER[] ARRAY REFERENCES pt,

or something like that.  (ARRAY is a fully reserved word already,
so I think this syntax should work, but I've not tried it.)

BTW, has anyone thought through whether this is a sane idea at all?
It seems to me to be full of cases that will require rather arbitrary
decisions, like whether ON DELETE CASCADE should involve deleting the
whole row or just one array element.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Support for foreign keys with arrays

2011-11-16 Thread Josh Berkus


 BTW, has anyone thought through whether this is a sane idea at all?
 It seems to me to be full of cases that will require rather arbitrary
 decisions, like whether ON DELETE CASCADE should involve deleting the
 whole row or just one array element.

One array element, presumably.

Does the patch implement CASCADE?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Support for foreign keys with arrays

2011-11-16 Thread Pavel Stehule

2011/11/17 Tom Lane t...@sss.pgh.pa.us:
 Noah Misch n...@leadboat.com writes:
 On Fri, Nov 04, 2011 at 01:48:02PM +0100, Gabriele Bartolini wrote:
 CREATE TABLE pt (
 id INTEGER PRIMARY KEY,

 CREATE TABLE ft (
 id SERIAL PRIMARY KEY,
 pids INTEGER[] REFERENCES pt,

 I'm assuming the SQL spec says nothing about a feature like this?

 I'm pretty certain that the SQL spec flat out forbids this.

 The least we could do is invent some non-spec syntax that makes the
 intention clear, rather than having the system assume that an error case
 was intended to mean something else.  Maybe

        pids INTEGER[] ARRAY REFERENCES pt,

 or something like that.  (ARRAY is a fully reserved word already,
 so I think this syntax should work, but I've not tried it.)

+1

Regards

Pavel Stehule


 BTW, has anyone thought through whether this is a sane idea at all?
 It seems to me to be full of cases that will require rather arbitrary
 decisions, like whether ON DELETE CASCADE should involve deleting the
 whole row or just one array element.

                        regards, tom lane

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Support for foreign keys with arrays

2011-11-16 Thread Josh Berkus

Folks,

BTW, I don't want to block this patch.  However, it occurs to me that a
more generalized FK based on non-equality conditions (i.e. expressions)
would be nice if it were possible.  Then we could have FKs from all
kinds of complex structures.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Support for foreign keys with arrays

2011-11-16 Thread Jaime Casanova

On Wed, Nov 16, 2011 at 11:28 PM, Noah Misch n...@leadboat.com wrote:

 Removing values from the array seems best to me.  There's no doubt about what
 ON UPDATE CASCADE should do, and having ON DELETE CASCADE excise individual
 array elements is consistent with that.  It's less clear for SET NULL, but I'd
 continue with a per-element treatment.  I'd continue to forbid SET DEFAULT.

 However, Jeff Davis did expect ON DELETE CASCADE to remove entire rows:
 http://archives.postgresql.org/message-id/1288119207.15279.24.camel@jdavis-ux.asterdata.local
 So, perhaps the behavior needs to be user-selectable.


i will agree with Jeff on this...

i mean, on the normal case it will delete the row. no?

the docs says about the CASCADE action

CASCADE
Delete any rows referencing the deleted row, or update the value of
the referencing column to the new value of the referenced column,
respectively.


so, that is what i will expect

-- 
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Support for foreign keys with arrays

2011-11-16 Thread Tom Lane

Josh Berkus j...@agliodbs.com writes:
 BTW, has anyone thought through whether this is a sane idea at all?
 It seems to me to be full of cases that will require rather arbitrary
 decisions, like whether ON DELETE CASCADE should involve deleting the
 whole row or just one array element.

 One array element, presumably.

Um, why?  One reasonable interpretation of an array reference is that
the row depends on *all* of the referenced pkeys.  Also, if you do
delete one array element at a time, what do you do when the array
becomes empty --- delete the row, or not, and in each case what's your
semantic justification for that choice?

In short, presumably doesn't cut it here.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

80 matches

Mail list logo