Re: [HACKERS] Has anyone tried out the PL/pgSQL debugger?

2007-09-05 Thread Dave Page
John DeSoi wrote:
 In some simple tests it seems to work OK with pgAdmin (1.8b3) on OS X.
 There appears to be a pgAdmin bug when you start a debug session on a
 function that has no parameters:
 
 ERROR: syntax error at or near ")"
 LINE 1: SELECT * FROM myschema.myfunction)
                                          ^

That's odd - I cannot reproduce that on OS X using beta 4 (which has no
important changes in the debugger over beta 3).

Can you provide a simple test case?

Regards, Dave



Re: [HACKERS] Per-function GUC settings: trickier than it looked

2007-09-05 Thread tomas

On Wed, Sep 05, 2007 at 02:13:57PM +1000, Brendan Jurd wrote:
 On 9/5/07, Tom Lane [EMAIL PROTECTED] wrote:
  Brendan Jurd [EMAIL PROTECTED] writes:
   Am I on the right page?
 
  Got it in one, I believe.
 
 In that case, +1 for your proposed changes.
 
 At first, like Florian, I found the idea of a SET LOCAL ever
 persisting beyond a function astonishing, but that's because I was
 approaching the term LOCAL from a programming frame of mind, not an
 SQL one [...]

As an unqualified POV: seeing that this got at least two people confused
-- wouldn't it make sense to be more verbose and call the thing SET
TRANSACTION LOCAL (not just TRANSACTION, which is ambiguous, as we have
already seen)? Maybe even SET LOCAL TO TRANSACTION (which gives at least
some room for possible extensibility).

I know too little about the parser to have even an idea whether this
would be feasible at all.

Regards
-- tomás




Re: [HACKERS] Has anyone tried out the PL/pgSQL debugger?

2007-09-05 Thread John DeSoi

Hi Dave,

On Sep 5, 2007, at 3:54 AM, Dave Page wrote:

That's odd - I cannot reproduce that on OS X using beta 4 (which has no
important changes in the debugger over beta 3).

Can you provide a simple test case?


I'll try to come up with a simple test case and send it sometime this  
evening. Possible hint: the function had no IN parameters, but many  
OUT parameters.


John




John DeSoi, Ph.D.
http://pgedit.com/
Power Tools for PostgreSQL




Re: [HACKERS] HEAD build troubles, buildfarm misconfigurations

2007-09-05 Thread Robert Treat
On Wednesday 05 September 2007 00:06, Tom Lane wrote:
 Robert Treat [EMAIL PROTECTED] writes:
  I get the following error during make (configure runs fine)
 
 /usr/include/netdb.h:560: error: syntax error before ‘[’ token

 Which line of netdb.h is that exactly?  I'm thinking we've redefined
 some name as a macro that's breaking the include file, but I don't see
 which name exactly ...


Sorry, it was the line "int __unused[5];"

 BTW, on re-reading that, it seems a tad surprising to get an error right
 there --- if postgres_fe.h or anything it includes were broken, then the
 build should have failed earlier.  Is the failure reproducible in
 repeated tries?


Yes.  Just for giggles I played with the configure flags as well, but it 
always errors in that spot.  

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL



Re: [HACKERS] Has anyone tried out the PL/pgSQL debugger?

2007-09-05 Thread korry.douglas


Is there any documentation that describes how to use the SQL 
functions? Some are obvious enough, but a simple example showing a 
debugging session would be helpful.


I'll add that to the README file and let you know when it's ready.

-- Korry



[HACKERS] pl/pgsql function result cache

2007-09-05 Thread Peter Manchev

I was thinking about pl/pgsql and comparing it with pl/sql function
result cache, featured in Oracle 11g - see
http://www.oracle.com/technology/oramag/oracle/07-sep/o57plsql.html

Is it possible to get pl/pgsql function result cache functionality into 
PostgreSQL real soon?

Best,
Peter


Re: [HACKERS] pl/pgsql function result cache

2007-09-05 Thread Richard Huxton

Peter Manchev wrote:
I was thinking about pl/pgsql and comparing it with pl/sql function 
result cache, featured in Oracle 11g - see 
http://www.oracle.com/technology/oramag/oracle/07-sep/o57plsql.html


Is it possible to get pl/pgsql function result cache functionality
into PostgreSQL real soon?


Well, it's unlikely to be real soon, since 8.3 is currently approaching 
beta test.


Surely memcached+pgmemcache does basically this anyway, except:
- it's not restricted to function outputs
- you can cache application objects
- you can spread your cache across multiple machines

--
  Richard Huxton
  Archonet Ltd



Re: [HACKERS] pl/pgsql function result cache

2007-09-05 Thread Josh Berkus
Richard,

 Surely memcached+pgmemcache does basically this anyway, except:
 - it's not restricted to function outputs
 - you can cache application objects
 - you can spread your cache across multiple machines

Sure, but using memcached is far from automatic ... you'd have to roll your 
own scheme for caching previously called functions, and there would be no way 
to make it performant for iterative execution, since every execution would 
need a pgmemcache call just to check for cached results.
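
To make "roll your own" concrete: a minimal in-database sketch, with 
hypothetical names -- fib() stands in for whatever expensive function is 
being cached, and concurrent-insert races are ignored:

    CREATE TABLE fib_cache (n int PRIMARY KEY, result numeric);

    CREATE OR REPLACE FUNCTION cached_fib(p int) RETURNS numeric AS $$
    DECLARE
        r numeric;
    BEGIN
        SELECT result INTO r FROM fib_cache WHERE n = p;
        IF FOUND THEN
            RETURN r;            -- cache hit, skip the expensive call
        END IF;
        r := fib(p);             -- the expensive computation
        INSERT INTO fib_cache VALUES (p, r);
        RETURN r;
    END;
    $$ LANGUAGE plpgsql;

Unlike Oracle's feature this is neither automatic nor transparent, and the 
cache table needs its own invalidation policy.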

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco



Re: [HACKERS] HEAD build troubles, buildfarm misconfigurations

2007-09-05 Thread Tom Lane
Robert Treat [EMAIL PROTECTED] writes:
 On Wednesday 05 September 2007 00:06, Tom Lane wrote:
 BTW, on re-reading that, it seems a tad surprising to get an error right
 there --- if postgres_fe.h or anything it includes were broken, then the
 build should have failed earlier.  Is the failure reproducible in
 repeated tries?

 Yes.  Just for giggles I played with the configure flags as well, but it 
 always errors in that spot.  

Hmph.  I checked the CVS logs and there haven't been any recent changes
that seem like they could be related.

It might be worth getting gcc -E output and looking to see what that
part of netdb.h looks like after macro expansion.

regards, tom lane



[HACKERS] Oddity with psql \d and pg_table_is_visible

2007-09-05 Thread Decibel!
I have a database where I'd created a copy of pg_class in public.
pgAdmin shows that the table exists, but \d doesn't. This is because of
how pg_table_is_visible works, specifically this comment:

/*
 * If it is in the path, it might still not be visible; it could be
 * hidden by another relation of the same name earlier in the path. So
 * we must do a slow check for conflicting relations.
 */

While this is correct on a per-relation level, I'm thinking that it's
not what we'd really like to have happen in psql. What I'd like \d to do
is show me everything in any schema that's in my search_path, even if
there's something higher in the search_path that would override it.
ISTM that's what most people would expect out of \d.
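
For example (a minimal sketch of the situation):

CREATE TABLE public.pg_class AS SELECT * FROM pg_catalog.pg_class;
-- pg_catalog is implicitly searched first, so the copy is shadowed:
SELECT pg_table_is_visible('public.pg_class'::regclass);  -- returns f
-- ...and so plain \d omits public.pg_class entirely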

If no one objects I'll come up with a patch for this.
-- 
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)




Re: [HACKERS] Final background writer cleanup for 8.3

2007-09-05 Thread Greg Smith

On Tue, 4 Sep 2007, Josh Berkus wrote:


In about 200 benchmark test runs, I don't feel like we ever came up with a
set of bgwriter settings we'd happily recommend to others.  So it's hard
for me to tell whether this is true or not.


Are you talking about 200 runs with 8.2.4 or 8.3?  If you've collected a 
bunch of 8.3 data, that's something I haven't been able to do; if what 
you're saying is that you never found settings with 8.2.4 that you'd 
recommend, that's consistent with what I was saying.



Can you send me the current version of the patch, plus some bgwriter
settings to try with it, so we can throw it on some of the Sun benchmarks?


Am in the middle of wrapping this up today, will send out a patch for 
everyone to try shortly.  Tests are done, patch is done for now, just 
writing the results up and making my tests reproducible.  I had some 
unexpected inspiration the other day that dragged things out, but with 
useful improvements.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] Final background writer cleanup for 8.3

2007-09-05 Thread Josh Berkus
Greg,

 Are you talking about 200 runs with 8.2.4 or 8.3?  

8.2.4.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco



[HACKERS] loose ends in lazy-XID-assigment patch

2007-09-05 Thread Tom Lane
I've committed Florian's patch, but there remain a couple of things
that need work:

* Should CSV-mode logging include the virtual transaction ID (VXID) in
addition to, or instead of, XID?  There will be many situations where
there is no XID.

* As things stand, when a two-phase transaction is prepared, it drops
its lock on the original VXID; this seems necessary for reasons
previously discussed.  I made the code put an invalid VXID into the
gxact structure for the prepared xact, which means that pg_locks shows
things like this:

regression=# select * from pg_locks;
   locktype    | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction |  pid  |      mode       | granted
---------------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+-----------------+---------
 transactionid |          |          |      |       |            |         21774 |         |       |          | -1/0               |       | ExclusiveLock   | t
 relation      |   126093 |   126124 |      |       |            |               |         |       |          | -1/0               |       | AccessShareLock | t
 relation      |   126093 |    10969 |      |       |            |               |         |       |          | 1/260              | 20592 | AccessShareLock | t
 virtualxid    |          |          |      |       | 1/260      |               |         |       |          | 1/260              | 20592 | ExclusiveLock   | t
(4 rows)

This seems fairly undesirable :-( not least because you can't tell one
prepared xact from another and thus can't see which locks belong to
each.  But I'm unsure what to do about it.  We could have the prepared
xact continue to display the original VXID, but there would be no
certainty about the VXID remaining unique, which seems bad.  Another
possibility is to put back the transaction ID column, but since that's
not unique for read-only transactions, we still don't have anything
usable as a join key.

The best idea I can think of is to make the virtualtransaction column
read out the VXID for regular transactions and the transaction ID for
prepared transactions, or maybe the transaction ID for any transaction
that has one and VXID just for read-only xacts.  We can get away with
that because the column is only text and not any better-defined
datatype.  It seems mighty ugly though; and changing the ID shown for
a transaction mid-stream isn't very pleasant either.

Anyone have a better idea?
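
For what it's worth, you can still match each prepared xact's transactionid 
lock row to its GID by joining against pg_prepared_xacts -- a sketch, not a 
full answer, since the xact's other lock rows remain indistinguishable:

SELECT p.gid, l.mode, l.granted
  FROM pg_prepared_xacts p
  JOIN pg_locks l ON l.transactionid = p.transaction
 WHERE l.locktype = 'transactionid';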

regards, tom lane



[HACKERS] Should pointers to PGPROC be volatile-qualified?

2007-09-05 Thread Tom Lane
There are a bunch of places where we do things like

PGPROC *proc = arrayP->procs[index];
/* Fetch xid just once - see GetNewTransactionId */
TransactionId pxid = proc->xid;

... use pxid several times ...

In the case where "use pxid" involves a function call, this is probably
safe enough, because the compiler can't assume that the called function
won't access and modify the PGPROC.  However, if "use pxid" is straight
line code, I am suddenly wondering if the C compiler could think it can
get away with loading proc->xid more than once instead of expending a
register on it.  As far as it knows there isn't any way for the value
to change underneath it.

The correct fix for this is probably to make the fetch happen through a
volatile-qualified pointer, eg

volatile PGPROC *proc = arrayP->procs[index];
/* Fetch xid just once - see GetNewTransactionId */
TransactionId pxid = proc->xid;

I'm wondering how far to go with that.  In the extreme we could try to
make MyProc and all other PGPROC pointers volatile-qualified; is that
overkill?

Or maybe I'm worried over nothing.  I can't recall any bug reports that
seem like they could be tied to such a thing, and given that we can only
set, not clear, MyProc->xid and xmin without exclusive lock, there might
not actually be a bug here.  But it seems a bit risky.

Comments?  Does anyone think the C standard forbids what I'm worried
about?

regards, tom lane



Re: [HACKERS] loose ends in lazy-XID-assigment patch

2007-09-05 Thread Andrew Dunstan



Tom Lane wrote:

* Should CSV-mode logging include the virtual transaction ID (VXID) in
addition to, or instead of, XID?  There will be many situations where
there is no XID.


  


But will there be any where we care? Isn't the point of this to restrict 
allocation of a real XID to situations where having one might actually 
matter? All the rest seems to me just bookkeeping.


cheers

andrew



Re: [HACKERS] Final background writer cleanup for 8.3

2007-09-05 Thread Greg Smith

On Wed, 5 Sep 2007, Josh Berkus wrote:


Are you talking about 200 runs with 8.2.4 or 8.3?

8.2.4.


Right, then we're in agreement here.  I did something like 4000 small test 
runs with dozens of settings under various 8.2.X releases, and my 
conclusion was that in the general case, it just didn't work at reducing 
checkpoint spikes the way it was supposed to.  Your statement that you 
never found "a set of bgwriter settings we'd happily recommend to others" 
was also the case for me.


While there certainly are some cases where we've heard about people whose 
workloads were such that the background writer worked successfully for 
them, I consider those lucky rather than normal.  I'd like those people to 
test 8.3 because I'd hate to see the changes made to improve the general 
case cause a regression for them.


You are certainly spot-on that this causes a bit of a problem for testing 
8.3 in beta, because if you come from a world-view where the 8.2.4 
background writer was never successful it's hard to figure out a starting 
point for comparing it to the one in 8.3.  Maybe I'll spark some ideas 
when I get the rest of my data out here soon.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] loose ends in lazy-XID-assigment patch

2007-09-05 Thread Florian G. Pflug

Tom Lane wrote:

I've committed Florian's patch, but there remain a couple of things
that need work:

* Should CSV-mode logging include the virtual transaction ID (VXID) in
addition to, or instead of, XID?  There will be many situations where
there is no XID.

Maybe make %x show both, or only the xid if that is set, and the vxid
otherwise? That would probably be what most existing users of %x want.
For those who want them separated, we'd have %v (vxid), and maybe
%X (xid only). Seems a bit like overkill, though...



* As things stand, when a two-phase transaction is prepared, it drops
its lock on the original VXID; this seems necessary for reasons
previously discussed.  I made the code put an invalid VXID into the
gxact structure for the prepared xact, which means that pg_locks shows
things like this:

regression=# select * from pg_locks;
   locktype    | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction |  pid  |      mode       | granted
---------------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+-----------------+---------
 transactionid |          |          |      |       |            |         21774 |         |       |          | -1/0               |       | ExclusiveLock   | t
 relation      |   126093 |   126124 |      |       |            |               |         |       |          | -1/0               |       | AccessShareLock | t
 relation      |   126093 |    10969 |      |       |            |               |         |       |          | 1/260              | 20592 | AccessShareLock | t
 virtualxid    |          |          |      |       | 1/260      |               |         |       |          | 1/260              | 20592 | ExclusiveLock   | t
(4 rows)

This seems fairly undesirable :-( not least because you can't tell one
prepared xact from another and thus can't see which locks belong to
each.  But I'm unsure what to do about it.  We could have the prepared
xact continue to display the original VXID, but there would be no
certainty about the VXID remaining unique, which seems bad.  Another
possibility is to put back the transaction ID column, but since that's
not unique for read-only transactions, we still don't have anything
usable as a join key.

The best idea I can think of is to make the virtualtransaction column
read out the VXID for regular transactions and the transaction ID for
prepared transactions, or maybe the transaction ID for any transaction
that has one and VXID just for read-only xacts.  We can get away with
that because the column is only text and not any better-defined
datatype.  It seems mighty ugly though; and changing the ID shown for
a transaction mid-stream isn't very pleasant either.


We could make the VXID in the gxact struct be backendId=InvalidBackendId,
lxid=xid. That'd still be an invalid vxid, but not the same for every
prepared transaction.

If we take this further, we could get rid of the lock on the xid completely,
I believe. We'd define some PermanentBackendId (let's say, -2, since -1 is
taken). When preparing the xact, we'd drop the lock on the old VXID, and
instead acquire one on (PermanentBackendId, xid). Waiting on an xid would
become a bit tricky, but doable I think. We'd have to first check the procarray
- if we find the xid there, we translate it to a vxid, and wait on that.
Afterwards (whether we found a vxid, or not) we wait on (PermanentBackendId, xid).
That doesn't exactly make XactLockTableWait cheaper, but that might be OK,
since we

I haven't really thought this through, though. I think that with careful
ordering of things we can control the race conditions this might pose -
but I'm not sure at this point.

greetings, Florian Pflug




Re: [HACKERS] Should pointers to PGPROC be volatile-qualified?

2007-09-05 Thread Brian Hurt

Tom Lane wrote:


Comments?  Does anyone think the C standard forbids what I'm worried
about?
 



My understanding of the C spec is that it explicitly *allows* for 
exactly what you're afraid of.  It's even possible if the uses include 
function calls, as the compiler might inline the function calls.


The downside of littering the code with volatile qualifications is that 
it's an optimization stopper.  For example, if proc is declared 
volatile, the compiler couldn't merge multiple different proc->foo 
references into a single load into a register.


Note that all sorts of weirdnesses are possible when you have shared 
mutable state between multiple different threads.  For example, assume 
you have two threads, and two global ints x and y, initially both 0.  
Thread 1 does:

   y = 1;
   r1 = x;
(where r1 is a local variable in thread 1), while thread 2 does:
   x = 1;
   r2 = y;
(with r2 being a local variable in thread 2).

Here's the thing: both r1 and r2 can end up 0!  I've seen this in real 
code.  What happens is that the compiler notices that in both cases, the 
load and stores are independent, so it can reorder them.  And as loads 
tend to be expensive, and nothing can progress until the load completes, 
it moves the loads up before the stores, assuming the program won't 
notice.  Unfortunately, it does, as the impossible can then happen.


Brian




Re: [HACKERS] loose ends in lazy-XID-assigment patch

2007-09-05 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 * Should CSV-mode logging include the virtual transaction ID (VXID) in
 addition to, or instead of, XID?  There will be many situations where
 there is no XID.

 But will there be any where we care? Isn't the point of this to restrict 
 allocation of a real XID to situations where having one might actually 
 matter? All the rest seems to me just bookkeeping.

Well, the point is that a transaction might well emit log messages when
it hasn't changed anything and hence hasn't got an XID.  If you would
like to be able to match up the messages generated by a single
transaction, you'd need to have VXID available to use as a join key.

In any case it seems a bit odd that we have a value that's available in
log_line_prefix and not available to users of CSV log output.

regards, tom lane



Re: [HACKERS] loose ends in lazy-XID-assigment patch

2007-09-05 Thread Tom Lane
Florian G. Pflug [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 This seems fairly undesirable :-( not least because you can't tell one
 prepared xact from another and thus can't see which locks belong to
 each.  But I'm unsure what to do about it.

 We could make the VXID in the gxact struct be
 backendId=InvalidBackendId, lxid=xid. That'd be still an invalid vxid, but not
 the same for every prepared transaction.

Hmm, that would work.

 If we take this further, we could get rid of the lock on the xid completely,

Maybe, but let's not go there for now.  I was already bending the rules
to push this into 8.3 --- I think further improvement needs to wait for
8.4.

regards, tom lane



Re: [HACKERS] Should pointers to PGPROC be volatile-qualified?

2007-09-05 Thread Tom Lane
Brian Hurt [EMAIL PROTECTED] writes:
 Note that all sorts of weirdnesses are possible when you have shared 
 mutable state between multiple different threads.

Yeah.  In the majority of places this isn't a big problem because
access to shared memory looks like

LWLockAcquire(some_lock);
... mess with shared state ...
LWLockRelease(some_lock);

and as long as the LWLock functions are extern and not inlineable
all is well.  But we do need to be more careful in places where
we're violating that coding rule, and the various stuff that looks
at or changes xid and xmin is violating it.

What I'm inclined to do for now is put volatile into those places in
procarray.c where there seems to be a risk.

regards, tom lane



Re: [HACKERS] Oddity with psql \d and pg_table_is_visible

2007-09-05 Thread Tom Lane
Decibel! [EMAIL PROTECTED] writes:
 While this is correct on a per-relation level, I'm thinking that it's
 not what we'd really like to have happen in psql. What I'd like \d to do
 is show me everything in any schema that's in my search_path, even if
 there's something higher in the search_path that would over-ride it.
 ISTM that's what most people would expect out of \d.

I don't agree with that reasoning in the least, particularly not if you
intend to fix it by redefining pg_table_is_visible() ...

What will happen if we change \d to work that way is that it will show
you a table, and you'll try to access it, and you'll get the wrong table
because the access will go to the one that really is visible.

regards, tom lane



Re: [HACKERS] Reducing Transaction Start/End Contention

2007-09-05 Thread Simon Riggs
On Mon, 2007-07-30 at 20:20 +0100, Simon Riggs wrote:

 Jignesh Shah's scalability testing on Solaris has revealed further
 tuning opportunities surrounding the start and end of a transaction.
 Tuning that should be especially important since async commit is likely
 to allow much higher transaction rates than were previously possible.
 
 There is strong contention on the ProcArrayLock in Exclusive mode, with
 the top path being CommitTransaction(). This becomes clear as the number
 of connections increases, but it seems likely that the contention can be
 caused in a range of other circumstances. My thoughts on the causes of
 this contention are that the following 3 tasks contend with each other
 in the following way:
 
 CommitTransaction(): takes ProcArrayLock Exclusive
 but only needs access to one ProcArray element
 
 waits for
 
 GetSnapshotData():ProcArrayLock Shared
 ReadNewTransactionId():XidGenLock Shared
 
 which waits for
 
 GetNextTransactionId()
 takes XidGenLock Exclusive
 ExtendCLOG(): takes ClogControlLock Exclusive, WALInsertLock Exclusive
 two possible place where I/O is required
 ExtendSubtrans(): takes SubtransControlLock()
 one possible place where I/O is required
 Avoids lock on ProcArrayLock: atomically updates one ProcArray element
 
 
 or more simply:
 
 CommitTransaction() -- i.e. once per transaction
 waits for
 GetSnapshotData() -- i.e. once per SQL statement
 which waits for
 GetNextTransactionId() -- i.e. once per transaction
 
 This gives some goals for scalability improvements and some proposals.
 (1) and (2) are proposals for 8.3 tuning, the others are directions for
 further research.
 
 
 Goal: Reduce total time that GetSnapshotData() waits for
 GetNextTransactionId()

The latest patch for lazy xid allocation reduces the number of times
GetNextTransactionId() is called by eliminating the call entirely for
read only transactions. That will reduce the number of waits and so will
for most real world cases increase the scalability of Postgres.
Write-mostly workloads will be slightly less scalable, so we should
expect our TPC-C numbers to be slightly worse than our TPC-E numbers.

We should retest to see whether the bottleneck has been moved
sufficiently to allow us to avoid doing techniques (1), (2), (3), (5) or
(6) at all. 

 1. Increase size of Clog-specific BLCKSZ
 Clog currently uses BLCKSZ to define the size of clog buffers. This can
 be changed to use CLOG_BLCKSZ, which would then be set to 32768.
 This will naturally increase the amount of memory allocated to the clog,
 so we need not alter CLOG_BUFFERS above 8 if we do this (as previously
 suggested, with successful results). This will also reduce the number of
 ExtendClog() calls, which will probably reduce the overall contention
 also.
 
 2. Perform ExtendClog() as a background activity
 Background process can look at the next transactionid once each cycle
 without holding any lock. If the xid is almost at the point where a new
 clog page would be allocated, then it will allocate one prior to the new
 page being absolutely required. Doing this as a background task would
 mean that we do not need to hold the XidGenLock in exclusive mode while
 we do this, which means that GetSnapshotData() and CommitTransaction()
 would also be less likely to block. Also, if any clog writes need to be
 performed when the page is moved forwards this would also be performed
 in the background.

 3. Consider whether ProcArrayLock should use a new queued-shared lock
 mode that puts a maximum wait time on ExclusiveLock requests. It would
 be fairly hard to implement this well as a timer, but it might be
 possible to place a limit on queue length. i.e. allow Share locks to be
 granted immediately if a Shared holder already exists, but only if there
 is a queue of no more than N exclusive mode requests queued. This might
 prevent the worst cases of exclusive lock starvation. 

(4) is a general concern that remains valid.

 4. Since shared locks are currently queued behind exclusive requests
 when they cannot be immediately satisfied, it might be worth
 reconsidering the way LWLockRelease works also. When we wake up the
 queue we only wake the Shared requests that are adjacent to the head of
 the queue. Instead we could wake *all* waiting Shared requestors.
 
 e.g. with a lock queue like this:
 (HEAD)S-S-X-S-X-S-X-S
 Currently we would wake the 1st and 2nd waiters only. 
 
 If we were to wake the 3rd, 5th and 7th waiters also, then the queue
 would reduce in length very quickly, if we assume generally uniform
 service times. (If the head of the queue is X, then we wake only that
 one process and I'm not proposing we change that). That would mean queue
 jumping right? Well that's what already happens in other circumstances,
 so there cannot be anything intrinsically wrong with allowing it, the
 only question is: would it help? 
 
 We need not wake the whole queue, there may be some generally more
 beneficial heuristic. The reason for 

[HACKERS] SET TRANSACTION not compliant with SQL:2003

2007-09-05 Thread Simon Riggs
The SQL:2003 standard definition of SET TRANSACTION differs in major
ways from PostgreSQL's, which produces some interesting behaviour. 

We currently claim conformance, though this is not accurate.

...
<SQL2003>
If a set transaction statement that does not specify LOCAL is
executed, then
Case:
i) If an SQL-transaction is currently active, then an exception
condition is raised: invalid transaction state — active SQL-transaction.
</SQL2003>
...
<SQL2003>
Case:
a) If LOCAL is not specified, then let TXN be the next SQL-transaction
for the SQL-agent.
b) Otherwise, let TXN be the branch of the active SQL-transaction at the
current SQL-connection.
</SQL2003>

The standard behaviour is that SET TRANSACTION defines the mode used in
the *next* transaction, not the current one. We should allow this
meaning, since programs written to spec will act differently with the
current implementation. We currently only change the *current*
transaction. Executing within the current transaction is supposed to
throw an error; that's probably too late to change, but the standard
does give some clues for other errors.

Proposed changes:

1. Solo SET TRANSACTION statements currently produce no WARNING and do
nothing. This isn't the way the SQL:2003 standard specifies it should work.
We should take the values from SET TRANSACTION and apply them to the
*next* transaction (see the sketch after this list):
- these will apply to the next TXN, unless specifically overridden during
the START TRANSACTION command
- these values apply for one transaction only, after which we revert
back to the session default.

2. Duplicate calls to SET TRANSACTION are allowed within a transaction.
=> Should be ERROR: Transaction mode already set.

postgres=# begin;
BEGIN
postgres=# set transaction read only;
SET
postgres=# set transaction read only;
SET
postgres=# commit;
COMMIT

3. Multiple conflicting calls to SET TRANSACTION are allowed within a
transaction.
=> Should be ERROR: Transaction mode already set.

postgres=# begin;
BEGIN
postgres=# set transaction isolation level read committed;
SET
postgres=# set transaction isolation level serializable;
SET
postgres=# commit;
COMMIT

4. SET TRANSACTION can be called after a SAVEPOINT, i.e. it can be
called in a subtransaction.
=> Should be ERROR: SET TRANSACTION must not be called in a
subtransaction.
(Calling SET TRANSACTION ISOLATION LEVEL already throws this error, so
change should be small)

5. The standard uses the keyword LOCAL like this:
SET LOCAL TRANSACTION ...
which in this context means the part of a distributed (two-phase)
commit on this database.
We should accept, but ignore, this keyword.
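
To illustrate proposals (1) and (2), a sketch of the behaviour being
proposed (hypothetical table name; this is not what PostgreSQL currently
does):

SET TRANSACTION READ ONLY;    -- no active xact: applies to the *next* one
BEGIN;                        -- this transaction starts READ ONLY
INSERT INTO foo VALUES (1);   -- proposed: fails, transaction is read-only
ROLLBACK;
BEGIN;                        -- back to the session default (read write)
SET TRANSACTION READ ONLY;
SET TRANSACTION READ ONLY;    -- proposed: ERROR: Transaction mode already set
COMMIT;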

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com




Re: [HACKERS] Hash index todo list item

2007-09-05 Thread Kenneth Marshall
On Sun, Sep 02, 2007 at 01:04:04PM -0500, Kenneth Marshall wrote:
 Dear PostgreSQL Hackers:
 
 After following the hackers mailing list for quite a while,
 I am going to start investigating what will need to be done
 to improve hash index performance. Below are the pieces of
 this project that I am currently considering:
 
 1. Characterize the current hash index implementation against
the BTree index, with a focus on space utilization and
lookup performance against a collection of test data. This
will give a baseline performance test to evaluate the impact
of changes. I initially do not plan to bench the hash creation
process since my initial focus will be on lookup performance.
 

Here are very basic results for a table with 1.6m entries:

postgres=# CREATE TABLE dict (word varchar(100));
CREATE TABLE

postgres=# COPY dict FROM '/tmp/words';
COPY 1648379
postgres=# select count(*) from dict;
  count  
---------
 1648379
(1 row)

Time: 11187.418 ms
postgres=# select count(*) from dict;
  count  
---------
 1648379
(1 row)

Time: 6040.912 ms
postgres=# CREATE INDEX wordhash ON dict USING hash (word);
CREATE INDEX
Time: 11108707.160 ms

postgres=# select * from dict where word = 'avatar';
  word  
--------
 avatar
(1 row)

Time: 79.823 ms
postgres=# select * from dict where word = 'zebra';
 word  
-------
 zebra
(1 row)

Time: 9.864 ms
postgres=# select * from dict where word = 'turkey'; 
  word  
--------
 turkey
(1 row)

Time: 18.418 ms
Time: 1.045 ms
Time: 1.257 ms
Time: 1.080 ms

postgres=# CREATE INDEX wordbtree ON dict USING btree (word);
CREATE INDEX

Time: 25438.884 ms

postgres=# select * from dict where word = 'avatar';
  word  
--------
 avatar
(1 row)

Time: 13.400 ms
postgres=# select * from dict where word = 'zebra';
 word  
-------
 zebra
(1 row)

Time: 1.173 ms
postgres=# select * from dict where word = 'turkey';
  word  
--------
 turkey
(1 row)

Time: 1.186 ms
Time: 1.103 ms
Time: 1.099 ms
Time: 1.108 ms

--
Size of table =   87556096

Size of hash index = 268451840
Size of btree index = 53510144

From my very small sample on an unloaded machine, a hash index lookup
took the least amount of time. It had a much larger initial time, which
could be attributable to cache population effects. The size is 5X that
of the btree index. I will continue to improve the test suite as more
granularity is needed. If anyone has a good data generator, please let
me know. Otherwise I will just roll my own.
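
Absent a better suggestion, something as crude as this generate_series 
sketch would do for word-like data (table name hypothetical):

CREATE TABLE dict2 (word varchar(100));
INSERT INTO dict2
    SELECT substr(md5(random()::text), 1, 1 + (random() * 20)::int)
    FROM generate_series(1, 1648379);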

Regards,
Ken



Re: [HACKERS] Final background writer cleanup for 8.3

2007-09-05 Thread Kevin Grittner
 On Wed, Sep 5, 2007 at  1:54 PM, in message
[EMAIL PROTECTED], Greg Smith
[EMAIL PROTECTED] wrote: 
 On Wed, 5 Sep 2007, Josh Berkus wrote:
 
 While there certainly are some cases where we've heard about people whose 
 workloads were such that the background writer worked successfully for 
 them, I consider those lucky rather than normal.  I'd like those people to 
 test 8.3 because I'd hate to see the changes made to improve the general 
 case cause a regression for them.
 
Being one of the lucky ones, I'm still hopeful that I'll be able to do
these tests.  I think I know how to tailor the load so that we see the
problem often enough to get useful benchmarks (we tended to see the
problem a few times per day in actual 24/7 production).
 
My plan would be to run 8.2.4 with the background writer turned off to
establish a baseline.  I think that any test, to be meaningful, would need
to run for several hours, with the first half hour discarded as just being
enough to establish the testing state.
 
Then I would test our aggressive background writer settings under 8.2.4 to
confirm that those settings do handle the problem in this test
environment.
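 
By "aggressive settings" I mean something along these lines; the values are
illustrative, not necessarily the exact ones we run:
 
bgwriter_delay = 200ms
bgwriter_lru_percent = 20.0
bgwriter_lru_maxpages = 200
bgwriter_all_percent = 10.0
bgwriter_all_maxpages = 600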
 
Then I would test the new background writer with synchronous commits under
the 8.3 beta, using various settings.  The 0.5, 0.7 and 0.9 settings you
recommended for a test are how far from the LRU end of the cache to look
for dirty pages to write, correct?  Is there any upper bound, as long as I
keep it below 1?  Are the current shared memory and the 1 GB you suggested
enough of a spread for these tests?  (At several hours per test in order
to get meaningful results, I don't want to get into too many permutations.)
 
Finally, I would try the new checkpoint techniques, with and without the
new background writer.  Any suggestions on where to set the knobs for
those runs?
 
I'm inclined to think that it would be interesting to try the benchmarks
with the backend writing any dirty page through to the OS at the same time
they are written to the PostgreSQL cache, as a reference point at the
opposite extreme from having the cache hold onto dirty pages for as long
as possible before sharing them with the OS.  Do you see any value in
getting actual numbers for that?
 
 this causes a bit of a problem for testing 
 8.3 in beta, because if you come from a world-view where the 8.2.4 
 background writer was never successful it's hard to figure out a starting 
 point for comparing it to the one in 8.3.
 
In terms of comparing the new technique to the old, one would approach the
new technique by turning off the "all" scan and setting the LRU scan
percentage to 50% or more, right?  (I mean, obviously there would be more
CPU time used as it scanned through clean pages repeatedly, but it would
be a rough analogy otherwise, yes?)
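 
That is, roughly (illustrative values):
 
bgwriter_all_percent = 0        # turn off the "all" scan
bgwriter_all_maxpages = 0
bgwriter_lru_percent = 50.0     # scan half the pool from the LRU end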
 
-Kevin
 




Re: [HACKERS] [PATCHES] Lazy xid assignment V4

2007-09-05 Thread Robert Treat
moving to -hackers since the patch is already in...

On Wednesday 05 September 2007 18:12, Tom Lane wrote:
 Andrew Dunstan [EMAIL PROTECTED] writes:
  Florian G. Pflug wrote:
  So, in essence, you get the old pg_locks format back by doing
  select l1.*, l2.transactionid as transaction from pg_locks l1,
  pg_locks l2
  where l1.vxid = l2.vxid and l2.locktype = 'transaction'
  and l2.mode='exclusive' and l2.granted=true.

 You'd want some sort of left join, no doubt, else you're not going to
 see transactions that have not got an XID.

  or make it a system view?

 That would be a bit silly.  If there's actually still a use-case for the
 XID column then we should just put it back. 

I agree, adding a system view to make up for removing a column seems like the 
wrong solution to me. 

 I don't actually see a 
 reasonable use-case for it though.  As Florian points out, you can get
 it if you really need it --- but that view is already annoyingly wide,
 and I'm not eager to burden it with columns that are usually useless.


I'm trying to decide how reasonable the use case is. We have transactions that 
run several hours long, often touching a number of tables, and I've used the 
transaction column to get a list of all of the relations a given transaction 
is touching. This makes the above solution more annoying by far, but I don't 
do it often, and I think I generally use the pid to get that information 
anyway; if I used prepared transactions I'm sure I'd feel stronger about this.

 Also, I still agree with Florian's earlier argument that we should
 deliberately break any code that's depending on the transaction column.
 Any such code is unlikely to be prepared for the column containing nulls.


I don't really buy this argument, if only because the column can already be 
null, so apps already need a way to deal with that. I would agree that the 
behavior of the column has changed, so it might cause some odd behavior in 
apps that rely on it.

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL



Re: [HACKERS] [PATCHES] Lazy xid assignment V4

2007-09-05 Thread Tom Lane
Robert Treat [EMAIL PROTECTED] writes:
 I'm trying to decide how reasonable the use case is. We have transactions 
 that 
 run several hours long, often touching a number of tables, and I've used the 
 transaction to get a list of all of the relations a given transaction is 
 touching. This makes the above solution more annoying by far, but I don't do 
 it often, and I think I generally use the pid to get that information anyway; 
 if I used prepared transactions I'm sure I'd feel stronger about this.  

I don't see why you wouldn't start using the VXID for this purpose?

 Also, I still agree with Florian's earlier argument that we should
 deliberately break any code that's depending on the transaction column.
 Any such code is unlikely to be prepared for the column containing nulls.

 I don't buy this argument really only so far as the column can already be 
 null, so apps already need a way to deal with that.

No, it was not possible for the XID column to be null before.  Up to
now, if you didn't have an XID you weren't holding a lock.

regards, tom lane



Re: [HACKERS] [PATCHES] Lazy xid assignment V4

2007-09-05 Thread Robert Treat
On Wednesday 05 September 2007 18:40, Tom Lane wrote:
 Robert Treat [EMAIL PROTECTED] writes:
  I'm trying to decide how reasonable the use case is. We have transactions
  that run several hours long, often touching a number of tables, and I've
  used the transaction to get a list of all of the relations a given
  transaction is touching. This makes the above solution more annoying by
  far, but I don't do it often, and I think I generally use the pid to get
  that information anyway; if I used prepared transactions I'm sure I'd
  feel stronger about this.

 I don't see why you wouldn't start using the VXID for this purpose?


I'm not sure either :-)  Though, it would be nice to have an easy way to see 
which transactions actually modified tables.  Again, I'm not sure the use case 
is reasonable, but it's there. If no one else feels strongly, let's document 
a query to mimic the old column and move on.
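
For instance (a sketch using the left join Tom suggested upthread; untested):

SELECT l.*, x.transactionid AS transaction
  FROM pg_locks l
  LEFT JOIN pg_locks x
    ON x.virtualtransaction = l.virtualtransaction
   AND x.locktype = 'transactionid'
   AND x.mode = 'ExclusiveLock'
   AND x.granted;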

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL



Re: [HACKERS] Has anyone tried out the PL/pgSQL debugger?

2007-09-05 Thread John DeSoi

Hi Dave,

On Sep 5, 2007, at 3:54 AM, Dave Page wrote:

That's odd - I cannot reproduce that on OS X using beta 4 (which has no
important changes in the debugger over beta 3).

Can you provide a simple test case?


I get the same error with this:

create or replace function debug_test(out t text, out i integer)
returns record as $$
begin
    t := 'test 1';
    i := 10;
    return;
end;
$$ language plpgsql;


I did the following:

1. Right-clicked the function and chose Debug from the Debugging submenu.

2. Clicked the OK button on the dialog.
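
Judging from the error message, pgAdmin apparently drops the opening 
parenthesis when there are no IN arguments -- a guess at the generated 
statement, not verified:

SELECT * FROM public.debug_test)
-- where presumably it should issue:
SELECT * FROM public.debug_test()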


Best,

John





John DeSoi, Ph.D.
http://pgedit.com/
Power Tools for PostgreSQL




[HACKERS] Just-in-time Background Writer Patch+Test Results

2007-09-05 Thread Greg Smith
Tom gets credit for naming the attached patch, which is my latest attempt to 
finalize what has been called the "Automatic adjustment of 
bgwriter_lru_maxpages" patch for 8.3; that's not what it does anymore, but 
that's where it started.


Background on testing
---------------------

I decided to use pgbench for running my tests.  The scripting framework to 
collect all that data and usefully summarize it is now available as 
pgbench-tools-0.2 at 
http://www.westnet.com/~gsmith/content/postgresql/pgbench-tools.htm


I hope to expand and actually document use of pgbench-tools in the future but 
didn't want to hold the rest of this up on that work.  That page includes basic 
information about what my testing environment was and why I felt this was an 
appropriate way to test background writer efficiency.


Quite a bit of raw data for all of the test sets summarized here is at 
http://www.westnet.com/~gsmith/content/bgwriter/


The patches attached to this message are also available at: 
http://www.westnet.com/~gsmith/content/postgresql/buf-alloc-2.patch 
http://www.westnet.com/~gsmith/content/postgresql/jit-cleaner.patch
(This is my second attempt to send this message, don't know why the 
earlier one failed; using gzip'd patches for this one and hopefully there 
won't be a dupe)


Baseline test results
---------------------

The first patch to apply attached to this message is the latest buf-alloc-2 
that adds counters to pgstat_bgwriter for everything the background writer is 
doing. Here's what we get out of the standard 8.3 background writer before and 
after applying that patch, at various settings:


              info              | set | tps  | cleaner_pct
--------------------------------+-----+------+-------------
 HEAD nobgwriter                |   5 |  994 |
 HEAD+buf-alloc-2 nobgwriter    |   6 | 1012 |           0
 HEAD+buf-alloc-2 LRU=0.5%/500  |  16 |  974 |       15.94
 HEAD+buf-alloc-2 LRU=5%/500    |  19 |  983 |       98.47
 HEAD+buf-alloc-2 LRU=10%/500   |   7 |  997 |       99.95

cleaner_pct is what percentage of the writes the BGW LRU cleaner did relative 
to a total that includes the client backend writes; writes done by checkpoints 
are not included in this summary computation, it just shows the balance of 
backend vs. BGW writes.


The /500 means bgwriter_lru_maxpages=500, which I already knew was about as 
many pages as this server ever dirties in a 200ms cycle.  Without the 
buf-alloc-2 patch I don't get statistics on the LRU cleaner, I include that 
number as a baseline just to suggest that the buf-alloc-2 patch itself isn't 
pulling down results.


Here we see that in order to get most of the writes to happen via the LRU 
cleaner rather than having the backends handle them, you'd need to play with 
the settings until the bgwriter_lru_percent was somewhere between 5% and 10%, 
and it seems obvious that doing this doesn't improve the TPS results.  The 
margin of error here is big enough that I consider all these basically the same 
performance.  The question then is how to get this high level of writes by the 
background writer automatically, without having to know what percentage to 
scan; I wanted to remove bgwriter_lru_percent, while still keeping 
bgwriter_lru_maxpages strictly as a way to throttle overall BGW activity.


First JIT Implementation
------------------------

The method I described in my last message on this topic ( 
http://archives.postgresql.org/pgsql-hackers/2007-08/msg00887.php ) implemented 
a weighted moving average of how many pages were allocated, and based on 
feedback from that I improved the code to allow a multiplier factor on top of 
that.  Here's the summary of those results:


              info              | set | tps  | cleaner_pct
--------------------------------+-----+------+-------------
 jit cleaner multiplier=1.0/500 |   9 |  981 |        94.3
 jit cleaner multiplier=2.0/500 |   8 | 1005 |       99.78
 jit multiplier=1.0/100         |  10 |  985 |       68.14

That's pretty good.  As long as maxpages is set intelligently, it gets most of 
the writes even with the multiplier of 1.0, and cranking it up to the 2.0 
suggested by the original Itagaki Takahiro patch gets nearly all of them. 
Again, there's really no performance change here in throughput by any of this.
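
(Roughly, each 200ms round updates a weighted moving average of recent buffer 
allocations and scales it; a sketch of the estimate, not the code verbatim:

    smoothed_alloc += (recent_alloc - smoothed_alloc) / smoothing_samples;
    upcoming_alloc_est = smoothed_alloc * multiplier;

The cleaner then tries to keep that many buffers reusable ahead of the clock 
sweep, still capped by bgwriter_lru_maxpages.)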


Coping with idle periods
------------------------

While I was basically happy with these results, the data Kevin Grittner 
submitted in response to my last call for commentary left me concerned. While 
the JIT approach works fine as long as your system is active, it does 
absolutely nothing if the system is idle.  I noticed that a lot of the writes 
that were being done by the client backends were after idle periods where the 
JIT writer just didn't react fast enough during the ramp-up.  For example, if 
the system went from idle for a while to full-speed just as the 200ms sleep 
started, by the time the BGW woke up again the backends 

Re: [HACKERS] [PATCHES] Lazy xid assignment V4

2007-09-05 Thread Tom Lane
Robert Treat [EMAIL PROTECTED] writes:
 On Wednesday 05 September 2007 18:40, Tom Lane wrote:
 I don't see why you wouldn't start using the VXID for this purpose?

 I'm not sure either :-)  Though, it would be nice to have an easy way to see 
 which transactions actually modified tables.

Moving the goal posts, aren't we?  It was not possible to find that out
at all from the pg_locks view before.  (Well, you could guess based on
the type of table locks held, but you were only guessing.)

As of CVS HEAD you *can* determine that from pg_locks, to high
probability anyway, by looking to see which VXIDs have transaction IDs
locked.

regards, tom lane
