Re: [HACKERS] Has anyone tried out the PL/pgSQL debugger?
John DeSoi wrote: In some simple tests it seems to work OK with pgAdmin (1.8b3) on OS X. There appears to be a pgAdmin bug when you start a debug session on a function that has no parameters: ERROR: syntax error at or near ) LINE 1: SELECT * FROM myschema.myfunction) ^ That's odd - I cannot reproduce that on OS X using beta 4 (which has no important changes in the debugger over beta 3). Can you provide a simple test case? Regards, Dave ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Per-function GUC settings: trickier than it looked
On Wed, Sep 05, 2007 at 02:13:57PM +1000, Brendan Jurd wrote: On 9/5/07, Tom Lane [EMAIL PROTECTED] wrote: Brendan Jurd [EMAIL PROTECTED] writes: Am I on the right page? Got it in one, I believe. In that case, +1 for your proposed changes. At first, like Florian, I found the idea of a SET LOCAL ever persisting beyond a function astonishing, but that's because I was approaching the term LOCAL from a programming frame of mind, not an SQL one [...] As an unqualified POV, seeing that this got at least two people confused -- wouldn't it make sense to be more verbose and call the thing SET TRANSACTION LOCAL (not just TRANSACTION, which is ambiguous as we have already seen)? Maybe even SET LOCAL TO TRANSACTION (that gives at least some room for possible extensibility). I know too little about the parser to have even an idea whether this would be feasible at all. Regards -- tomás ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Has anyone tried out the PL/pgSQL debugger?
Hi Dave, On Sep 5, 2007, at 3:54 AM, Dave Page wrote: That's odd - I cannot reproduce that on OS X using beta 4 (which has no important changes in the debugger over beta 3). Can you provide a simple test case? I'll try to come up with a simple test case and send it sometime this evening. Possible hint: the function had no IN parameters, but many OUT parameters. John John DeSoi, Ph.D. http://pgedit.com/ Power Tools for PostgreSQL ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] HEAD build troubles, buildfarm misconfigurations
On Wednesday 05 September 2007 00:06, Tom Lane wrote: Robert Treat [EMAIL PROTECTED] writes: I get the following error during make (configure runs fine): /usr/include/netdb.h:560: error: syntax error before ‘[’ token Which line of netdb.h is that exactly? I'm thinking we've redefined some name as a macro that's breaking the include file, but I don't see which name exactly ... Sorry, it was the line int __unused[5]; BTW, on re-reading that, it seems a tad surprising to get an error right there --- if postgres_fe.h or anything it includes were broken, then the build should have failed earlier. Is the failure reproducible in repeated tries? Yes. Just for giggles I played with the configure flags as well, but it always errors in that spot. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Has anyone tried out the PL/pgSQL debugger?
Is there any documentation that describes how to use the SQL functions? Some are obvious enough, but a simple example showing a debugging session would be helpful. I'll add that to the README file and let you know when it's ready. -- Korry ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[HACKERS] pl/pgsql function result cache
I was thinking about pl/pgsql and comparing it with pl/sql function result cache, featured in Oracle 11g - see http://www.oracle.com/technology/oramag/oracle/07-sep/o57plsql.html Is it possible to get pl/pgsql function result cache functionality into PostgreSQL real soon? Best, Peter
Re: [HACKERS] pl/pgsql function result cache
Peter Manchev wrote: I was thinking about pl/pgsql and comparing it with pl/sql function result cache, featured in Oracle 11g - see http://www.oracle.com/technology/oramag/oracle/07-sep/o57plsql.html Is it possible to get pl/pgsql function result cache functionality into PostgreSQL real soon? Well it's unlikely to be real soon since 8.3 is currently approaching beta-test. Surely memcached+pgmemcache does basically this anyway, except: - it's not restricted to function outputs - you can cache application objects - you can spread your cache across multiple machines -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] pl/pgsql function result cache
Richard, Surely memcached+pgmemcache does basically this anyway, except: - it's not restricted to function outputs - you can cache application objects - you can spread your cache across multiple machines Sure, but using memcached is far from automatic ... you'd have to roll your own scheme for caching previously called functions, and there would be no way to make it performant for iterative execution due to needing to make a pgmemcached call to check for cached results with each execution. -- Josh Berkus PostgreSQL @ Sun San Francisco ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] HEAD build troubles, buildfarm misconfigurations
Robert Treat [EMAIL PROTECTED] writes: On Wednesday 05 September 2007 00:06, Tom Lane wrote: BTW, on re-reading that, it seems a tad surprising to get an error right there --- if postgres_fe.h or anything it includes were broken, then the build should have failed earlier. Is the failure reproducible in repeated tries? Yes. Just for giggles I played with the configure flags as well, but it always errors in that spot. Hmph. I checked the CVS logs and there haven't been any recent changes that seem like they could be related. It might be worth getting gcc -E output and looking to see what that part of netdb.h looks like after macro expansion. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[HACKERS] Oddity with psql \d and pg_table_is_visible
I have a database where I'd created a copy of pg_class in public. pgAdmin shows that the table exists, but \d doesn't. This is because of how pg_table_is_visible works, specifically this comment:

    /*
     * If it is in the path, it might still not be visible; it could be
     * hidden by another relation of the same name earlier in the path. So
     * we must do a slow check for conflicting relations.
     */

While this is correct on a per-relation level, I'm thinking that it's not what we'd really like to have happen in psql. What I'd like \d to do is show me everything in any schema that's in my search_path, even if there's something higher in the search_path that would over-ride it. ISTM that's what most people would expect out of \d. If no one objects I'll come up with a patch for this. -- Decibel!, aka Jim Nasby [EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
Re: [HACKERS] Final background writer cleanup for 8.3
On Tue, 4 Sep 2007, Josh Berkus wrote: In about 200 benchmark test runs, I don't feel like we ever came up with a set of bgwriter settings we'd happily recommend to others. So it's hard for me to tell whether this is true or not. Are you talking about 200 runs with 8.2.4 or 8.3? If you've collected a bunch of 8.3 data, that's something I haven't been able to do; if what you're saying is that you never found settings with 8.2.4 that you'd recommend, that's consistent with what I was saying. Can you send me the current version of the patch, plus some bgwriter settings to try with it, so we can throw it on some of the Sun benchmarks? Am in the middle of wrapping this up today, will send out a patch for everyone to try shortly. Tests are done, patch is done for now, just writing the results up and making my tests reproducible. I had some unexpected inspiration the other day that dragged things out, but with useful improvements. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Final background writer cleanup for 8.3
Greg, Are you talking about 200 runs with 8.2.4 or 8.3? 8.2.4. -- Josh Berkus PostgreSQL @ Sun San Francisco ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[HACKERS] loose ends in lazy-XID-assigment patch
I've committed Florian's patch, but there remain a couple of things that need work:

* Should CSV-mode logging include the virtual transaction ID (VXID) in addition to, or instead of, XID? There will be many situations where there is no XID.

* As things stand, when a two-phase transaction is prepared, it drops its lock on the original VXID; this seems necessary for reasons previously discussed. I made the code put an invalid VXID into the gxact structure for the prepared xact, which means that pg_locks shows things like this:

regression=# select * from pg_locks;
   locktype    | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction |  pid  |      mode       | granted
---------------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+-----------------+---------
 transactionid |          |          |      |       |            |         21774 |         |       |          | -1/0               |       | ExclusiveLock   | t
 relation      |   126093 |   126124 |      |       |            |               |         |       |          | -1/0               |       | AccessShareLock | t
 relation      |   126093 |    10969 |      |       |            |               |         |       |          | 1/260              | 20592 | AccessShareLock | t
 virtualxid    |          |          |      |       | 1/260      |               |         |       |          | 1/260              | 20592 | ExclusiveLock   | t
(4 rows)

This seems fairly undesirable :-( not least because you can't tell one prepared xact from another and thus can't see which locks belong to each. But I'm unsure what to do about it. We could have the prepared xact continue to display the original VXID, but there would be no certainty about the VXID remaining unique, which seems bad. Another possibility is to put back the transaction ID column, but since that's not unique for read-only transactions, we still don't have anything usable as a join key. The best idea I can think of is to make the virtualtransaction column read out the VXID for regular transactions and the transaction ID for prepared transactions, or maybe the transaction ID for any transaction that has one and VXID just for read-only xacts. We can get away with that because the column is only text and not any better-defined datatype.
It seems mighty ugly though; and changing the ID shown for a transaction mid-stream isn't very pleasant either. Anyone have a better idea? regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[HACKERS] Should pointers to PGPROC be volatile-qualified?
There are a bunch of places where we do things like

    PGPROC *proc = arrayP->procs[index];
    /* Fetch xid just once - see GetNewTransactionId */
    TransactionId pxid = proc->xid;
    ... use pxid several times ...

In the case where "use pxid" involves a function call, this is probably safe enough, because the compiler can't assume that the called function won't access and modify the PGPROC. However, if "use pxid" is straight-line code, I am suddenly wondering if the C compiler could think it can get away with loading proc->xid more than once instead of expending a register on it. As far as it knows there isn't any way for the value to change underneath it. The correct fix for this is probably to make the fetch happen through a volatile-qualified pointer, eg

    volatile PGPROC *proc = arrayP->procs[index];
    /* Fetch xid just once - see GetNewTransactionId */
    TransactionId pxid = proc->xid;

I'm wondering how far to go with that. In the extreme we could try to make MyProc and all other PGPROC pointers volatile-qualified; is that overkill? Or maybe I'm worried over nothing. I can't recall any bug reports that seem like they could be tied to such a thing, and given that we can only set, not clear, MyProc->xid and xmin without exclusive lock, there might not actually be a bug here. But it seems a bit risky. Comments? Does anyone think the C standard forbids what I'm worried about? regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] loose ends in lazy-XID-assigment patch
Tom Lane wrote: * Should CSV-mode logging include the virtual transaction ID (VXID) in addition to, or instead of, XID? There will be many situations where there is no XID. But will there be any where we care? Isn't the point of this to restrict allocation of a real XID to situations where having one might actually matter? All the rest seems to me just bookkeeping. cheers andrew ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Final background writer cleanup for 8.3
On Wed, 5 Sep 2007, Josh Berkus wrote: Are you talking about 200 runs with 8.2.4 or 8.3? 8.2.4. Right, then we're in agreement here. I did something like 4000 small test runs with dozens of settings under various 8.2.X releases and my conclusion was that in the general case, it just didn't work at reducing checkpoint spikes the way it was supposed to. Your statement that you never found a set of bgwriter settings we'd happily recommend to others was also the case for me. While there certainly are some cases where we've heard about people whose workloads were such that the background writer worked successfully for them, I consider those lucky rather than normal. I'd like those people to test 8.3 because I'd hate to see the changes made to improve the general case cause a regression for them. You are certainly spot-on that this causes a bit of a problem for testing 8.3 in beta, because if you come from a world-view where the 8.2.4 background writer was never successful it's hard to figure out a starting point for comparing it to the one in 8.3. Maybe I'll spark some ideas when I get the rest of my data out here soon. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] loose ends in lazy-XID-assigment patch
Tom Lane wrote: I've committed Florian's patch, but there remain a couple of things that need work: * Should CSV-mode logging include the virtual transaction ID (VXID) in addition to, or instead of, XID? There will be many situations where there is no XID.

Maybe make %x show both, or only the xid if that is set, and the vxid otherwise? That would probably be what most existing users of %x want. For those who want them separated, we'd have %v (vxid), and maybe %X (xid only). Seems a bit like overkill, though...

* As things stand, when a two-phase transaction is prepared, it drops its lock on the original VXID; this seems necessary for reasons previously discussed. I made the code put an invalid VXID into the gxact structure for the prepared xact, which means that pg_locks shows things like this:

regression=# select * from pg_locks;
   locktype    | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction |  pid  |      mode       | granted
---------------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+-----------------+---------
 transactionid |          |          |      |       |            |         21774 |         |       |          | -1/0               |       | ExclusiveLock   | t
 relation      |   126093 |   126124 |      |       |            |               |         |       |          | -1/0               |       | AccessShareLock | t
 relation      |   126093 |    10969 |      |       |            |               |         |       |          | 1/260              | 20592 | AccessShareLock | t
 virtualxid    |          |          |      |       | 1/260      |               |         |       |          | 1/260              | 20592 | ExclusiveLock   | t
(4 rows)

This seems fairly undesirable :-( not least because you can't tell one prepared xact from another and thus can't see which locks belong to each. But I'm unsure what to do about it. We could have the prepared xact continue to display the original VXID, but there would be no certainty about the VXID remaining unique, which seems bad. Another possibility is to put back the transaction ID column, but since that's not unique for read-only transactions, we still don't have anything usable as a join key.
The best idea I can think of is to make the virtualtransaction column read out the VXID for regular transactions and the transaction ID for prepared transactions, or maybe the transaction ID for any transaction that has one and VXID just for read-only xacts. We can get away with that because the column is only text and not any better-defined datatype. It seems mighty ugly though; and changing the ID shown for a transaction mid-stream isn't very pleasant either.

We could make the VXID in the gxact struct be backendId=InvalidBackendId, lxid=xid. That'd be still an invalid vxid, but not the same for every prepared transaction. If we take this further, we could get rid of the lock on the xid completely, I believe. We'd define some PermanentBackendId (let's say, -2, since -1 is taken). When preparing the xact, we'd drop the lock on the old VXID, and instead acquire one on (PermanentBackendId, xid). Waiting on an xid would become a bit tricky, but doable I think. We'd have to first check the procarray - if we find the xid there, we translate it to a vxid, and wait on that. Afterwards (whether we found a vxid, or not) we wait on (PermanentBackendId, xid). That doesn't exactly make XactLockTableWait cheaper, but that might be OK. I haven't really thought this through, though. I think that with careful ordering of things we can control the race conditions this might pose - but I'm not sure at this point. greetings, Florian Pflug ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Should pointers to PGPROC be volatile-qualified?
Tom Lane wrote: Comments? Does anyone think the C standard forbids what I'm worried about? My understanding of the C spec is that it explicitly *allows* for exactly what you're afraid of. It's even possible if the uses include function calls, as the compiler might inline the function calls. The downside of littering the code with volatile qualifications is that it's an optimization stopper. For example, if proc is declared volatile, the compiler couldn't merge multiple different proc->foo references into a single load into a register. Note that all sorts of weirdnesses are possible when you have shared mutable state between multiple different threads. For example, assume you have two threads, and two global ints x and y, initially both 0. Thread 1 does: y = 1; r1 = x; (where r1 is a local variable in thread 1), while thread 2 does: x = 1; r2 = y; (with r2 being a local variable in thread 2). Here's the thing: both r1 and r2 can end up 0! I've seen this in real code. What happens is that the compiler notices that in both cases, the load and stores are independent, so it can reorder them. And as loads tend to be expensive, and nothing can progress until the load completes, it moves the loads up before the stores, assuming the program won't notice. Unfortunately, it does, as the impossible can then happen. Brian ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] loose ends in lazy-XID-assigment patch
Andrew Dunstan [EMAIL PROTECTED] writes: Tom Lane wrote: * Should CSV-mode logging include the virtual transaction ID (VXID) in addition to, or instead of, XID? There will be many situations where there is no XID. But will there be any where we care? Isn't the point of this to restrict allocation of a real XID to situations where having one might actually matter? All the rest seems to me just bookkeeping. Well, the point is that a transaction might well emit log messages when it hasn't changed anything and hence hasn't got an XID. If you would like to be able to match up the messages generated by a single transaction, you'd need to have VXID available to use as a join key. In any case it seems a bit odd that we have a value that's available in log_line_prefix and not available to users of CSV log output. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] loose ends in lazy-XID-assigment patch
Florian G. Pflug [EMAIL PROTECTED] writes: Tom Lane wrote: This seems fairly undesirable :-( not least because you can't tell one prepared xact from another and thus can't see which locks belong to each. But I'm unsure what to do about it. We could make the VXID in the gxact struct be backendId=InvalidBackendId, lxid=xid. That'd be still an invalid vxid, but not the same for every prepared transaction. Hmm, that would work. If we take this further, we could get rid of the lock on the xid completely, Maybe, but let's not go there for now. I was already bending the rules to push this into 8.3 --- I think further improvement needs to wait for 8.4. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Should pointers to PGPROC be volatile-qualified?
Brian Hurt [EMAIL PROTECTED] writes: Note that all sorts of weirdnesses are possible when you have shared mutable state between multiple different threads. Yeah. In the majority of places this isn't a big problem because access to shared memory looks like LWLockAcquire(some_lock); ... mess with shared state ... LWLockRelease(some_lock); and as long as the LWLock functions are extern and not inlineable all is well. But we do need to be more careful in places where we're violating that coding rule, and the various stuff that looks at or changes xid and xmin is violating it. What I'm inclined to do for now is put volatile into those places in procarray.c where there seems to be a risk. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Oddity with psql \d and pg_table_is_visible
Decibel! [EMAIL PROTECTED] writes: While this is correct on a per-relation level, I'm thinking that it's not what we'd really like to have happen in psql. What I'd like \d to do is show me everything in any schema that's in my search_path, even if there's something higher in the search_path that would over-ride it. ISTM that's what most people would expect out of \d. I don't agree with that reasoning in the least, particularly not if you intend to fix it by redefining pg_table_is_visible() ... What will happen if we change \d to work that way is that it will show you a table, and you'll try to access it, and you'll get the wrong table because the access will go to the one that really is visible. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Reducing Transaction Start/End Contention
On Mon, 2007-07-30 at 20:20 +0100, Simon Riggs wrote: Jignesh Shah's scalability testing on Solaris has revealed further tuning opportunities surrounding the start and end of a transaction. Tuning that should be especially important since async commit is likely to allow much higher transaction rates than were previously possible. There is strong contention on the ProcArrayLock in Exclusive mode, with the top path being CommitTransaction(). This becomes clear as the number of connections increases, but it seems likely that the contention can be caused in a range of other circumstances. My thoughts on the causes of this contention are that the following 3 tasks contend with each other in the following way: CommitTransaction(): takes ProcArrayLock Exclusive but only needs access to one ProcArray element waits for GetSnapshotData():ProcArrayLock Shared ReadNewTransactionId():XidGenLock Shared which waits for GetNextTransactionId() takes XidGenLock Exclusive ExtendCLOG(): takes ClogControlLock Exclusive, WALInsertLock Exclusive two possible place where I/O is required ExtendSubtrans(): takes SubtransControlLock() one possible place where I/O is required Avoids lock on ProcArrayLock: atomically updates one ProcArray element or more simply: CommitTransaction() -- i.e. once per transaction waits for GetSnapshotData() -- i.e. once per SQL statement which waits for GetNextTransactionId() -- i.e. once per transaction This gives some goals for scalability improvements and some proposals. (1) and (2) are proposals for 8.3 tuning, the others are directions for further research. Goal: Reduce total time that GetSnapshotData() waits for GetNextTransactionId() The latest patch for lazy xid allocation reduces the number of times GetNextTransactionId() is called by eliminating the call entirely for read only transactions. That will reduce the number of waits and so will for most real world cases increase the scalability of Postgres. 
Right-mostly workloads will be slightly less scalable, so we should expect our TPC-C numbers to be slightly worse than our TPC-E numbers. We should retest to see whether the bottleneck has been moved sufficiently to allow us to avoid doing techniques (1), (2), (3), (5) or (6) at all.

1. Increase size of Clog-specific BLCKSZ

Clog currently uses BLCKSZ to define the size of clog buffers. This can be changed to use CLOG_BLCKSZ, which would then be set to 32768. This will naturally increase the amount of memory allocated to the clog, so we need not alter CLOG_BUFFERS above 8 if we do this (as previously suggested, with successful results). This will also reduce the number of ExtendClog() calls, which will probably reduce the overall contention also.

2. Perform ExtendClog() as a background activity

Background process can look at the next transactionid once each cycle without holding any lock. If the xid is almost at the point where a new clog page would be allocated, then it will allocate one prior to the new page being absolutely required. Doing this as a background task would mean that we do not need to hold the XidGenLock in exclusive mode while we do this, which means that GetSnapshotData() and CommitTransaction() would also be less likely to block. Also, if any clog writes need to be performed when the page is moved forwards this would also be performed in the background.

3. Consider whether ProcArrayLock should use a new queued-shared lock mode that puts a maximum wait time on ExclusiveLock requests.

It would be fairly hard to implement this well as a timer, but it might be possible to place a limit on queue length, i.e. allow Share locks to be granted immediately if a Shared holder already exists, but only if there is a queue of no more than N exclusive mode requests queued. This might prevent the worst cases of exclusive lock starvation. (4) is a general concern that remains valid.

4. Since shared locks are currently queued behind exclusive requests when they cannot be immediately satisfied, it might be worth reconsidering the way LWLockRelease works also. When we wake up the queue we only wake the Shared requests that are adjacent to the head of the queue. Instead we could wake *all* waiting Shared requestors, e.g. with a lock queue like this:

    (HEAD) S-S-X-S-X-S-X-S

Currently we would wake the 1st and 2nd waiters only. If we were to wake the 3rd, 5th and 7th waiters also, then the queue would reduce in length very quickly, if we assume generally uniform service times. (If the head of the queue is X, then we wake only that one process and I'm not proposing we change that.) That would mean queue jumping, right? Well, that's what already happens in other circumstances, so there cannot be anything intrinsically wrong with allowing it; the only question is: would it help? We need not wake the whole queue, there may be some generally more beneficial heuristic. The reason for
[HACKERS] SET TRANSACTION not compliant with SQL:2003
The SQL:2003 standard definition of SET TRANSACTION differs in major ways from PostgreSQL's, which produces some interesting behaviour. We currently claim conformance, though this is not accurate. ...

<SQL2003> If a set transaction statement that does not specify LOCAL is executed, then Case: i) If an SQL-transaction is currently active, then an exception condition is raised: invalid transaction state — active SQL-transaction. </SQL2003>

...

<SQL2003> Case: a) If LOCAL is not specified, then let TXN be the next SQL-transaction for the SQL-agent. b) Otherwise, let TXN be the branch of the active SQL-transaction at the current SQL-connection. </SQL2003>

The standard behaviour is that SET TRANSACTION defines the mode used in the *next* transaction, not the current one. We should allow this meaning, since programs written to spec will act differently with the current implementation. We currently only change the *current* transaction. Executing within the current transaction is supposed to throw an error; that's probably too late to change, but the standard does give some clues for other errors.

Proposed changes:

1. Solo SET TRANSACTION statements produce no WARNING, nor do anything. This isn't the way the SQL:2003 standard specifies it should work. We should take the values from SET TRANSACTION and apply them to the *next* transaction:
- these will apply to the next TXN, unless specifically overridden during the START TRANSACTION command
- these values apply for one transaction only, after which we revert back to the session default.

2. Duplicate calls to SET TRANSACTION are allowed within a transaction.
=> Should be ERROR: Transaction mode already set.

postgres=# begin;
BEGIN
postgres=# set transaction read only;
SET
postgres=# set transaction read only;
SET
postgres=# commit;
COMMIT

3. Multiple conflicting calls to SET TRANSACTION are allowed within a transaction.
=> Should be ERROR: Transaction mode already set.

postgres=# begin;
BEGIN
postgres=# set transaction isolation level read committed;
SET
postgres=# set transaction isolation level serializable;
SET
postgres=# commit;
COMMIT

4. SET TRANSACTION can be called after a SAVEPOINT, i.e. it can be called in a subtransaction.
=> Should be ERROR: SET TRANSACTION must not be called in a subtransaction. (Calling SET TRANSACTION ISOLATION LEVEL already throws this error, so the change should be small.)

5. The standard uses the keyword LOCAL like this: SET LOCAL TRANSACTION ... which in this context means the part of a distributed (two-phase) commit on this database. We should accept, but ignore, this keyword.

-- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Hash index todo list item
On Sun, Sep 02, 2007 at 01:04:04PM -0500, Kenneth Marshall wrote:

Dear PostgreSQL Hackers:

After following the hackers mailing list for quite a while, I am going to start investigating what will need to be done to improve hash index performance. Below are the pieces of this project that I am currently considering:

1. Characterize the current hash index implementation against the BTree index, with a focus on space utilization and lookup performance against a collection of test data. This will give a baseline performance test to evaluate the impact of changes. I initially do not plan to bench the hash creation process since my initial focus will be on lookup performance.

Here are very basic results for a table with 1.6m entries:

postgres=# CREATE TABLE dict (word varchar(100));
CREATE TABLE
postgres=# COPY dict FROM '/tmp/words';
COPY 1648379
postgres=# select count(*) from dict;
  count
---------
 1648379
(1 row)
Time: 11187.418 ms
postgres=# select count(*) from dict;
  count
---------
 1648379
(1 row)
Time: 6040.912 ms
postgres=# CREATE INDEX wordhash ON dict USING hash (word);
CREATE INDEX
Time: 11108707.160 ms
postgres=# select * from dict where word = 'avatar';
  word
--------
 avatar
(1 row)
Time: 79.823 ms
postgres=# select * from dict where word = 'zebra';
 word
-------
 zebra
(1 row)
Time: 9.864 ms
postgres=# select * from dict where word = 'turkey';
  word
--------
 turkey
(1 row)
Time: 18.418 ms
Time: 1.045 ms
Time: 1.257 ms
Time: 1.080 ms
postgres=# CREATE INDEX wordbtree ON dict USING btree (word);
CREATE INDEX
Time: 25438.884 ms
postgres=# select * from dict where word = 'avatar';
  word
--------
 avatar
(1 row)
Time: 13.400 ms
postgres=# select * from dict where word = 'zebra';
 word
-------
 zebra
(1 row)
Time: 1.173 ms
postgres=# select * from dict where word = 'turkey';
  word
--------
 turkey
(1 row)
Time: 1.186 ms
Time: 1.103 ms
Time: 1.099 ms
Time: 1.108 ms

Size of table       =  87556096
Size of hash index  = 268451840
Size of btree index =  53510144

From my very small sample on an unloaded machine, a hash index lookup took the least amount of time. It had a much larger initial time which could be attributable to cache population effects. The size is 5X that of the Btree index. I will continue to improve the test suite as more granularity is needed. If anyone has a good data generator, please let me know. Otherwise I will just roll my own.

Regards,
Ken

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
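For anyone reproducing the space-utilization comparison, the three sizes quoted above can be read straight from the catalog; a minimal sketch, assuming the table and index names from the session above and PostgreSQL 8.1 or later for pg_relation_size:

```sql
-- On-disk size in bytes of the heap and each index
SELECT pg_relation_size('dict')      AS table_size,
       pg_relation_size('wordhash')  AS hash_index_size,
       pg_relation_size('wordbtree') AS btree_index_size;
```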
Re: [HACKERS] Final background writer cleanup for 8.3
On Wed, Sep 5, 2007 at 1:54 PM, in message [EMAIL PROTECTED], Greg Smith [EMAIL PROTECTED] wrote: On Wed, 5 Sep 2007, Josh Berkus wrote: While there certainly are some cases where we've heard about people whose workloads were such that the background writer worked successfully for them, I consider those lucky rather than normal. I'd like those people to test 8.3 because I'd hate to see the changes made to improve the general case cause a regression for them.

Being one of the lucky ones, I'm still hopeful that I'll be able to do these tests. I think I know how to tailor the load so that we see the problem often enough to get useful benchmarks (we tended to see the problem a few times per day in actual 24/7 production).

My plan would be to run 8.2.4 with the background writer turned off to establish a baseline. I think that any test, to be meaningful, would need to run for several hours, with the first half hour discarded as just being enough to establish the testing state. Then I would test our aggressive background writer settings under 8.2.4 to confirm that those settings do handle the problem in this test environment. Then I would test the new background writer with synchronous commits under the 8.3 beta, using various settings.

The 0.5, 0.7 and 0.9 settings you recommended for a test are how far from the LRU end of the cache to look for dirty pages to write, correct? Is there any upper bound, as long as I keep it below 1? Are the current shared memory and the 1 GB you suggested enough of a spread for these tests? (At several hours per test in order to get meaningful results, I don't want to get into too many permutations.)

Finally, I would try the new checkpoint techniques, with and without the new background writer. Any suggestions on where to set the knobs for those runs?
I'm inclined to think that it would be interesting to try the benchmarks with the backend writing any dirty page through to the OS at the same time it is written to the PostgreSQL cache, as a reference point at the opposite extreme from having the cache hold onto dirty pages for as long as possible before sharing them with the OS. Do you see any value in getting actual numbers for that?

This causes a bit of a problem for testing 8.3 in beta, because if you come from a world-view where the 8.2.4 background writer was never successful, it's hard to figure out a starting point for comparing it to the one in 8.3. In terms of comparing the new technique to the old, one would approach the new technique by turning off the all scan and setting the lru scan percentage to 50% or more, right? (I mean, obviously there would be more CPU time used as it scanned through clean pages repeatedly, but it would be a rough analogy otherwise, yes?)

-Kevin

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] [PATCHES] Lazy xid assignment V4
moving to -hackers since the patch is already in...

On Wednesday 05 September 2007 18:12, Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: Florian G. Pflug wrote: So, in essence, you get the old pg_locks format back by doing

select l1.*, l2.transactionid as transaction
from pg_locks l1, pg_locks l2
where l1.vxid = l2.vxid
  and l2.locktype = 'transaction'
  and l2.mode = 'exclusive'
  and l2.granted = true;

You'd want some sort of left join, no doubt, else you're not going to see transactions that have not got an XID. or make it a system view? That would be a bit silly. If there's actually still a use-case for the XID column then we should just put it back.

I agree, adding a system view to make up for removing a column seems like the wrong solution to me.

I don't actually see a reasonable use-case for it though. As Florian points out, you can get it if you really need it --- but that view is already annoyingly wide, and I'm not eager to burden it with columns that are usually useless.

I'm trying to decide how reasonable the use case is. We have transactions that run several hours long, often touching a number of tables, and I've used the transaction column to get a list of all of the relations a given transaction is touching. This makes the above solution more annoying by far, but I don't do it often, and I think I generally use the pid to get that information anyway; if I used prepared transactions I'm sure I'd feel more strongly about this.

Also, I still agree with Florian's earlier argument that we should deliberately break any code that's depending on the transaction column. Any such code is unlikely to be prepared for the column containing nulls.

I don't really buy this argument, insofar as the column can already be null, so apps already need a way to deal with that. I would agree that the behavior of the column has changed, so it might cause some odd behavior in apps that rely on it.
-- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] [PATCHES] Lazy xid assignment V4
Robert Treat [EMAIL PROTECTED] writes: I'm trying to decide how reasonable the use case is. We have transactions that run several hours long, often touching a number of tables, and I've used the transaction to get a list of all of the relations a given transaction is touching. This makes the above solution more annoying by far, but I don't do it often, and I think I generally use the pid to get that information anyway; if I used prepared transactions I'm sure I'd feel stronger about this. I don't see why you wouldn't start using the VXID for this purpose? Also, I still agree with Florian's earlier argument that we should deliberately break any code that's depending on the transaction column. Any such code is unlikely to be prepared for the column containing nulls. I don't buy this argument really only so far as the column can already be null, so apps already need a way to deal with that. No, it was not possible for the XID column to be null before. Up to now, if you didn't have an XID you weren't holding a lock. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] [PATCHES] Lazy xid assignment V4
On Wednesday 05 September 2007 18:40, Tom Lane wrote: Robert Treat [EMAIL PROTECTED] writes: I'm trying to decide how reasonable the use case is. We have transactions that run several hours long, often touching a number of tables, and I've used the transaction to get a list of all of the relations a given transaction is touching. This makes the above solution more annoying by far, but I don't do it often, and I think I generally use the pid to get that information anyway; if I used prepared transactions I'm sure I'd feel stronger about this. I don't see why you wouldn't start using the VXID for this purpose? I'm not sure either :-) Though, it would be nice to have an easy way to see which transactions actually modified tables. Again, I'm not sure the use case is reasonable, but it's there. If no one else feels strongly, let's document a query to mimic the old column and move on. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
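If such a query does get documented, Tom's left-join suggestion upthread might look like the following sketch (untested; the vxid join column and the 'transaction'/'exclusive' labels are as used earlier in this thread and may differ in the final view):

```sql
-- Old-style pg_locks with a "transaction" column; rows whose
-- transaction never acquired an XID show NULL instead of vanishing
SELECT l1.*, l2.transactionid AS transaction
FROM pg_locks l1
LEFT JOIN pg_locks l2
  ON  l1.vxid = l2.vxid
  AND l2.locktype = 'transaction'
  AND l2.mode = 'exclusive'
  AND l2.granted;
```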
Re: [HACKERS] Has anyone tried out the PL/pgSQL debugger?
Hi Dave,

On Sep 5, 2007, at 3:54 AM, Dave Page wrote: That's odd - I cannot reproduce that on OS X using beta 4 (which has no important changes in the debugger over beta 3). Can you provide a simple test case?

I get the same error with this:

create or replace function debug_test(out t text, out i integer)
returns record as $$
begin
    t := 'test 1';
    i := 10;
    return;
end;
$$ language plpgsql;

I did the following:

1. Right-clicked the function and chose Debug from the Debugging submenu.
2. Clicked the OK button on the dialog.

Best,

John

John DeSoi, Ph.D.
http://pgedit.com/ Power Tools for PostgreSQL

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
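For comparison, calling a function that has only OUT parameters by hand still requires an empty argument list, which is presumably what the debugger should be generating here (a sketch using the debug_test function above):

```sql
-- OUT parameters come back as result columns; note the required ()
SELECT * FROM debug_test();
```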
[HACKERS] Just-in-time Background Writer Patch+Test Results
Tom gets credit for naming the attached patch, which is my latest attempt to finalize what has been called the "Automatic adjustment of bgwriter_lru_maxpages" patch for 8.3; that's not what it does anymore, but that's where it started.

Background on testing -

I decided to use pgbench for running my tests. The scripting framework to collect all that data and usefully summarize it is now available as pgbench-tools-0.2 at http://www.westnet.com/~gsmith/content/postgresql/pgbench-tools.htm I hope to expand and actually document use of pgbench-tools in the future but didn't want to hold the rest of this up on that work. That page includes basic information about what my testing environment was and why I felt this was an appropriate way to test background writer efficiency. Quite a bit of raw data for all of the test sets summarized here is at http://www.westnet.com/~gsmith/content/bgwriter/

The patches attached to this message are also available at:
http://www.westnet.com/~gsmith/content/postgresql/buf-alloc-2.patch
http://www.westnet.com/~gsmith/content/postgresql/jit-cleaner.patch

(This is my second attempt to send this message; I don't know why the earlier one failed. I'm using gzip'd patches for this one, and hopefully there won't be a dupe.)

Baseline test results -

The first patch to apply attached to this message is the latest buf-alloc-2, which adds counters to pgstat_bgwriter for everything the background writer is doing.
Here's what we get out of the standard 8.3 background writer before and after applying that patch, at various settings:

 info                          | set | tps  | cleaner_pct
-------------------------------+-----+------+-------------
 HEAD nobgwriter               |   5 |  994 |
 HEAD+buf-alloc-2 nobgwriter   |   6 | 1012 |        0
 HEAD+buf-alloc-2 LRU=0.5%/500 |  16 |  974 |    15.94
 HEAD+buf-alloc-2 LRU=5%/500   |  19 |  983 |    98.47
 HEAD+buf-alloc-2 LRU=10%/500  |   7 |  997 |    99.95

cleaner_pct is what percentage of the writes the BGW LRU cleaner did relative to a total that includes the client backend writes; writes done by checkpoints are not included in this summary computation, it just shows the balance of backend vs. BGW writes. The /500 means bgwriter_lru_maxpages=500, which I already knew was about as many pages as this server ever dirties in a 200ms cycle. Without the buf-alloc-2 patch I don't get statistics on the LRU cleaner; I include that number as a baseline just to suggest that the buf-alloc-2 patch itself isn't pulling down results.

Here we see that in order to get most of the writes to happen via the LRU cleaner rather than having the backends handle them, you'd need to play with the settings until bgwriter_lru_percent was somewhere between 5% and 10%, and it seems obvious that doing this doesn't improve the TPS results. The margin of error here is big enough that I consider all these basically the same performance. The question then is how to get this high level of writes by the background writer automatically, without having to know what percentage to scan; I wanted to remove bgwriter_lru_percent, while still keeping bgwriter_lru_maxpages strictly as a way to throttle overall BGW activity.

First JIT Implementation -

The method I described in my last message on this topic ( http://archives.postgresql.org/pgsql-hackers/2007-08/msg00887.php ) implemented a weighted moving average of how many pages were allocated, and based on feedback from that I improved the code to allow a multiplier factor on top of that.
Here's the summary of those results:

 info                           | set | tps  | cleaner_pct
--------------------------------+-----+------+-------------
 jit cleaner multiplier=1.0/500 |   9 |  981 |    94.3
 jit cleaner multiplier=2.0/500 |   8 | 1005 |    99.78
 jit multiplier=1.0/100         |  10 |  985 |    68.14

That's pretty good. As long as maxpages is set intelligently, it gets most of the writes even with the multiplier of 1.0, and cranking it up to the 2.0 suggested by the original Itagaki Takahiro patch gets nearly all of them. Again, there's really no performance change here in throughput by any of this.

Coping with idle periods -

While I was basically happy with these results, the data Kevin Grittner submitted in response to my last call for commentary left me concerned. While the JIT approach works fine as long as your system is active, it does absolutely nothing if the system is idle. I noticed that a lot of the writes that were being done by the client backends were after idle periods where the JIT writer just didn't react fast enough during the ramp-up. For example, if the system went from idle for a while to full-speed just as the 200ms sleep started, by the time the BGW woke up again the backends
Re: [HACKERS] [PATCHES] Lazy xid assignment V4
Robert Treat [EMAIL PROTECTED] writes: On Wednesday 05 September 2007 18:40, Tom Lane wrote: I don't see why you wouldn't start using the VXID for this purpose? I'm not sure either :-) Though, it would be nice to have an easy way to see which transactions actually modified tables. Moving the goal posts, aren't we? It was not possible to find that out at all from the pg_locks view before. (Well, you could guess based on the type of table locks held, but you were only guessing.) As of CVS HEAD you *can* determine that from pg_locks, to high probability anyway, by looking to see which VXIDs have transaction IDs locked. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
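The check Tom describes, seeing which VXIDs have transaction IDs locked, can itself be written as a pg_locks query; a hedged sketch, using the column and lock-mode labels from earlier in this thread (they may differ in the released view):

```sql
-- VXIDs holding an exclusive lock on a transaction ID, i.e. the
-- transactions that have, to high probability, written something
SELECT DISTINCT l.vxid
FROM pg_locks l
WHERE l.locktype = 'transaction'
  AND l.mode = 'exclusive'
  AND l.granted;
```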