Re: [HACKERS] XID wraparound and busy databases
On Wed, 2007-08-15 at 12:49 -0400, Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: Maybe we can do something to reduce the xid consumption? For example, reuse xids for read-only queries. Hmm, that's an idea. That would be a very useful idea for additional scalability and it does sound like a good idea. We kicked around the idea of reusing snapshots previously also, as a way of reducing access to the ProcArray. That idea had some problems, but if your workload was all read-only and we reused xids then everybody would have an identical snapshot, so reusing it would make lots of sense. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] XID wraparound and busy databases
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Is enlarging the xid field something we should consider for 8.4? No. We just got the tuple header down to 24 bytes, we are not going to give that back and then some. If you are processing 6K transactions per second, you can afford to vacuum every couple days... and probably need to vacuum much more often than that anyway, to avoid table bloat. Possibly your respondent should think about trying to do more than one thing per transaction? I'm wondering how many of those 6k xacts/second are actually modifying data. If a large percentage of those are readonly queries, than the need for vacuuming could be reduced if postgres assigned an xid only if that xid really hits the disk. Otherwise (for purely select-type queries) it could use some special xid value. This is what I'm doing in my Readonly-Queries-On-PITR-Slave patch. greetings, Florian Pflug ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] XID wraparound and busy databases
Added to TODO: * Reduce XID consumption of read-only queries http://archives.postgresql.org/pgsql-hackers/2007-08/msg00516.php --- Florian G. Pflug wrote: Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Is enlarging the xid field something we should consider for 8.4? No. We just got the tuple header down to 24 bytes, we are not going to give that back and then some. If you are processing 6K transactions per second, you can afford to vacuum every couple days... and probably need to vacuum much more often than that anyway, to avoid table bloat. Possibly your respondent should think about trying to do more than one thing per transaction? I'm wondering how many of those 6k xacts/second are actually modifying data. If a large percentage of those are readonly queries, than the need for vacuuming could be reduced if postgres assigned an xid only if that xid really hits the disk. Otherwise (for purely select-type queries) it could use some special xid value. This is what I'm doing in my Readonly-Queries-On-PITR-Slave patch. greetings, Florian Pflug -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] XID wraparound and busy databases
On Thu, Aug 16, 2007 at 04:13:55PM +0200, Florian G. Pflug wrote: Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Is enlarging the xid field something we should consider for 8.4? No. We just got the tuple header down to 24 bytes, we are not going to give that back and then some. If you are processing 6K transactions per second, you can afford to vacuum every couple days... and probably need to vacuum much more often than that anyway, to avoid table bloat. Possibly your respondent should think about trying to do more than one thing per transaction? I'm wondering how many of those 6k xacts/second are actually modifying data. If a large percentage of those are readonly queries, than the need for vacuuming could be reduced if postgres assigned an xid only if that xid really hits the disk. Otherwise (for purely select-type queries) it could use some special xid value. This is what I'm doing in my Readonly-Queries-On-PITR-Slave patch. Bruce talked to me and others at Hi5. We're the ones with this high workload, and yes, the traffic is predominantly SELECTs. Our application issues parallel SELECT queries to 32 partitioned tables in 32 separate threads. For our purposes this is much, much faster than a UNION view or table inheritance. These tables are partitioned on one key, but we also need to search on other indexed keys, thus we use what we call a 'broadcast query'. We'd be happy with better faster vacuums or limits on xids (or both!) -- Paul Lindner| | | | | | | | | | [EMAIL PROTECTED] pgpD0OBStu40z.pgp Description: PGP signature
[HACKERS] XID wraparound and busy databases
I was talking to someone at LinuxWorld and they mentioned they often have activity of 6k SELECTs per second, and that they were needing to autovacuum every few days because of xid wraparound. I did some calculations and found that: 6000 * 60 * 60 * 24 51840 or 500 million xids per day, confirming they would need the autovacuum to run every few days. Is enlarging the xid field something we should consider for 8.4? Is the autovacuum I/O less then the extra I/O needed for an expanded xid fields on every row? -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] XID wraparound and busy databases
Bruce Momjian [EMAIL PROTECTED] writes: Is enlarging the xid field something we should consider for 8.4? No. We just got the tuple header down to 24 bytes, we are not going to give that back and then some. If you are processing 6K transactions per second, you can afford to vacuum every couple days... and probably need to vacuum much more often than that anyway, to avoid table bloat. Possibly your respondent should think about trying to do more than one thing per transaction? regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] XID wraparound and busy databases
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Is enlarging the xid field something we should consider for 8.4? No. We just got the tuple header down to 24 bytes, we are not going to give that back and then some. If you are processing 6K transactions per second, you can afford to vacuum every couple days... and probably need to vacuum much more often than that anyway, to avoid table bloat. Possibly your respondent should think about trying to do more than one thing per transaction? OK, yea, I think that makes sense. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] XID wraparound and busy databases
Bruce Momjian wrote: Is enlarging the xid field something we should consider for 8.4? Is the autovacuum I/O less then the extra I/O needed for an expanded xid fields on every row? I doubt that's going to happen... Maybe we can do something to reduce the xid consumption? For example, reuse xids for read-only queries. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] XID wraparound and busy databases
Heikki Linnakangas [EMAIL PROTECTED] writes: Maybe we can do something to reduce the xid consumption? For example, reuse xids for read-only queries. Hmm, that's an idea. More simply, just keep the current transaction open (resetting everything but the XID) if we have made no changes by the time we're told to commit or rollback ... which is something we track already, so as not to waste cycles on useless commit XLOG records. You'd want some upper limit on transaction lifetime, so as to avoid the long lived transactions hurt VACUUM problem, but even reusing a single xact for a few seconds would pretty much eliminate this issue, I bet. It's hard to see how anyone could be doing 6K xacts/sec unless most are read-only. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] XID wraparound and busy databases
On Wed, Aug 15, 2007 at 12:49:52PM -0400, Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: Maybe we can do something to reduce the xid consumption? For example, reuse xids for read-only queries. Hmm, that's an idea. More simply, just keep the current transaction open (resetting everything but the XID) if we have made no changes by the time we're told to commit or rollback ... which is something we track already, so as not to waste cycles on useless commit XLOG records. You'd want some upper limit on transaction lifetime, so as to avoid the long lived transactions hurt VACUUM problem, but even reusing a single xact for a few seconds would pretty much eliminate this issue, I bet. It's hard to see how anyone could be doing 6K xacts/sec unless most are read-only. Aren't there potential issues with keeping the same XID if a transaction in the middle has modified data? IE: SELECT * FROM a DELETE FROM a ... SELECT * FROM a I'm thinking that in any system that's doing an enormous transaction rate, most will be read-only, which means there'd still be a lot of benefit to simply using one transaction until the next write transaction came along. Something else to think about... any app that's doing that kind of transaction rate is likely going to have a large number of backends, so it would be even better if one XID could be shared across backends. -- Decibel!, aka Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) pgp8a0Ey4UZrF.pgp Description: PGP signature
Re: [HACKERS] XID wraparound and busy databases
Decibel! [EMAIL PROTECTED] writes: Aren't there potential issues with keeping the same XID if a transaction in the middle has modified data? I don't see any, as long as you take a new snapshot. Something else to think about... any app that's doing that kind of transaction rate is likely going to have a large number of backends, so it would be even better if one XID could be shared across backends. Not sane in the least. What happens if two of them start to modify data, and then one commits and one rolls back? In any case, if we can cut the xact rate to one every few seconds per backend, the problem goes away. Actually ... an idea that might be worth investigating is to do something similar to what we already did for subtransactions: don't assign an XID at all until the transaction makes a data change. I think that the main reason for assigning an XID earlier is just that the ProcArray routines ignore backends that are not currently showing an active XID when they figure global XMIN. But a backend could probably advertise an XMIN, indicating the age of the oldest active snapshot it's got, without advertising an active XID. (This could also tie into the idea we discussed awhile back of tracking minimum XMIN better by maintaining a central list of open snapshots within each backend. Perhaps that should be totally decoupled from whether we are advertising an active XID ...) regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] XID wraparound and busy databases
On Wednesday 15 August 2007 09:49:52 Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: Maybe we can do something to reduce the xid consumption? For example, reuse xids for read-only queries. Hmm, that's an idea. More simply, just keep the current transaction open (resetting everything but the XID) if we have made no changes by the time we're told to commit or rollback ... which is something we track already, so as not to waste cycles on useless commit XLOG records. Jan and myself were discussing something like this as it would relate to a subscribe process for slony. Jan care to summerize your thoughts on this? You'd want some upper limit on transaction lifetime, so as to avoid the long lived transactions hurt VACUUM problem, but even reusing a single xact for a few seconds would pretty much eliminate this issue, I bet. It's hard to see how anyone could be doing 6K xacts/sec unless most are read-only. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] XID wraparound and busy databases
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Is enlarging the xid field something we should consider for 8.4? No. We just got the tuple header down to 24 bytes, we are not going to give that back and then some. If you are processing 6K transactions per second, you can afford to vacuum every couple days... and probably need to vacuum much more often than that anyway, to avoid table bloat. Speaking of vacuum, hopefully we'll get some sort of dead space map in 8.4. If we keep track of frozen pages there, vacuuming to avoid xid wraparound will be much cheaper. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] XID wraparound and busy databases
Tom Lane írta: It's hard to see how anyone could be doing 6K xacts/sec unless most are read-only. regards, tom lane In a recent stress test with our PostgreSQL-based cluster between two machines 3 million transaction were performed with pgbench -c 150 -t 2 -s 200 in about _ten hours_. The primary machine (desktop-level machine for development) used a real disk, the secondary used tmpfs as PGDATA. Say whatever you want about my disk lying about flush, its 75MB/sec transfer rate transfer rate is real. So 5 million real transaction in 24 hours is not unrealistic. -- -- Zoltán Böszörményi Cybertec Geschwinde Schönig GmbH http://www.postgresql.at/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] XID wraparound and busy databases
Heikki Linnakangas írta: Zoltan Boszormenyi wrote: Tom Lane írta: It's hard to see how anyone could be doing 6K xacts/sec unless most are read-only. In a recent stress test with our PostgreSQL-based cluster between two machines 3 million transaction were performed with pgbench -c 150 -t 2 -s 200 in about _ten hours_. The primary machine (desktop-level machine for development) used a real disk, the secondary used tmpfs as PGDATA. Say whatever you want about my disk lying about flush, its 75MB/sec transfer rate transfer rate is real. So 5 million real transaction in 24 hours is not unrealistic. 6k xacts / s is five *hundred* million transactions, not five million... Blush. :-) You're right. However a single machine with ramdisk is able to do that. -- -- Zoltán Böszörményi Cybertec Geschwinde Schönig GmbH http://www.postgresql.at/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] XID wraparound and busy databases
Zoltan Boszormenyi wrote: Tom Lane írta: It's hard to see how anyone could be doing 6K xacts/sec unless most are read-only. In a recent stress test with our PostgreSQL-based cluster between two machines 3 million transaction were performed with pgbench -c 150 -t 2 -s 200 in about _ten hours_. The primary machine (desktop-level machine for development) used a real disk, the secondary used tmpfs as PGDATA. Say whatever you want about my disk lying about flush, its 75MB/sec transfer rate transfer rate is real. So 5 million real transaction in 24 hours is not unrealistic. 6k xacts / s is five *hundred* million transactions, not five million... -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] XID wraparound and busy databases
Zoltan Boszormenyi wrote: used a real disk, the secondary used tmpfs as PGDATA. Say whatever you want about my disk lying about flush, its 75MB/sec transfer rate transfer rate is real. So 5 million real transaction in 24 hours is not unrealistic. 6k xacts / s is five *hundred* million transactions, not five million... Blush. :-) You're right. However a single machine with ramdisk is able to do that. A battery-backed hard disk controller card can do that too. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] XID wraparound and busy databases
On Wednesday 15 August 2007 13:54, Tom Lane wrote: Decibel! [EMAIL PROTECTED] writes: Aren't there potential issues with keeping the same XID if a transaction in the middle has modified data? I don't see any, as long as you take a new snapshot. I'm a little confused, wouldnt the transaction that waits 30 minutes before modifying data need to get an XID that jives with the system when it's transaction started, not when it began manipulating data? Would it really be safe to take a new snapshot at that time, istm concurrent writers might have caused potential issues by that point. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] XID wraparound and busy databases
Robert Treat [EMAIL PROTECTED] writes: I'm a little confused, wouldnt the transaction that waits 30 minutes before modifying data need to get an XID that jives with the system when it's transaction started, not when it began manipulating data? Why? Would it really be safe to take a new snapshot at that time, You wouldn't take a new snapshot. The thought that occurs to me is that there's no reason that a transaction has to have an XID for itself before it takes a snapshot. We always special-case our own XID anyway. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] XID wraparound and busy databases
On Wed, Aug 15, 2007 at 5:06 PM, in message [EMAIL PROTECTED], Tom Lane [EMAIL PROTECTED] wrote: Robert Treat [EMAIL PROTECTED] writes: I'm a little confused, wouldnt the transaction that waits 30 minutes before modifying data need to get an XID that jives with the system when it's transaction started, not when it began manipulating data? Why? Would it really be safe to take a new snapshot at that time, You wouldn't take a new snapshot. The thought that occurs to me is that there's no reason that a transaction has to have an XID for itself before it takes a snapshot. We always special-case our own XID anyway. I'm having trouble picturing how that would work with a transaction using the SERIALIZABLE transaction isolation level, or would this just be done at the READ COMMITTED level? -Kevin ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] XID wraparound and busy databases
Kevin Grittner [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] wrote: You wouldn't take a new snapshot. The thought that occurs to me is that there's no reason that a transaction has to have an XID for itself before it takes a snapshot. We always special-case our own XID anyway. I'm having trouble picturing how that would work with a transaction using the SERIALIZABLE transaction isolation level, Why? You take a snapshot, you use it. If you later need to allocate an XID for yourself, you do that --- your own XID is surely uncommitted in any case, so this doesn't affect the validity of the snapshot. The bottom line here is that we need own-XID-if-any to be = snapshot xmin, but there's no obvious reason why it has to be snapshot xmax. This is, in fact, *not* the case for subtransaction XIDs, and I see no fundamental reason why it need be true for the top transaction XID. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq