Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Luke Lonergan
Csaba, Regularly clustering is out of question as it would render the system unusable for hours. There's no 0 activity hour we could use for such stuff. There's always something happening, only the overall load is smaller at night... We are planning to implement a btree organized table,

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Mark Woodward
The example is a very active web site, the flow is this: query for session information process HTTP request update session information This happens for EVERY http request. Chances are that you won't have concurrent requests for the same row, but you may have well over 100 HTTP server

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Mark Woodward
I suppose you have a table memberships (user_id, group_id) or something like it ; it should have as few columns as possible ; then try regularly clustering on group_id (maybe once a week) so that all the records for a particular group are close together. Getting the members of a group to

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Csaba Nagy
Let me ask a question, you have this hundred million row table. OK, how much of that table is read/write? Would it be posible to divide the table into two (or more) tables where one is basically static, only infrequent inserts and deletes, and the other is highly updated? Well, all of it is

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Zeugswetter Andreas DCP SD
back and forth the data between an archive table and the live table, based on how active the groups are, I can't imagine any other way of partitioning it. And that would also mean some quite big load given the pretty high dynamics of the groups. You said the activity comes in bursts per

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Mark Woodward
Let me ask a question, you have this hundred million row table. OK, how much of that table is read/write? Would it be posible to divide the table into two (or more) tables where one is basically static, only infrequent inserts and deletes, and the other is highly updated? Well, all of it is

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Csaba Nagy
You said the activity comes in bursts per group, so the obvious partitioning would be per group. If you have too many groups to have one partition per group you could try to find some modulo or other rule to spread them into separate partitions. This could be a solution... but then I'm not

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Csaba Nagy
Well, the only thing left is to cluster the database. There are a couple ways to do this, one switch to a platform that supports clustering or create an API to wrap multiple databases. If your queries are simple and limited, you could create an HTTP/XML service that wraps a number of

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Hannu Krosing
Ühel kenal päeval, N, 2006-06-22 kell 12:41, kirjutas Mark Woodward: Depending on exact details and optimisations done, this can be either slower or faster than postgresql's way, but they still need to do something to get transactional visibility rules implemented. I think they have a

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Martijn van Oosterhout
On Fri, Jun 23, 2006 at 03:30:45PM +0200, Csaba Nagy wrote: It sounds like you have a big problem and you need a big solution. Well, Postgres does a decent job as it is. The problem is under peek load, sometimes it gets bogged down and the usual things like vacuum will not help immediately.

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Alvaro Herrera
Csaba Nagy wrote: This could be a solution... but then I'm not sure how well would do queries which need the first 10 records based on some criteria which does not include the group id. I guess limit queries across the union of the partitions don't work too well for now, and we do have such

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Jonah H. Harris
On 6/22/06, Alvaro Herrera [EMAIL PROTECTED] wrote: Not in all systems. A few now perform in-memory UNDO and only write it to disk if and when it is required. How does that work? If the last transaction is not finished after it wrote the tuple when the power goes out, and the UNDO is not

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Mark Woodward
Ühel kenal päeval, N, 2006-06-22 kell 12:41, kirjutas Mark Woodward: Depending on exact details and optimisations done, this can be either slower or faster than postgresql's way, but they still need to do something to get transactional visibility rules implemented. I think they have a

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Alvaro Herrera
Mark Woodward wrote: In case of the number of actively modified rows being in only tens or low hundreds of thousands of rows, (i.e. the modified set fits in memory) the continuous vacuum process shows up as just another backend, not really taking order of magnitude more resources. It

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread A.M.
On Fri, June 23, 2006 9:56 am, Martijn van Oosterhout wrote: On Fri, Jun 23, 2006 at 03:30:45PM +0200, Csaba Nagy wrote: It sounds like you have a big problem and you need a big solution. Well, Postgres does a decent job as it is. The problem is under peek load, sometimes it gets bogged

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Mark Woodward
Mark Woodward wrote: In case of the number of actively modified rows being in only tens or low hundreds of thousands of rows, (i.e. the modified set fits in memory) the continuous vacuum process shows up as just another backend, not really taking order of magnitude more resources. It

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Csaba Nagy
These sort of solutions, IMHO, don't show how good PostgreSQL is, but show where it is very lacking. We all know Postgres is lacking; some of us try to improve it (some with more success than others). People who know the current limitations but like the capabilities, try to find

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Zeugswetter Andreas DCP SD
This could be a solution... but then I'm not sure how well would do queries which need the first 10 records based on some criteria which does not include the group id. I guess limit queries across the union of the partitions don't work too well for now, and we do have such queries.

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Hannu Krosing
Ühel kenal päeval, N, 2006-06-22 kell 13:49, kirjutas Mark Woodward: Christopher Browne [EMAIL PROTECTED] writes: Basically there's no free lunch: if you want the benefits of MVCC it's going to cost you somewhere. In the Postgres design you pay by having to do VACUUM pretty often for

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Csaba Nagy
First of all, I base my assumptions on what I recall to have read on this list, as I didn't try yet partitioning using inheritance. It's not trivial to set up and I didn't have the time to play with it yet. So I wouldn't know for sure that it won't work fine with our application, and that will

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread mark
On Fri, Jun 23, 2006 at 10:24:06AM -0400, Alvaro Herrera wrote: Mark Woodward wrote: In case of the number of actively modified rows being in only tens or low hundreds of thousands of rows, (i.e. the modified set fits in memory) the continuous vacuum process shows up as just another

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Hannu Krosing
Ühel kenal päeval, R, 2006-06-23 kell 10:30, kirjutas Mark Woodward: What is interesting is setting up the server so that you can service your loads comfortably. Running the server at 100% lead is not anything you want to do on production server. There will be things you need to do

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Csaba Nagy
Usually it gets really bad if you *don't* run vacuum continuously, maybe hopeing to do it in slower times at night. For high-update db you have to run it continuously, maybe having some 5-15 sec pauses between runs. And how much I/O does this take? Surprisingly its mostly WAL

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Tom Lane
Csaba Nagy [EMAIL PROTECTED] writes: Surprisingly its mostly WAL traffic, the heap/index pages themselves are often not yet synced to disk by time of vacuum, so no additional traffic there. If you had made 5 updates per page and then vacuum it, then you make effectively 1 extra WAL write

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Mark Woodward
Bottom line: there's still lots of low-hanging fruit. Why are people feeling that we need to abandon or massively complicate our basic architecture to make progress? regards, tom lane I, for one, see a particularly nasty unscalable behavior in the implementation of

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread David Fetter
On Fri, Jun 23, 2006 at 02:30:29PM -0400, Mark Woodward wrote: Bottom line: there's still lots of low-hanging fruit. Why are people feeling that we need to abandon or massively complicate our basic architecture to make progress? regards, tom lane I, for one, see

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Jonah H. Harris
On 6/23/06, Mark Woodward [EMAIL PROTECTED] wrote: I, for one, see a particularly nasty unscalable behavior in the implementation of MVCC with regards to updates. I think this is a fairly common acceptance. The overhead required to perform an UPDATE in PostgreSQL is pretty heavy. Actually,

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Jochem van Dieten
On 6/23/06, Mark Woodward wrote: For each update to a row additional work needs to be done to access that row. Surely a better strategy can be done, especially considering that the problem being solved is a brief one. The only reason why you need previous versions of a row is for transactions

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Bruce Momjian
Tom Lane wrote: If you're doing heavy updates of a big table then it's likely to end up visiting most of the table anyway, no? There is talk of keeping a map of dirty pages, but I think it'd be a win for infrequently-updated tables, not ones that need constant vacuuming. I think a lot of

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Florian Weimer
* Gavin Sherry: Tom Lane's MVCC talk: http://www.postgresql.org/files/developer/transactions.pdf Is this still up-to-date with regard to to partial page writes? I hope that has been fixed (like the fsync issue). ---(end of broadcast)--- TIP 9:

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Mark Woodward
On 6/23/06, Mark Woodward [EMAIL PROTECTED] wrote: I, for one, see a particularly nasty unscalable behavior in the implementation of MVCC with regards to updates. I think this is a fairly common acceptance. The overhead required to perform an UPDATE in PostgreSQL is pretty heavy.

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes: I think at some point we have to admit that _polling_ the tables, which is what autovacuum does, just isn't going to work well, no matter how much it is tweeked, and another approach should be considered for certain workload cases. Autovacuum polls in

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Bruce Momjian
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: I think at some point we have to admit that _polling_ the tables, which is what autovacuum does, just isn't going to work well, no matter how much it is tweeked, and another approach should be considered for certain workload cases.

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Hannu Krosing
Ühel kenal päeval, R, 2006-06-23 kell 18:05, kirjutas Csaba Nagy: Usually it gets really bad if you *don't* run vacuum continuously, maybe hopeing to do it in slower times at night. For high-update db you have to run it continuously, maybe having some 5-15 sec pauses between runs.

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Jonah H. Harris
On 6/23/06, Mark Woodward [EMAIL PROTECTED] wrote: Rather than keep references to all versions of all rows in indexes, keep only a reference to the first or key row of each row, and have the first version of a row form the head of a linked list to subsequent versions of each row. The list will

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Mark Woodward
Tom Lane wrote: If you're doing heavy updates of a big table then it's likely to end up visiting most of the table anyway, no? There is talk of keeping a map of dirty pages, but I think it'd be a win for infrequently-updated tables, not ones that need constant vacuuming. I think a lot of

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Jonah H. Harris
On 6/23/06, Mark Woodward [EMAIL PROTECTED] wrote: Vacuum is findimentally inefficient The theory of database cleanup (VACUUM) is good, but has nothing to do with the overhead we currently encounter in performing an update. I have been ranting about a first row strategy, one where the first

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Stefan Kaltenbrunner
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: I think at some point we have to admit that _polling_ the tables, which is what autovacuum does, just isn't going to work well, no matter how much it is tweeked, and another approach should be considered for certain workload cases.

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Jonah H. Harris
On 6/23/06, Tom Lane [EMAIL PROTECTED] wrote: What I see in this discussion is a huge amount of the grass must be greener on the other side syndrome, and hardly any recognition that every technique has its downsides and complications. I'm being totally objective. I don't think we should

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Bruce Momjian
Jonah H. Harris wrote: On 6/23/06, Tom Lane [EMAIL PROTECTED] wrote: What I see in this discussion is a huge amount of the grass must be greener on the other side syndrome, and hardly any recognition that every technique has its downsides and complications. I'm being totally objective.

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Rick Gigger
Just out of curiosity Mark, didn't you write your session daemon so that you don't have to put sessions in postgres anymore? Or are you just giving that as an example of a very wide, very heavily updated table? My session tables have been an extreme case of this problem, but no other

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Rick Gigger
On Jun 22, 2006, at 10:33 AM, Mark Woodward wrote: Ühel kenal päeval, N, 2006-06-22 kell 09:59, kirjutas Mark Woodward: After a long battle with technology, [EMAIL PROTECTED] (Mark Woodward), an earthling, wrote: Clinging to sanity, [EMAIL PROTECTED] (Mark Woodward) mumbled into It

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Rick Gigger
On Jun 22, 2006, at 2:36 PM, Mark Woodward wrote: What you seem not to grasp at this point is a large web-farm, about 10 or more servers running PHP, Java, ASP, or even perl. The database is usually the most convenient and, aside from the particular issue we are talking about, best

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread mark
On Fri, Jun 23, 2006 at 03:08:34PM -0400, Bruce Momjian wrote: Tom Lane wrote: ... suggesting. We're having a hard enough time debugging and optimizing *one* storage model. I think the correct path forward is to stick with the same basic storage model and vacuuming concept, and address

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Mark Woodward
Just out of curiosity Mark, didn't you write your session daemon so that you don't have to put sessions in postgres anymore? The original project started as a shared key/value system for a beowulf cluster in the late 90s, but got reworked to be a session handler for PHP when I worked with Stig,

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Mark Woodward
Clinging to sanity, [EMAIL PROTECTED] (Mark Woodward) mumbled into her beard: We all know that PostgreSQL suffers performance problems when rows are updated frequently prior to a vacuum. The most serious example can be seen by using PostgreSQL as a session handler for a busy we site. You may

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Zeugswetter Andreas DCP SD
Each time the record is updated, a new version is created, thus lengthening the correct version search each time row is accessed, until, of course, the next vacuum comes along and corrects the index to point to the latest version of the record. Is that a fair explanation? No,

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Christopher Browne
After a long battle with technology, [EMAIL PROTECTED] (Mark Woodward), an earthling, wrote: Clinging to sanity, [EMAIL PROTECTED] (Mark Woodward) mumbled into her beard: We all know that PostgreSQL suffers performance problems when rows are updated frequently prior to a vacuum. The most

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Mark Woodward
After a long battle with technology, [EMAIL PROTECTED] (Mark Woodward), an earthling, wrote: Clinging to sanity, [EMAIL PROTECTED] (Mark Woodward) mumbled into her beard: [snip] 1. The index points to all the versions, until they get vacuumed out. It can't point to all versions, it points

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Alvaro Herrera
Mark Woodward wrote: Hmm, OK, then the problem is more serious than I suspected. This means that every index on a row has to be updated on every transaction that modifies that row. Is that correct? Add an index entry, yes. I am attaching some code that shows the problem with regard to

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Jonah H. Harris
On 6/22/06, Alvaro Herrera [EMAIL PROTECTED] wrote: Hmm, OK, then the problem is more serious than I suspected. This means that every index on a row has to be updated on every transaction that modifies that row. Is that correct? Add an index entry, yes. Again, this is a case for

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Csaba Nagy
[...] There has to be a more linear way of handling this scenario. So vacuum the table often. Good advice, except if the table is huge :-) Here we have for example some tables which are frequently updated but contain 100 million rows. Vacuuming that takes hours. And the dead row

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Mario Weilguni
Am Donnerstag, 22. Juni 2006 16:09 schrieb Csaba Nagy: [...] There has to be a more linear way of handling this scenario. So vacuum the table often. Good advice, except if the table is huge :-) Here we have for example some tables which are frequently updated but contain 100 million

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Hannu Krosing
Ühel kenal päeval, N, 2006-06-22 kell 09:59, kirjutas Mark Woodward: After a long battle with technology, [EMAIL PROTECTED] (Mark Woodward), an earthling, wrote: Clinging to sanity, [EMAIL PROTECTED] (Mark Woodward) mumbled into It pointed to *ALL* the versions. Hmm, OK, then the

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Hannu Krosing
Ühel kenal päeval, N, 2006-06-22 kell 10:20, kirjutas Jonah H. Harris: On 6/22/06, Alvaro Herrera [EMAIL PROTECTED] wrote: Hmm, OK, then the problem is more serious than I suspected. This means that every index on a row has to be updated on every transaction that modifies that row. Is

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Mark Woodward
Ühel kenal päeval, N, 2006-06-22 kell 09:59, kirjutas Mark Woodward: After a long battle with technology, [EMAIL PROTECTED] (Mark Woodward), an earthling, wrote: Clinging to sanity, [EMAIL PROTECTED] (Mark Woodward) mumbled into It pointed to *ALL* the versions. Hmm, OK, then the

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Jonah H. Harris
On 6/22/06, Hannu Krosing [EMAIL PROTECTED] wrote: I guess that MySQL on its original storage does that, but they allow only one concurrent update per table and no transactions. More like practically every commercial database. As ~97% of transactions commit (yes, some can argue that number),

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Mark Woodward
Ühel kenal päeval, N, 2006-06-22 kell 10:20, kirjutas Jonah H. Harris: On 6/22/06, Alvaro Herrera [EMAIL PROTECTED] wrote: Hmm, OK, then the problem is more serious than I suspected. This means that every index on a row has to be updated on every transaction that modifies that row. Is

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Tom Lane
Christopher Browne [EMAIL PROTECTED] writes: After a long battle with technology, [EMAIL PROTECTED] (Mark Woodward), an earthling, wrote: Not true. Oracle does not seem to exhibit this problem. Oracle suffers a problem in this regard that PostgreSQL doesn't; in Oracle, rollbacks are quite

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Jochem van Dieten
On 6/22/06, Mark Woodward wrote: (..) thousand active sessions (..) If an active user causes a session update once a second (..) Generally speaking, sessions aren't updated when they change, they are usually updated per HTTP request. The data in a session may not change, but the session

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Lukas Smith
Tom Lane wrote: Basically there's no free lunch: if you want the benefits of MVCC it's going to cost you somewhere. In the Postgres design you pay by having to do VACUUM pretty often for heavily-updated tables. I don't think that decision is fundamentally wrong --- the attractive thing about

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Lukas Smith
Jochem van Dieten wrote: make the session handler smarter? And if you can't do that, put some logic in the session table that turns an update without changes into a no-op? err isnt that one the job of the database? regards, Lukas ---(end of

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Chris Browne
[EMAIL PROTECTED] (Csaba Nagy) writes: [...] There has to be a more linear way of handling this scenario. So vacuum the table often. Good advice, except if the table is huge :-) ... Then the table shouldn't be designed to be huge. That represents a design error. Here we have for

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Jonah H. Harris
On 6/22/06, Tom Lane [EMAIL PROTECTED] wrote: The Oracle design has got other drawbacks: if you need to access a row version other than than the very latest, you need to go searching in the rollback segments for it. There are ways to implement this functionality without implementing it exactly

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread D'Arcy J.M. Cain
On Thu, 22 Jun 2006 19:01:38 +0200 Lukas Smith [EMAIL PROTECTED] wrote: Jochem van Dieten wrote: make the session handler smarter? And if you can't do that, put some logic in the session table that turns an update without changes into a no-op? err isnt that one the job of the

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Rod Taylor
Here we have for example some tables which are frequently updated but contain 100 million rows. Vacuuming that takes hours. And the dead row candidates are the ones which are updated again and again and looked up frequently... This demonstrates that archival material and active data

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Mark Woodward
Christopher Browne [EMAIL PROTECTED] writes: After a long battle with technology, [EMAIL PROTECTED] (Mark Woodward), an earthling, wrote: Not true. Oracle does not seem to exhibit this problem. Oracle suffers a problem in this regard that PostgreSQL doesn't; in Oracle, rollbacks are quite

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Rod Taylor
You mean systems that are designed so exactly, that they can't take 10% performance change ? No, that's not really the point, performance degrades over time, in one minute it degraded 10%. The update to session ratio has a HUGE impact on PostgreSQL. If you have a thousand active

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Mark Woodward
You mean systems that are designed so exactly, that they can't take 10% performance change ? No, that's not really the point, performance degrades over time, in one minute it degraded 10%. The update to session ratio has a HUGE impact on PostgreSQL. If you have a thousand active

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Jonah H. Harris
On 6/22/06, Rod Taylor [EMAIL PROTECTED] wrote: If you INSERT into multiple partitions (by time -- say one per minute) and TRUNCATE periodically (30 minute old partitions for 30 minute expiry) it works much better. Expiring the session is quite fast as well since they'll go away with the

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Rod Taylor
On Thu, 2006-06-22 at 13:42 -0400, Jonah H. Harris wrote: On 6/22/06, Rod Taylor [EMAIL PROTECTED] wrote: If you INSERT into multiple partitions (by time -- say one per minute) and TRUNCATE periodically (30 minute old partitions for 30 minute expiry) it works much better. Expiring the

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Heikki Linnakangas
On Thu, 22 Jun 2006, Jonah H. Harris wrote: On 6/22/06, Rod Taylor [EMAIL PROTECTED] wrote: If you INSERT into multiple partitions (by time -- say one per minute) and TRUNCATE periodically (30 minute old partitions for 30 minute expiry) it works much better. Expiring the session is quite fast

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Greg Stark
Tom Lane [EMAIL PROTECTED] writes: The Oracle design has got other drawbacks: if you need to access a row version other than than the very latest, you need to go searching in the rollback segments for it. This is slow (no index help) Just for the record, if i understood correctly -- this

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Lukas Smith
Greg Stark wrote: There are other solutions too. I never used DB2 but I was led to believe they used their transaction log to retrieve old versions of the records. Someone else here claimed DB2 didn't implement MVCC at all so perhaps that's wrong though. I would be suprised giving this paper:

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Tom Lane
Lukas Smith [EMAIL PROTECTED] writes: Jochem van Dieten wrote: make the session handler smarter? And if you can't do that, put some logic in the session table that turns an update without changes into a no-op? err isnt that one the job of the database? No. That idea has been suggested and

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] writes: The Oracle design has got other drawbacks: if you need to access a row version other than than the very latest, you need to go searching in the rollback segments for it. This is slow (no index help) Just for the

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread PFC
As you can see, in about a minute at high load, this very simple table lost about 10% of its performance, and I've seen worse based on update frequency. Before you say this is an obscure problem, I can tell you it isn't. I have worked with more than a few projects that had to switch away

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Alvaro Herrera
Adding back pgsql-hackers. Mark Woodward wrote: Mark Woodward wrote: Hmm, OK, then the problem is more serious than I suspected. This means that every index on a row has to be updated on every transaction that modifies that row. Is that correct? Add an index entry, yes. I am

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Mark Woodward
As you can see, in about a minute at high load, this very simple table lost about 10% of its performance, and I've seen worse based on update frequency. Before you say this is an obscure problem, I can tell you it isn't. I have worked with more than a few projects that had to switch away

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Greg Stark
Tom Lane [EMAIL PROTECTED] writes: Yeah, you should be able to find the older version easily enough, if you arrived at the newer version and realized you needed to visit the older version. But this fails in scenarios where you are searching on a column that's been updated --- the index

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread PFC
What you seem not to grasp at this point is a large web-farm, about 10 or more servers running PHP, Java, ASP, or even perl. The database is usually the most convenient and, aside from the particular issue we are talking about, best suited. The answer is sticky sessions : each user is

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Mark Woodward
What you seem not to grasp at this point is a large web-farm, about 10 or more servers running PHP, Java, ASP, or even perl. The database is usually the most convenient and, aside from the particular issue we are talking about, best suited. The answer is sticky sessions : each user

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Andrew Dunstan
PFC wrote: What you seem not to grasp at this point is a large web-farm, about 10 or more servers running PHP, Java, ASP, or even perl. The database is usually the most convenient and, aside from the particular issue we are talking about, best suited. The answer is sticky sessions

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread David Fetter
On Thu, Jun 22, 2006 at 07:01:38PM +0200, Lukas Smith wrote: Jochem van Dieten wrote: make the session handler smarter? And if you can't do that, put some logic in the session table that turns an update without changes into a no-op? err isnt that one the job of the database? By no means.

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Jim Nasby
On Jun 22, 2006, at 12:56 PM, Greg Stark wrote: Just for the record, if i understood correctly -- this was all a bit black magicky -- Oracle found the data in the rollback segment by storing a pointer to it in the block header where the updated data is. Ie, it could jump straight to the

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Jim Nasby
On Jun 22, 2006, at 2:00 PM, Mark Woodward wrote: I actually have a good number of years of experience in this topic, and memcached or file system files are NOT the best solutions for a server farm. What's wrong with memcached for session data? -- Jim C. Nasby, Sr. Engineering Consultant

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Jim Nasby
On Jun 22, 2006, at 1:09 PM, Tom Lane wrote: Lukas Smith [EMAIL PROTECTED] writes: Jochem van Dieten wrote: make the session handler smarter? And if you can't do that, put some logic in the session table that turns an update without changes into a no-op? err isnt that one the job of the

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Tom Lane
Jim Nasby [EMAIL PROTECTED] writes: Question: do we currently create new index entries even if the index key hasn't changed? Yes. If so, what's the purpose of storing the CTID of the next version in the old version of the row? So that UPDATE can always find the newest version of the

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Tom Lane
Jim Nasby [EMAIL PROTECTED] writes: What would be nice to add is the ability to perform that check more easily. As of 8.1... ... if NEW=OLD then ... ERROR: operator does not exist: test = test HINT: No operator matches the given name and argument type(s). You may need to add explicit

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Christopher Kings-Lynne
The example is a very active web site, the flow is this: query for session information process HTTP request update session information This happens for EVERY http request. Chances are that you won't have concurrent requests for the same row, but you may have well over 100 HTTP server

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Steve Atkins
On Jun 22, 2006, at 6:56 PM, Agent M wrote: On Jun 22, 2006, at 9:56 PM, Christopher Kings-Lynne wrote: The example is a very active web site, the flow is this: query for session information process HTTP request update session information This happens for EVERY http request. Chances are

Row comparison for tables (was Re: [HACKERS] vacuum, performance, and MVCC)

2006-06-22 Thread Tom Lane
I wrote: Jim Nasby [EMAIL PROTECTED] writes: What would be nice to add is the ability to perform that check more easily. As of 8.1... ... if NEW=OLD then ... ERROR: operator does not exist: test = test HINT: No operator matches the given name and argument type(s). You may need to

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Gavin Sherry
On Thu, 22 Jun 2006, Agent M wrote: On Jun 22, 2006, at 9:56 PM, Christopher Kings-Lynne wrote: The example is a very active web site, the flow is this: query for session information process HTTP request update session information This happens for EVERY http request. Chances are that

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Christopher Browne
The world rejoiced as [EMAIL PROTECTED] (Tom Lane) wrote: Lukas Smith [EMAIL PROTECTED] writes: Jochem van Dieten wrote: make the session handler smarter? And if you can't do that, put some logic in the session table that turns an update without changes into a no-op? err isnt that one the

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Jonah H. Harris
On 6/22/06, Gavin Sherry [EMAIL PROTECTED] wrote: There seems to be a bit of confusion about what MVCC is. Yes, PostgreSQL implements MVTO-style concurrency, Oracle implements MVRC, ... Let's not go into theory here, because there's plenty of papers and books on the subject. The other MVCC

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Gavin Sherry
On Thu, 22 Jun 2006, Jonah H. Harris wrote: Not in all systems. A few now perform in-memory UNDO and only write it to disk if and when it is required. Interesting... Overwriting MVCC comes with its own baggage. Ask any Oracle user about error ORA-01555[1]. There's also the added cost of

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Alvaro Herrera
Jonah H. Harris wrote: On 6/22/06, Gavin Sherry [EMAIL PROTECTED] wrote: When an update occurs, the existing row version is copied to te UNDO file Not in all systems. A few now perform in-memory UNDO and only write it to disk if and when it is required. How does that work? If the last

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Heikki Linnakangas
On Thu, 22 Jun 2006, Jim Nasby wrote: DB2 switched to MVCC in version 8. ... Um, no it didn't. - Heikki ---(end of broadcast)--- TIP 6: explain analyze is your friend

[HACKERS] vacuum, performance, and MVCC

2006-06-21 Thread Mark Woodward
We all know that PostgreSQL suffers performance problems when rows are updated frequently prior to a vacuum. The most serious example can be seen by using PostgreSQL as a session handler for a busy we site. You may have thousands or millions of active sessions, each being updated per page hit.

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-21 Thread Christopher Browne
Clinging to sanity, [EMAIL PROTECTED] (Mark Woodward) mumbled into her beard: We all know that PostgreSQL suffers performance problems when rows are updated frequently prior to a vacuum. The most serious example can be seen by using PostgreSQL as a session handler for a busy we site. You may

<    1   2   3   >