Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
On Fri, 28 May 2004 14:47:01 -0400, Tom Lane [EMAIL PROTECTED] wrote:
> If putting back xmax is the price we must pay for nested transactions,
> then we *will* pay that price.  Maybe not in this release, but it will
> inevitably happen.

we = every Postgres user, even those that do not use subtransactions.

price = 2% to 5% performance loss for databases where the working set is
larger than main memory.

> Don't bother hollering veto ;-)

Ok, I'll shut up till I have something concrete to support my opinion.

Servus
 Manfred

---(end of broadcast)---
TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
On Fri, 28 May 2004 21:59:53 -0400, Alvaro Herrera [EMAIL PROTECTED] wrote:
> So the assumption was that when we see that this has happened, the
> Cmin is no longer important (== every future command can already see
> the tuple)

If it was that simple, we wouldn't have had to invent the CMIN_IS_CMAX
flag.  This has been discussed two years ago.  Did you follow the link I
posted last week?

"Every future command" is not enough.  You have to consider the current
command and even commands started before this one.

Servus
 Manfred
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
On Tue, Jun 01, 2004 at 02:37:37PM +0200, Manfred Koizar wrote:
> On Fri, 28 May 2004 21:59:53 -0400, Alvaro Herrera wrote:
>> So the assumption was that when we see that this has happened, the
>> Cmin is no longer important (== every future command can already see
>> the tuple)
>
> If it was that simple, we wouldn't have had to invent the CMIN_IS_CMAX
> flag.  This has been discussed two years ago.  Did you follow the link
> I posted last week?
>
> "Every future command" is not enough.  You have to consider the
> current command and even commands started before this one.

Yes, I did follow it (you mean XMAX_IS_XMIN, right?  I suppose no tuple
can really have Cmin == Cmax).  I'm not claiming I understood it fully,
though.

But as you see, since the assumption is not valid we have to drop the
idea and put back the Xmax as a field of its own in HeapTupleHeader
(which is what I had done before Bruce persuaded me not to).  I don't
really like this idea but I don't see any other way out.

A couple of days ago I was going to propose putting Xmax as a separate
field only as needed, similar to the way Oid is handled --- thus we
would enlarge the tuple if and only if the creating transaction deletes
it.  This would be nice because one would expect that there are not
many tuples created and deleted by the same transaction, so we'd
localize the inefficiency of storing both fields (Cmin and Xmax) to the
tuples that need it.  While I was writing the proposal I realised that
it'd mean enlarging tuples that are already on disk, and there's no way
we can do that.

If you have other ideas I'm all ears.  I'm the last one to want nested
xacts to make everything else slower.

-- Alvaro Herrera (alvherre[a]dcc.uchile.cl)
"The eagle never lost so much time, as when he submitted to learn of the crow." (William Blake)
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
Manfred Koizar [EMAIL PROTECTED] writes:
> On Thu, 27 May 2004 16:50:24 -0400, Alvaro Herrera wrote:
>> At first I thought I'd have to add back Xmax as a field on its own
>
> Veto!  This would increase heap tuple header size == less tuples per
> page == more pages per table == more I/O == performance loss.

If putting back xmax is the price we must pay for nested transactions,
then we *will* pay that price.  Maybe not in this release, but it will
inevitably happen.  Don't bother hollering veto ;-)

I suspect that in fact this may be necessary.  The justification for
overlapping cmin and xmax hinges on two things:

1. Within the creating transaction, we can handle the xmax=xmin case
   specially;
2. another xact could only want to store xmax into a committed tuple,
   therefore the original xact is done and we don't need cmin anymore.

However this breaks down with nested xacts.  For instance, imagine this
situation:

* Outer transaction creates a tuple.
* Inner transaction deletes this tuple (which it presumably can see).
* Inner transaction rolls back.

The tuple must still be visible to the outer xact.  However now we have
a problem: we've wiped out its cmin, which we need for visibility tests
in the outer xact.

We could possibly avoid this particular issue with sufficiently complex
visibility rules.  (I am thinking that we might be able to say that the
inner xact can't see the tuple in question unless the creating command
was done in terms of the outer transaction, in which case perhaps we
don't need its cmin anymore.  But I fear that won't work either.  For
instance, a serializable cursor opened before the tuple was created
should not be able to see it, so it sure seems like we need cmin.)  And
I don't feel confident that there are no other, even harder-to-avoid,
cases to worry about.

Something that just now occurred to me: could we identify
subtransactions with commands?  That is, cmin *is* the subtransaction
ID, and xmin/xmax are always the parent xact?

I'm not sure this works either, but it might be something to think
about.

regards, tom lane
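The overlay problem Tom describes can be seen in a toy model (illustrative Python only, not PostgreSQL source; all names are invented): once a delete overwrites the shared cmin/xmax slot, a subtransaction rollback cannot restore the cmin that the outer transaction's own visibility checks still need.

```python
# Toy model of the cmin/xmax overlay: one shared header slot holds cmin
# until a delete stores xmax into it, destroying the cmin forever.

class Tuple:
    def __init__(self, xmin, cmin):
        self.xmin = xmin
        self.field = cmin          # shared slot: holds cmin...
        self.cmin_is_xmax = False  # ...until a delete overwrites it

    def delete(self, xmax):
        self.field = xmax          # cmin is gone now
        self.cmin_is_xmax = True

def visible_to(tup, xid, cid):
    """Visibility inside the creating transaction: the tuple is visible
    only to commands started after the inserting command."""
    if tup.xmin != xid:
        raise NotImplementedError("cross-transaction rules not modeled")
    if tup.cmin_is_xmax:
        # cmin was overwritten; we can no longer tell whether the
        # current command started before or after the INSERT
        raise ValueError("cmin lost -- visibility undecidable")
    return cid > tup.field

t = Tuple(xmin=100, cmin=5)   # outer xact 100 inserts at command 5
assert visible_to(t, 100, 6)  # a later command sees the tuple
t.delete(xmax=100)            # inner subxact deletes: xmax overwrites cmin
# The subxact rolls back -- but the overwrite cannot be undone, and the
# outer transaction can no longer decide visibility:
try:
    visible_to(t, 100, 6)
except ValueError as e:
    print(e)
```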
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
Alvaro Herrera [EMAIL PROTECTED] writes:
> On Fri, May 28, 2004 at 03:48:11PM -0400, Tom Lane wrote:
>> Nope.  Think about sub-subtransactions.  They are all aborted if the
>> parent is, so
>
>     BEGIN;
>       BEGIN;                          -- cid = 1
>         BEGIN;                        -- cid = 2
>           INSERT INTO foo VALUES (1)  -- cid = 3
>         COMMIT;
>       ROLLBACK;  -- aborts from Cid 1 to Cid 3
>       -- here we can't see the tuple because Xmin == my-xid
>       -- and Cmin = 1 is aborted
>     COMMIT;
>
> I assume this is how you think it works, isn't it?

[ thinks about it for a while... ]  Yeah, I guess you are right.  Since
we don't have threading, an outer transaction cannot assign any new
CIDs while a subtransaction is in progress.  Therefore, when a
subtransaction ends, all CIDs from its start to current belong either
to itself or to its subtransactions.  On abort we can just mark *all*
of these as aborted.  If we had to do anything at subtrans commit we'd
need more state, but we don't have to do anything at subtrans commit.
So you're right, the per-open-subtrans state is just its starting CID.
Slick.

However, I just remembered why we rejected this idea to start with :-(.
If we do it this way, then when the overall xact commits we no longer
have state that tells which particular tuples are good or not.  We
would have to trawl for tuples written by aborted subtransactions and
mark them dead before committing, else other transactions would think
they were good.

What this says is that we still need persistent pg_subtrans status.
I'm not sure we can use CIDs as subtrans IDs this way and still have a
reasonably efficient storage representation for the global pg_subtrans
table.

regards, tom lane
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
On Thu, 27 May 2004 16:50:24 -0400, Alvaro Herrera [EMAIL PROTECTED] wrote:
> Now you are on the subject, can I ask you to take a peek at what I did
> regarding tuple headers?

I did read your patch, but I didn't understand it. :-(

> At first I thought I'd have to add back Xmax as a field on its own

Veto!  This would increase heap tuple header size == less tuples per
page == more pages per table == more I/O == performance loss.

> is there a situation on which we should need to peek at Cmin after
> setting Xmax for a particular tuple?

http://archives.postgresql.org/pgsql-hackers/2002-05/msg00090.php

Servus
 Manfred
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
On Fri, May 28, 2004 at 02:47:01PM -0400, Tom Lane wrote:
> We could possibly avoid this particular issue with sufficiently
> complex visibility rules.  (I am thinking that we might be able to say
> that the inner xact can't see the tuple in question unless the
> creating command was done in terms of the outer transaction, in which
> case perhaps we don't need its cmin anymore.  But I fear that won't
> work either.  For instance, a serializable cursor opened before the
> tuple was created should not be able to see it, so it sure seems like
> we need cmin.)  And I don't feel confident that there are no other,
> even harder-to-avoid, cases to worry about.

Hm, the serializable cursor was the example I was looking for to show
why the current idea does not work.

> Something that just now occurred to me: could we identify
> subtransactions with commands?  That is, cmin *is* the subtransaction
> ID, and xmin/xmax are always the parent xact?
>
> I'm not sure this works either, but it might be something to think
> about.

This seems a nice idea.  We wouldn't need pg_subtrans at all, for
starters --- no multiple Xids for a transaction tree.  And the
Cmin/Cmax test would only be done inside the backend running the xact
tree, so it doesn't need to be stored permanently, nor shared.  We
would need to be able to mark individual CommandIds as aborted, and
while checking Cmin and Cmax, not only see how they compare to
CurrentCommandId, but also whether they aborted.  It looks simpler to
me than the current design.

The only problem would be _how_ to mark a bunch of CommandIds as
aborted --- keeping them all in memory seems too heavy.  A bitmap could
be an interesting idea, but for a very big transaction we could need at
most 2^32 bits, which is way too much.  Run-length encoding maybe?  Any
graphics-library hacker around here with knowledge about compressing
bit strings?  I know nothing of this stuff.

-- Alvaro Herrera (alvherre[a]dcc.uchile.cl)
"If you don't know where you are going, you will probably end up somewhere else."
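The run-length idea can be sketched like this (a toy Python model of my own, not from any patch; the class and method names are invented). Since a subtransaction aborts a *contiguous* range of CIDs, sorted merged ranges stay tiny even for a huge transaction:

```python
class AbortedCids:
    """Aborted command-id ranges kept as sorted, disjoint (start, end)
    pairs -- run-length style, so one huge aborted subtransaction costs
    a single entry instead of millions of bitmap bits."""

    def __init__(self):
        self.runs = []  # list of (start, end), inclusive, sorted

    def mark_aborted(self, start, end):
        runs = sorted(self.runs + [(start, end)])
        merged = [runs[0]]
        for s, e in runs[1:]:
            ls, le = merged[-1]
            if s <= le + 1:                    # overlapping or adjacent
                merged[-1] = (ls, max(le, e))  # coalesce into one run
            else:
                merged.append((s, e))
        self.runs = merged

    def is_aborted(self, cid):
        return any(s <= cid <= e for s, e in self.runs)

ab = AbortedCids()
ab.mark_aborted(1, 3)          # subxact covering cids 1..3 rolled back
ab.mark_aborted(4, 1_000_000)  # a huge aborted subxact: still one run
assert ab.runs == [(1, 1_000_000)]
assert ab.is_aborted(500_000) and not ab.is_aborted(1_000_001)
```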
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
On Fri, May 28, 2004 at 03:48:11PM -0400, Tom Lane wrote:
> Alvaro Herrera [EMAIL PROTECTED] writes:
>> On Fri, May 28, 2004 at 03:19:29PM -0400, Tom Lane wrote:
>>> We'd still need a plain CommandCounterIncrement facility, which
>>> means that actually a subtransaction would have to be a group of
>>> CIDs not just one.
>>
>> Right.  We only need to store the borders though.  Not even that:
>> only the start, because the end is what is current at
>> AbortSubTransaction() time.
>
> Nope.  Think about sub-subtransactions.  They are all aborted if the
> parent is, so

    BEGIN;
      BEGIN;                          -- cid = 1
        BEGIN;                        -- cid = 2
          INSERT INTO foo VALUES (1)  -- cid = 3
        COMMIT;
      ROLLBACK;  -- aborts from Cid 1 to Cid 3
      -- here we can't see the tuple because Xmin == my-xid
      -- and Cmin = 1 is aborted
    COMMIT;

I assume this is how you think it works, isn't it?  Because if not I am
in serious trouble.  But if it is, then CommandIds from 1 (start Cid)
to 3 (current Cid) are aborted.  And this can be kept in the master
bitmap --- in fact, I didn't think about another one.

-- Alvaro Herrera (alvherre[a]dcc.uchile.cl)
"A poet is a world enclosed in a man" (Victor Hugo)
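The scheme under discussion can be sketched in a few lines of Python (a toy model with invented names, not PostgreSQL code): the only per-open-subtransaction state is its starting CID, and rollback marks everything from that start through the current counter as aborted.

```python
class XactState:
    """Toy model of the start-CID scheme: each open subtransaction
    remembers only the CID it started at; aborting marks every CID from
    there to the current counter in the backend-local 'master bitmap'
    (a plain set here)."""

    def __init__(self):
        self.cid = 0          # command counter
        self.stack = []       # start CID of each open subtransaction
        self.aborted = set()  # aborted CIDs

    def next_command(self):   # roughly CommandCounterIncrement
        self.cid += 1
        return self.cid

    def begin_sub(self):
        self.stack.append(self.next_command())  # the BEGIN uses a CID

    def commit_sub(self):
        self.stack.pop()      # nothing else to do at subcommit

    def rollback_sub(self):
        start = self.stack.pop()
        # every CID from our start to current is ours or a child's
        self.aborted.update(range(start, self.cid + 1))

# The example above: BEGIN; BEGIN (cid 1); BEGIN (cid 2); INSERT (cid 3);
# COMMIT; ROLLBACK -- the rollback must abort cids 1..3, including the
# work of the already-committed inner subtransaction.
x = XactState()
x.begin_sub()            # cid = 1
x.begin_sub()            # cid = 2
x.next_command()         # the INSERT, cid = 3
x.commit_sub()           # inner COMMIT: no bookkeeping needed
x.rollback_sub()         # outer subxact ROLLBACK
assert x.aborted == {1, 2, 3}
```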
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
On Fri, May 28, 2004 at 03:19:29PM -0400, Tom Lane wrote:
> We'd still need a plain CommandCounterIncrement facility, which means
> that actually a subtransaction would have to be a group of CIDs not
> just one.

Right, this is why I suggested runlength (the group is contiguous).

> So there'd also need to be a data structure showing the CIDs
> associated with each open subtransaction --- this is what you'd
> consult to go and set the aborted bits if the subxact rolls back.

Right.  We only need to store the borders though.  Not even that: only
the start, because the end is what is current at AbortSubTransaction()
time.

I'll try this.

-- Alvaro Herrera (alvherre[a]dcc.uchile.cl)
"Early and provident fear is the mother of safety" (E. Burke)
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
On Fri, May 28, 2004 at 04:45:28PM -0400, Tom Lane wrote:
> However, I just remembered why we rejected this idea to start with
> :-(.  If we do it this way then when the overall xact commits, we no
> longer have state that tells which particular tuples are good or not.
> We would have to trawl for tuples written by aborted subtransactions
> and mark them dead before committing, else other transactions would
> think they were good.
>
> What this says is that we still need persistent pg_subtrans status.
> I'm not sure if we can use CIDs as subtrans IDs this way and still
> have a reasonably efficient storage representation for the global
> pg_subtrans table.

I'm not sure I understand your last assertion.  We can in no way use
the CID as subtrans ID in pg_subtrans, of course (it starts from 0 at
each main transaction start).  So pg_subtrans remains the same, and we
assign a new Xid to each subtransaction.  Each tuple gets Xmin/Xmax
according to the Xid of the current subtransaction.

Within the transaction tree we don't use the Xid to check for
visibility, but Cmin/Cmax and the abort bitmap.  When the Xmin/Xmax
does not belong to our transaction tree, we use pg_subtrans and
pg_clog.

-- Alvaro Herrera (alvherre[a]dcc.uchile.cl)
"We are who we choose to be, sang the goldfinch when the sun is high" (Sandman)
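The two-path rule Alvaro outlines can be sketched as follows (a hypothetical Python model; the helper names and the insertion-only simplification are mine, not anything in PostgreSQL): Xids inside our own transaction tree are judged by Cmin plus the local abort bitmap, and only foreign Xids go to the commit log.

```python
from collections import namedtuple

# Toy heap tuple carrying just the fields this check needs
HeapTuple = namedtuple("HeapTuple", "xmin cmin")

def tuple_inserted_visible(tup, my_tree_xids, aborted_cids, current_cid,
                           xid_committed):
    """Insertion visibility only (no Xmax handling): our own tree uses
    a backend-local check; other trees consult the persistent commit
    log, stubbed here as a callback."""
    if tup.xmin in my_tree_xids:
        # our tree: nothing shared or persistent is consulted
        return tup.cmin not in aborted_cids and tup.cmin < current_cid
    # someone else's transaction tree: pg_subtrans/pg_clog lookup
    return xid_committed(tup.xmin)

# Inserted by our own tree at cid 3, but that subxact aborted:
assert not tuple_inserted_visible(HeapTuple(10, 3), {10}, {3}, 5,
                                  lambda xid: False)
# Same tree, cid 2 not aborted and earlier than the current command:
assert tuple_inserted_visible(HeapTuple(10, 2), {10}, {3}, 5,
                              lambda xid: False)
# Foreign Xid: the Cmin is irrelevant, only commit status matters:
assert tuple_inserted_visible(HeapTuple(99, 1), {10}, set(), 5,
                              lambda xid: True)
```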
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
I wrote a while ago:
> So pg_subtrans remains the same, and we assign a new Xid to each
> subtransaction.  Each tuple gets Xmin/Xmax according to the Xid of the
> current subtransaction.  Within the transaction tree we don't use the
> Xid to check for visibility, but Cmin/Cmax and the abort bitmap.

... and meanwhile I forgot why this subthread started: I haven't solved
the original problem, which is that I want to avoid enlarging the
HeapTupleHeader.  But currently there is a need for storing both a Cmin
and an Xmax :-(

Manfred: the code you read adds a single bit to the infomask, which is
there basically to say whether the Cmin has been overwritten with an
Xmax.  So the assumption was that when we see that this has happened,
the Cmin is no longer important (== every future command can already
see the tuple), and we have to check the Xmax in pg_clog to see if the
deleting subtransaction has aborted (in which case the tuple is visible
to us).

This idea was the result of a couple of hours of chat with Bruce where
I explained that I didn't want to enlarge the HeapTupleHeader, and he
came up with the bit idea.

-- Alvaro Herrera (alvherre[a]dcc.uchile.cl)
"Deep in our unconscious is an obsessive need for a logical, coherent universe.  But the real universe is always one step beyond logic." (Irulan)
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
On Wed, 26 May 2004, Tom Lane wrote:
> if you have to do an UPDATE that affects every row of a large table
>
>     UPDATE tab SET col = col + 1
>
> which leaves you with N live rows, N dead rows, and lots of pain to
> get back down to a less-than-twice-normal-size table.  (Traditional
> way is VACUUM FULL; CLUSTER is better, but still painful.)  As of 7.5
> you could hack this with
>
>     ALTER TABLE tab ALTER COLUMN col TYPE same-type USING col + 1
>
> which will have the effect of rewriting a fresh table file containing
> the updated rows, and dropping the old file at commit.
>
> I'm not real sure where to document this trick but it seems like we
> ought to mention it someplace.

Isn't it better to detect an UPDATE without a WHERE and do that update
in the same way as the ALTER TABLE above?  Then we don't need to
document and learn a new, non-standard way of doing an update.

-- /Dennis Björklund
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
> What would be useful for that (as of 7.5) is a dummy ALTER TABLE TYPE,
> viz
>
>     ALTER TABLE tab ALTER COLUMN col TYPE same-type-it-already-has

I think we should add special syntax for this purpose, since I (or
someone else later on) would like to see all possible cases of ALTER
COLUMN short-circuited.  The syntax I would see fit is something like:

    ALTER TABLE tab [MOVE] TABLESPACE xy;

For the above special case the tablespace would be the same as before.

Andreas
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
Dennis Bjorklund [EMAIL PROTECTED] writes:
> On Wed, 26 May 2004, Tom Lane wrote:
>> I'm not real sure where to document this trick but it seems like we
>> ought to mention it someplace.
>
> Isn't it better to detect an UPDATE without a WHERE and do that update
> in the same way as the ALTER TABLE above?  Then we don't need to
> document and learn a new non-standard way of doing an update.

No, because the locking implications are completely different.  I don't
want UPDATE to suddenly decide it needs an exclusive lock on the table
based on the shape of the WHERE clause.

regards, tom lane
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
I did a CLUSTER --- it took just under 12 hours.  The original load of
the table (including parsing all the geometries) took about 4 hours (+4
to build the indexes and vacuum analyse).  I don't know why the cluster
is so much slower...

My SELECT * FROM table LIMIT 1; is now fast.  Thanks for the help!

As a side note, I think the behavior of the database is a bit
counter-intuitive.  If I were to do a:

    UPDATE table SET i = i + 1;

my table will leak 10Gb of disk space, make queries extremely slow, and
require taking the database off-line for 12 hours to fix it!  Both the
VACUUM and the UPDATE manual pages should be updated to say that this
is what's happening.  If I do my above query 100 times, will I be
leaking a terabyte of disk space?  Will increasing my max_fsm_pages to
2,000,000 solve my problem?

dave
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
On Wed, 26 May 2004 18:17:55 -0400, Tom Lane [EMAIL PROTECTED] wrote:
> (Some days I think we should dump VACUUM FULL, because it's optimized
> for a case that's no longer very interesting...)

So we still have to stick with VACUUM FULL for some time, right?  The
next set of compatibility breakers I'm currently working on requires a
change in VACUUM FULL behaviour.  I would only move tuples that are
visible to all running transactions.  OTOH I wouldn't stop at the first
unmovable tuple.  With

    X  active tuple
    .  free space or dead tuple
    y  new tuple, not yet visible to a running transaction
    z  deleted tuple, still visible to a running transaction

the current implementation transforms this relation

    XXyX XzXX

into

    XzXX XXyX

The new implementation would leave it as

    XX.. ..y. .z..

If there are concurrent long-running transactions, the new VACUUM FULL
wouldn't truncate the relation as aggressively as it does now.  It
could leave the relation with lots of free space near the end.  This
was absolutely unacceptable at the time when VACUUM FULL was designed.
But now we could use lazy VACUUM as an excuse for VACUUM FULL not
working so hard.  After the transaction still seeing z terminates,
VACUUM (without FULL) can truncate the relation to

    XX.. ..y.

and when y is updated the new version will be stored in a lower block
and plain VACUUM can truncate the relation again:

    XXY.

AFAICS this would make vacuum.c much simpler (no more chain moves).
Clearly this change alone doesn't have any merit.  But would such a
patch have any chance of being accepted, if it facilitates improvements
in other areas?

Servus
 Manfred
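The proposed policy can be illustrated with a small simulation (a toy Python sketch of my own, using Manfred's slot notation, not actual vacuum.c logic): only tuples visible to all transactions move, unmovable tuples are skipped rather than stopping the scan, and free space accumulates toward the end.

```python
def vacuum_full_new(slots):
    """Toy model of the proposed VACUUM FULL: compact only tuples
    visible to all ('X') into earlier free slots ('.').  'y' (not yet
    visible to everyone) and 'z' (deleted but still visible to someone)
    stay where they are, so no update chains ever need moving."""
    slots = list(slots)
    for i, t in enumerate(slots):
        if t != 'X':
            continue                 # skip unmovable; don't stop
        for j in range(i):           # earliest free slot before i
            if slots[j] == '.':
                slots[j], slots[i] = 'X', '.'
                break
    return ''.join(slots)

# Movable tuples migrate toward the front; the trailing free space is
# left for a later plain VACUUM to truncate once y and z are resolved.
print(vacuum_full_new('.X.yXz.X'))   # XXXy.z..
```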
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
Manfred Koizar [EMAIL PROTECTED] writes:
>     XX.. ..y.
>
> and when y is updated the new version will be stored in a lower block

Oh?  What makes you think that?  I see no guarantee of it.

> AFAICS this would make vacuum.c much simpler (no more chain moves).

How will it do that?  I think you'd have to not move *any* updated
tuples to be sure you don't need the chain-move mechanism.  Not moving
the outdated tuple isn't sufficient; you can't move the one it points
at either.

> Clearly this change alone doesn't have any merit.  But would such a
> patch have any chance of being accepted, if it facilitates
> improvements in other areas?

I'm disinclined to mess with VACUUM FULL without a clearer explanation
of where you're headed.  So far as I can see, you're simply going to
make VACUUM FULL less effective with no stated benefit.

(BTW, it now occurs to me that CLUSTER and ALTER TABLE in their present
forms may be broken, because they only copy rows that are valid
according to SnapshotNow; this means that rows that are still visible
to old transactions could be lost.  The lack of any attempt to preserve
update-chain relationships seems ungood as well, if we might have old
transactions come along and try to update the table later.)

regards, tom lane
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
On Thu, 27 May 2004 14:23:07 -0400, Tom Lane [EMAIL PROTECTED] wrote:
>> and when y is updated the new version will be stored in a lower block
>
> Oh?  What makes you think that?  I see no guarantee of it.

You're right, I see only a tendency, because the majority of free space
is before the last block (obviously).  But don't we try to store the
new version on the same block as the old version?  That'd weaken my
argument a bit.

> I think you'd have to not move *any* updated tuples to be sure you
> don't need the chain-move mechanism.

Yes, per definitionem (move only tuples that are visible to all).

> I'm disinclined to mess with VACUUM FULL without a clearer explanation
> of where you're headed.

Have no fear.  I won't change anything in the near term.  As you were
talking about the future of VACUUM FULL, I thought this might be a good
opportunity to ask.  The fact that you didn't outright reject the idea
is good enough for now.

I have no clear explanation at the moment, just some fuzzy ideas that
are beginning to crystallise.  I'm messing around with heap tuple
headers again, and the Xvac field seems to get in the way, unless I can
cut down the number of different scenarios where it is needed.

Servus
 Manfred
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
On Thu, May 27, 2004 at 09:52:30PM +0200, Manfred Koizar wrote:
> I have no clear explanation at the moment, just some fuzzy ideas that
> are beginning to crystallise.  I'm messing around with heap tuple
> headers again, and the Xvac field seems to get in the way, unless I
> can cut down the number of different scenarios where it is needed.

Now you are on the subject, can I ask you to take a peek at what I did
regarding tuple headers?  At first I thought I'd have to add back Xmax
as a field on its own, but later (in a chat with Bruce) I arrived at
the conclusion that it isn't really necessary, and I only added a bit
to the infomask to flag when the Cmin is overridden with Xmax.  However
I'm not convinced that this is enough --- is there a situation in which
we would need to peek at Cmin after setting Xmax for a particular
tuple?  The problem was:

    BEGIN;
    INSERT INTO foo VALUES (1);
    BEGIN;
    DELETE FROM foo;
    ROLLBACK;
    -- at this point the tuple should be visible,
    -- but it has my Xid as Xmin and Cmin was
    -- overridden with Xmax
    COMMIT;

I'd appreciate your (Manfred's and Tom's) comments on the topic.

-- Alvaro Herrera (alvherre[a]dcc.uchile.cl)
"And as an added bonus, now my computer goes to the toilet for me, leaving me free to spend time on more useful activities! yay slug codefests!" (C. Parker)
[HACKERS] SELECT * FROM table LIMIT 1; is really slow
I have a table with about 16,000,000 rows in it.  When I do a:

    SELECT * FROM table LIMIT 1;

it takes about 10 minutes (that's about how long it takes to do a full
sequential scan).  I had originally thought that there might be a large
number of wasted/retired tuples in the table, so I vacuum analysed the
database.  It had no effect.  I had a vacuum full going on the table
for 17 hours before I killed it.

Any idea why it's so slow?  It used to be instant.

dave
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
David Blasby wrote:
> I have a table with about 16,000,000 rows in it.  When I do a:
> SELECT * FROM table LIMIT 1; it takes about 10 minutes (that's about
> how long it takes to do a full sequential scan).  I had originally
> thought that there might be a large number of wasted/retired tuples
> in the table so I vacuum analysed the database.  It had no effect.  I
> had a vacuum full going on the table for 17 hours before I killed it.

Are you sure that the vacuum full was actually running, rather than
sitting there waiting on an idle transaction?

Regards
Gaetano Mendola
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
Gaetano Mendola wrote:
> Are you sure that the vacuum full was actually running, rather than
> sitting there waiting on an idle transaction?

It was running at about 90% CPU.  I just did another vacuum analyse on
the table:

    cwb_prod_5_20=# VACUUM ANALYZE verbose csn_edges;
    INFO:  vacuuming public.csn_edges
    INFO:  index csn_edges_group_code_idx now contains 16289929 row versions in 75789 pages
    INFO:  index csn_edges_edge_id_idx now contains 16289929 row versions in 55210 pages
    INFO:  index csn_edges_code_idx now contains 16289929 row versions in 61203 pages
    INFO:  index csn_edges_outside_idx now contains 16289929 row versions in 75719 pages
    INFO:  index csn_edges_the_geom_idx now contains 16289929 row versions in 238795 pages
    INFO:  csn_edges: found 0 removable, 16289929 nonremovable row versions in 2783986 pages
    INFO:  vacuuming pg_toast.pg_toast_126945560
    INFO:  index pg_toast_126945560_index now contains 441432 row versions in 3064 pages
    INFO:  pg_toast_126945560: found 0 removable, 441432 nonremovable row versions in 154691 pages
    INFO:  analyzing public.csn_edges
    INFO:  csn_edges: 2783986 pages, 3000 rows sampled, 6724 estimated total rows
    VACUUM

It's still slow!

dave
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
David Blasby [EMAIL PROTECTED] writes:
> I have a table with about 16,000,000 rows in it.  When I do a:
> SELECT * FROM table LIMIT 1; it takes about 10 minutes (that's about
> how long it takes to do a full sequential scan).

The only explanation that comes to mind is huge amounts of dead space
before the first live row.  But the output of vacuum verbose would
probably be enough to tell whether this is really the correct
diagnosis.

> I had originally thought that there might be a large number of
> wasted/retired tuples in the table so I vacuum analysed the database.
> It had no effect.

It wouldn't; you'd need vacuum full to collapse out the dead space.
You could also try CLUSTER, which is faster than VACUUM FULL when most
of the data has to be moved anyway.

(Some days I think we should dump VACUUM FULL, because it's optimized
for a case that's no longer very interesting...)

regards, tom lane
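Tom's diagnosis can be illustrated with a quick sketch (hypothetical Python, not PostgreSQL behavior verbatim): a sequential scan has to read every page until it hits the first live row, so LIMIT 1 on a table whose early pages are all dead space pays for all of those page reads.

```python
def pages_read_for_first_row(pages):
    """pages: live-row count per page, in on-disk order.  Returns how
    many pages a sequential scan must read before it can return the
    first row (i.e. before LIMIT 1 is satisfied)."""
    for n, live_rows in enumerate(pages, start=1):
        if live_rows > 0:
            return n
    return len(pages)          # no live rows at all: full scan

healthy = [6] * 1000               # live rows on every page
bloated = [0] * 999 + [6]          # dead space first, live rows last
assert pages_read_for_first_row(healthy) == 1     # instant
assert pages_read_for_first_row(bloated) == 1000  # whole-table scan
```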
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
David Blasby [EMAIL PROTECTED] writes:
> I just did another vacuum analyse on the table:

Ah, here we go:

> INFO:  csn_edges: found 0 removable, 16289929 nonremovable row
> versions in 2783986 pages

That works out to just under 6 rows per 8K page, which wouldn't be too
bad if the rows are 1K wide on average, but are they?  (You might want
to run contrib/pgstattuple to get some exact information about average
tuple size.)

> INFO:  analyzing public.csn_edges
> INFO:  csn_edges: 2783986 pages, 3000 rows sampled, 6724 estimated
> total rows

This looks like a smoking gun to me.  The huge underestimate of the
number of rows from ANALYZE is a known failure mode of the existing
sampling method when the early pages of the table are thinly populated.
(Manfred just fixed that for 7.5, btw.)

I think you want to VACUUM FULL or CLUSTER the table, and then take a
look at your FSM settings and routine vacuuming frequency to see if you
need to adjust them to keep this from happening again.

regards, tom lane
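Tom's arithmetic checks out (assuming the standard 8 KB page size):

```python
# Numbers from the VACUUM VERBOSE output above
rows = 16_289_929
pages = 2_783_986

per_page = rows / pages
print(f"{per_page:.2f} rows per page")  # ~5.85, "just under 6"

# The ANALYZE failure mode: 3000 sampled rows extrapolated to a mere
# 6724 total, underestimating the true 16.3M rows by a factor of
# roughly 2400 -- hence the "smoking gun".
print(f"underestimate factor: {rows / 6724:.0f}x")
```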
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
Tom Lane wrote:
>> INFO:  csn_edges: found 0 removable, 16289929 nonremovable row
>> versions in 2783986 pages
>
> That works out to just under 6 rows per 8K page, which wouldn't be too
> bad if the rows are 1K wide on average, but are they?

The rows are wide --- there's a PostGIS geometry present.

> I think you want to VACUUM FULL or CLUSTER the table, and then take a
> look at your FSM settings and routine vacuuming frequency to see if
> you need to adjust them to keep this from happening again.

I'm now clustering --- thanks for the help!

The history of this table is quite short --- I just created it last
week.  The original table had a bigint column that I converted to int
(using the

    ALTER TABLE csn_edges RENAME TO csn_edges_backup;
    CREATE TABLE csn_edges AS
        SELECT a, b, c::int, d, e FROM csn_edges_backup;
    DROP TABLE csn_edges_backup;

trick).  I don't think there were any changes to the current csn_edges
table after it was created.  I have another copy of this table in
another database --- vacuum analyse verbose says it's only 1,500,000
pages (vs 2,800,000).

Shouldn't vacuum know your table is wasting 10Gb of space and fix it
for you?  Or at least hint?  Or a TIDY command?  Should I be upping my
FSM to 2,000,000 pages?

dave
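Dave's "10Gb" figure is roughly right (back-of-the-envelope check, assuming 8 KB pages):

```python
# This copy of csn_edges uses ~2.78M pages; the copy in the other
# database needs only ~1.5M, so the difference is recoverable space.
bloated_pages = 2_783_986
compact_pages = 1_500_000
page_size = 8192  # bytes

wasted_gib = (bloated_pages - compact_pages) * page_size / 1024**3
print(f"{wasted_gib:.1f} GiB wasted")  # ~9.8 GiB: the "10Gb" figure
```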
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
David Blasby [EMAIL PROTECTED] writes:
>> I just did another vacuum analyse on the table:
>
> Ah, here we go:
>
>> INFO:  csn_edges: found 0 removable, 16289929 nonremovable row
>> versions in 2783986 pages
>
> That works out to just under 6 rows per 8K page, which wouldn't be too
> bad if the rows are 1K wide on average, but are they?  (You might want
> to run contrib/pgstattuple to get some exact information about average
> tuple size.)
>
>> INFO:  analyzing public.csn_edges
>> INFO:  csn_edges: 2783986 pages, 3000 rows sampled, 6724 estimated
>> total rows
>
> This looks like a smoking gun to me.  The huge underestimate of the
> number of rows from ANALYZE is a known failure mode of the existing
> sampling method when the early pages of the table are thinly
> populated.  (Manfred just fixed that for 7.5, btw.)

Tom, is there a way to choose between a sample and a full scan?

> I think you want to VACUUM FULL or CLUSTER the table, and then take a
> look at your FSM settings and routine vacuuming frequency to see if
> you need to adjust them to keep this from happening again.
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
> It wouldn't; you'd need vacuum full to collapse out the dead space.
> You could also try CLUSTER, which is faster than VACUUM FULL when most of the data has to be moved anyway. (Some days I think we should dump VACUUM FULL, because it's optimized for a case that's no longer very interesting...)

Out of interest, is CLUSTER another fast way of truly removing OIDs from a table, after going SET WITHOUT OIDS?

Chris

---(end of broadcast)---
TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
Christopher Kings-Lynne [EMAIL PROTECTED] writes:
>> It wouldn't; you'd need vacuum full to collapse out the dead space. You could also try CLUSTER, which is faster than VACUUM FULL when most of the data has to be moved anyway.
> Out of interest, is CLUSTER another fast way of truly removing OIDs from a table, after going SET WITHOUT OIDS?

I think not --- IIRC, CLUSTER just copies the tuples verbatim. It'd do fine for getting rid of dead tuples and unused space, but not for making any actual changes in the individual tuples. What would be useful for that (as of 7.5) is a dummy ALTER TABLE TYPE, viz

    ALTER TABLE tab ALTER COLUMN col TYPE same-type-it-already-has

There not being any special case to short-circuit this, the thing will sit there and reconstruct the tuples and write them into a fresh table file (and rebuild the indexes, too). The net effect is to collapse out any dropped columns (or OIDs) as well as losing dead tuples and free space.

I was just noticing today that the ALTER TABLE documentation suggests a dummy UPDATE and VACUUM FULL to get rid of the space occupied by a dropped column. On reflection, the ALTER TYPE method is likely to be an order of magnitude faster. Will update the docs.

A further thought along this line: if you have to do an UPDATE that affects every row of a large table, the traditional way is, e.g.,

    UPDATE tab SET col = col + 1

which leaves you with N live rows, N dead rows, and lots of pain to get back down to a less-than-twice-normal-size table. (The traditional fix is VACUUM FULL; CLUSTER is better, but still painful.) As of 7.5 you could hack this with

    ALTER TABLE tab ALTER COLUMN col TYPE same-type USING col + 1

which will have the effect of rewriting a fresh table file containing the updated rows, and dropping the old file at commit. Sweet.
(Peak disk usage is 2x normal either way, but you don't pay through the nose to get back down to 1x.) I'm not real sure where to document this trick but it seems like we ought to mention it someplace. regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
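Tom's trick can be written out concretely. The table and column names are the placeholders from his message, and "integer" stands in for whatever type col already has; the syntax is the 7.5-era ALTER TABLE ... TYPE:

```sql
-- Traditional way: leaves N dead rows behind, so the table roughly
-- doubles in size until a VACUUM FULL or CLUSTER reclaims the space.
UPDATE tab SET col = col + 1;

-- 7.5-era alternative: rewrites the updated rows into a fresh table
-- file (rebuilding indexes too), and simply drops the old file at
-- commit. "integer" is a stand-in for col's existing type:
ALTER TABLE tab ALTER COLUMN col TYPE integer USING col + 1;
```

The same no-op form, ALTER COLUMN col TYPE integer with no USING clause, is the dummy rewrite Tom describes for collapsing out dropped columns or OIDs.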
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
David Blasby [EMAIL PROTECTED] writes:
> I have another copy of this table in another database - vacuum analyse verbose says it's only 1,500,000 pages (vs 2,800,000).

Hmm ... this is consistent with the idea that you did an UPDATE affecting every row of the table. That would generate N new rows, which would all get added to the end of the table if there were no dead space available within the table. After commit you have a lot of pages' worth of dead tuples followed by a lot of pages' worth of live tuples. Plain VACUUM cannot do much to fix this since it doesn't move rows around. VACUUM FULL will fix it, but its index-update overhead is high enough that CLUSTER is a better deal. See the followup message discussing ALTER TABLE as another alternative.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
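The repair Tom recommends might look like the following sketch. The index name is hypothetical (the thread never names one), and the statement uses the 7.x syntax, which names the index first:

```sql
-- CLUSTER rewrites the table in index order into a fresh file,
-- reclaiming the dead space left behind by a full-table UPDATE.
-- "csn_edges_pkey" is a hypothetical index name:
CLUSTER csn_edges_pkey ON csn_edges;

-- Re-collect planner statistics for the rewritten table:
ANALYZE csn_edges;
```

Unlike VACUUM FULL, which moves tuples within the existing file and updates the indexes row by row, CLUSTER rebuilds the indexes from scratch, which is why Tom calls it the better deal when most of the data has to move anyway.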
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
Tom Lane wrote:
> ... of dead tuples followed by a lot of pages worth of live tuples. Plain VACUUM cannot do much to fix this since it doesn't move rows around. VACUUM FULL will fix it, but its index-update overhead is high enough that CLUSTER is a better deal.

Tom: I was interested in the performance improvements from CLUSTER, so I tried to cluster a table on a spatial index:

    dra_working=# \d geomtest
            Table "public.geomtest"
         Column     |   Type   | Modifiers
    ----------------+----------+-----------
     rd_segment_id  | integer  |
     admit_date     | date     |
     retire_date    | date     |
     most_recent    | boolean  |
     lineargeometry | geometry |
    Indexes: geomtest_idx gist (lineargeometry)

    dra_working=# cluster geomtest_idx on geomtest;
    ERROR: CLUSTER: cannot cluster when index access method does not handle nulls
    You may be able to work around this by marking column "lineargeometry" NOT NULL

    dra_working=# select version();
    PostgreSQL 7.3.6 on i686-pc-linux-gnu

As of quite a while ago (7.2?) the GiST access method was made null-safe by Teodor and Oleg, I think. Is this a safety wrapper left over from before the upgrade to GiST?

Paul Ramsey
Refractions Research
Email: [EMAIL PROTECTED]
Phone: (250) 885-0632

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
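The workaround suggested by the error's HINT can be sketched directly; it assumes no row actually holds a NULL geometry (SET NOT NULL scans the table and fails if one does):

```sql
-- Mark the column NOT NULL as the HINT suggests, then retry the
-- CLUSTER; the first statement errors out if any lineargeometry
-- value is currently NULL:
ALTER TABLE geomtest ALTER COLUMN lineargeometry SET NOT NULL;
CLUSTER geomtest_idx ON geomtest;
```

This only sidesteps the safety check; whether the check is still needed for a null-safe GiST is exactly the question Paul raises.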