Re: [HACKERS] GIN fast insert
> Why is tbm_add_page() not coded as a simple wrapper around
> tbm_mark_page_lossy()? As coded, it sometimes forces a whole bunch of
> pages *other than* the target page to become lossy too, which I cannot
> see a reason for it to be doing.

[after digging in tidbitmap] Oops, I was wrong: I supposed that all pages
in a chunk should be lossy, but that's true only for the chunk page
itself. So, tbm_add_page() should only call tbm_mark_page_lossy()...

-- 
Teodor Sigaev                 E-mail: teo...@sigaev.ru
                              WWW: http://www.sigaev.ru/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
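[Editor's note: for readers outside tidbitmap.c, here is a much-simplified
sketch of the fix agreed above. The struct and its fields are hypothetical
stand-ins, not the PostgreSQL implementation; the point is only that adding
a whole page should degrade that one page to lossy, never its neighbours.]

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, much-simplified stand-in for a TIDBitmap: one flag pair
 * per page instead of real pagetable entries and chunk compression. */
#define MAX_PAGES 64

typedef struct
{
    bool present[MAX_PAGES];    /* page has an entry in the bitmap */
    bool lossy[MAX_PAGES];      /* entry is lossy: "some tuple on this
                                 * page matches", no per-tuple bits */
} SimpleTBM;

/* Mark exactly one page lossy, leaving all other pages untouched. */
static void
tbm_mark_page_lossy(SimpleTBM *tbm, int pageno)
{
    tbm->present[pageno] = true;
    tbm->lossy[pageno] = true;
}

/* Per the conclusion above: adding a whole page is just a wrapper around
 * tbm_mark_page_lossy() for the single target page.  The bug under
 * discussion was that the old coding could force *other* pages in the
 * same chunk to become lossy as well. */
static void
tbm_add_page(SimpleTBM *tbm, int pageno)
{
    tbm_mark_page_lossy(tbm, pageno);
}
```

Only the target page loses tuple-level precision; neighbouring pages keep
whatever state they already had.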
Re: [HACKERS] GIN fast insert
Teodor Sigaev teo...@sigaev.ru writes:
> Oops, I was wrong, I supposed that all pages in chunk should be lossy,
> but it's true only for chunk page. So, tbm_add_page() should only call
> tbm_mark_page_lossy()...

OK, thanks, that's what I thought. I've changed it in the copy I'm
editing here.

I have another question. I added the following comment to
ginInsertCleanup(); is it accurate? (If it isn't, I think the code is
buggy ...)

 * This can be called concurrently by multiple backends, so it must cope.
 * On first glance it looks completely not concurrent-safe and not
 * crash-safe either. The reason it's okay is that multiple insertion of
 * the same entry is detected and treated as a no-op by gininsert.c. If we
 * crash after posting entries to the main index and before removing them
 * from the pending list, it's okay because when we redo the posting later
 * on, nothing bad will happen. Likewise, if two backends simultaneously
 * try to post a pending entry into the main index, one will succeed and
 * one will do nothing. We try to notice when someone else is a little bit
 * ahead of us in the process, but that's just to avoid wasting cycles.
 * Only the action of removing a page from the pending list really needs
 * exclusive lock.

regards, tom lane
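[Editor's note: the safety argument in the comment above rests on posting
being idempotent. A hedged toy model of that property — the names and data
structure below are invented for illustration and have nothing to do with
the actual gininsert.c code:]

```c
#include <assert.h>
#include <stdbool.h>

/* Toy "main index": a small set of integer keys.  In real GIN the
 * entries are index tuples; here idempotence is all that matters. */
#define CAPACITY 16

typedef struct
{
    int keys[CAPACITY];
    int nkeys;
} ToyIndex;

static bool
entry_present(const ToyIndex *idx, int key)
{
    for (int i = 0; i < idx->nkeys; i++)
        if (idx->keys[i] == key)
            return true;
    return false;
}

/* Posting the same entry twice is detected and treated as a no-op,
 * which is what makes crash-redo and a race between two backends
 * harmless: the duplicate work simply collapses to nothing. */
static void
post_entry(ToyIndex *idx, int key)
{
    if (entry_present(idx, key))
        return;                 /* duplicate insertion: do nothing */
    assert(idx->nkeys < CAPACITY);
    idx->keys[idx->nkeys++] = key;
}
```

A backend that crashed after posting but before trimming the pending list
just re-posts on the next cleanup pass, and the repeats are no-ops.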
Re: [HACKERS] GIN fast insert
> ginInsertCleanup(); is it accurate? (If it isn't, I think

Exactly, that is right.

>  * This can be called concurrently by multiple backends, so it must cope.
>  * On first glance it looks completely not concurrent-safe and not
>  * crash-safe either. The reason it's okay is that multiple insertion of
>  * the same entry is detected and treated as a no-op by gininsert.c. If we
>  * crash after posting entries to the main index and before removing them
>  * from the pending list, it's okay because when we redo the posting later
>  * on, nothing bad will happen. Likewise, if two backends simultaneously
>  * try to post a pending entry into the main index, one will succeed and
>  * one will do nothing. We try to notice when someone else is a little bit
>  * ahead of us in the process, but that's just to avoid wasting cycles.
>  * Only the action of removing a page from the pending list really needs
>  * exclusive lock.
>
> regards, tom lane

-- 
Teodor Sigaev                 E-mail: teo...@sigaev.ru
                              WWW: http://www.sigaev.ru/
Re: [HACKERS] GIN fast insert
I've committed this patch with additional editorialization --- mostly
cosmetic changes, except for removing the stats-driven cleanup in favor of
letting cleanups occur during auto-ANALYZE, as per my suggestion yesterday.

regards, tom lane
Re: [HACKERS] GIN fast insert
On 3/24/09 1:18 PM, Tom Lane wrote:
> I've committed this patch with additional editorialization --- mostly
> cosmetic changes, except for removing the stats-driven cleanup in favor
> of letting cleanups occur during auto-ANALYZE, as per my suggestion
> yesterday.

By my count, this was the last patch left for 8.4. No?

--Josh
Re: [HACKERS] GIN fast insert
On Tue, Mar 24, 2009 at 04:55:47PM -0700, Josh Berkus wrote:
> On 3/24/09 1:18 PM, Tom Lane wrote:
>> I've committed this patch with additional editorialization --- mostly
>> cosmetic changes, except for removing the stats-driven cleanup in favor
>> of letting cleanups occur during auto-ANALYZE, as per my suggestion
>> yesterday.
> By my count, this was the last patch left for 8.4. No?

One more patch left according to the CommitFest wiki: B-Tree emulation
for GIN.

http://wiki.postgresql.org/wiki/CommitFestInProgress

Cheers, David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] GIN fast-insert vs autovacuum scheduling
Tom Lane wrote:
> On top of those issues, there are implementation problems in the
> proposed relation_has_pending_indexes() check: it has hard-wired
> knowledge about GIN indexes, which means the feature cannot be extended
> to add-on index AMs; and it's examining indexes without any lock
> whatsoever on either the indexes or their parent table. (And we really
> would rather not let autovacuum take a lock here.)

I wonder if it's workable to have GIN send pgstats a message with the
number of fast-inserted tuples, and have autovacuum check that number as
well as dead/live tuples. ISTM this shouldn't be considered part of either
vacuum or analyze at all; autovacuum should invoke it separately from
both, with its own decision equations and such. We could even have a scan
over pg_class just for GIN indexes to implement this.

-- 
Alvaro Herrera                      http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] GIN fast insert
While I'm looking at this thing... Why is tbm_add_page() not coded as a
simple wrapper around tbm_mark_page_lossy()? As coded, it sometimes forces
a whole bunch of pages *other than* the target page to become lossy too,
which I cannot see a reason for it to be doing.

regards, tom lane
Re: [HACKERS] GIN fast-insert vs autovacuum scheduling
Alvaro Herrera alvhe...@commandprompt.com writes:
> Tom Lane wrote:
>> On top of those issues, there are implementation problems in the
>> proposed relation_has_pending_indexes() check:
> I wonder if it's workable to have GIN send pgstats a message with
> number of fast-inserted tuples, and have autovacuum check that number
> as well as dead/live tuples. ISTM this shouldn't be considered part of
> either vacuum or analyze at all, and have autovacuum invoke it
> separately from both, with its own decision equations and such. We
> could even have a scan over pg_class just for GIN indexes to implement
> this.

That's going in the wrong direction IMHO, because it's building
GIN-specific infrastructure into the core system. There is no need for any
such infrastructure if we just drive it off a post-ANALYZE callback.

regards, tom lane
Re: [HACKERS] GIN fast-insert vs autovacuum scheduling
On Mon, 2009-03-23 at 15:23 -0400, Tom Lane wrote:
> There is no need for any such infrastructure if we just drive it off a
> post-ANALYZE callback.

That sounds reasonable, although it does seem a little strange for analyze
to actually perform cleanup.

Now that we have FSM, the cost of VACUUMing insert-only tables is a lot
less. Does that possibly justify running VACUUM on insert-only tables? On
tables without GIN indexes, that wouldn't be a complete waste, because it
could set hint bits, which needs to be done sometime anyway.

Regards, Jeff Davis
Re: [HACKERS] GIN fast-insert vs autovacuum scheduling
Jeff Davis pg...@j-davis.com writes:
> On Mon, 2009-03-23 at 15:23 -0400, Tom Lane wrote:
>> There is no need for any such infrastructure if we just drive it off a
>> post-ANALYZE callback.
> That sounds reasonable, although it does seem a little strange for
> analyze to actually perform cleanup.

My thought was to have GIN do cleanup only in an autovacuum-driven
ANALYZE, not in a client-issued ANALYZE. You could argue it either way I
suppose, but I agree that if a user says ANALYZE he's probably not
expecting index cleanup to happen.

> Now that we have FSM, the cost of VACUUMing insert-only tables is a lot
> less.

Well, not if you just did a huge pile of inserts, which is the case that
we need to worry about here.

> On tables without GIN indexes, that wouldn't be a complete waste,
> because it could set hint bits, which needs to be done sometime anyway.

True, but we have not chosen to make autovacuum do that, and whether we
should or not seems to me to be orthogonal to when GIN index cleanup
should happen.

regards, tom lane
Re: [HACKERS] GIN fast insert
> Teodor, can you confirm
> * we WAL log the insert into the pending list
> * we WAL log the move from the pending list to the main index

Yes, I can, and I confirm both.

> * that we maintain the pending list correctly during redo so that it
>   can be accessed by index scans

Not sure about correct locking in ginxlog.c - but it should be fixable
rather easily.

> The main thing with Hot Standby is that we can't do any writes. So a
> pending list cannot change solely because of a gingettuple call on the
> *standby*. But that's easy to disable. If all the inserts happened on

Right. And suggest to increase work_mem.

> the primary node and all the reads happened on the standby, then
> pending list would never be cleaned up if the cleanup is triggered only
> by read. I would suggest that we trigger cleanup by read at threshold
> size X and trigger cleanup by insert at threshold size 5X. That avoids
> the strange

A threshold is not helpful here, because for gininsert and gingettuple it
depends on the work_mem setting, which could be very different on the
slave(s) and the master. Next, it's possible that the master never
executes read-only queries (i.e. gingettuple will not be called) by the
application's design. This is a strong objection to invoking cleanup from
gingettuple. So, I'm inclined toward Tom's idea of removing the
gingettuple interface, although GIN with fastupdate=off doesn't affect the
discussed problem.

> I found many parts of the patch and docs quite confusing because of the
> way things are named. For me, this is a deferred or delayed insert
> technique to allow batching. I would prefer if everything used one
> description, rather than fast, pending, delayed etc.

The terminology was taken from the inverted index technique.

-- 
Teodor Sigaev                 E-mail: teo...@sigaev.ru
                              WWW: http://www.sigaev.ru/
Re: [HACKERS] GIN fast insert
> it be? So far we've ruled out using the planner to prevent index scans
> when the pending list is long (because it's not reliable) and cleaning
> up the pending list during insert when needed (because it won't work
> with Hot Standby). We haven't decided what WILL work,

Cleanup during insert will work with Hot Standby, because there is no
limitation there on the number of pages touched or WAL records. The
problem is with cleanup invoked by gingettuple: the slave cannot start the
cleanup process at all, and hence it could emit an error if the tidbitmap
becomes lossy.

-- 
Teodor Sigaev                 E-mail: teo...@sigaev.ru
                              WWW: http://www.sigaev.ru/
Re: [HACKERS] GIN fast insert
On Thu, Feb 26, 2009 at 11:41 AM, Teodor Sigaev teo...@sigaev.ru wrote:
>> it be? So far we've ruled out using the planner to prevent index scans
>> when the pending list is long (because it's not reliable) and cleaning
>> up the pending list during insert when needed (because it won't work
>> with Hot Standby). We haven't decided what WILL work,
> During insert it will work with Hot Standby because there is no any
> limitation for number of pages touched or WAL records. There is a
> problem with cleanup invoked by gingettuple - slave could not start
> cleanup process at all and hence it could emit an error if tidbitmap
> becomes lossy.

I didn't think about that option - might be reasonable.

...Robert
Re: [HACKERS] GIN fast insert
Robert Haas robertmh...@gmail.com writes:
> On my system this takes about 45 ms to execute with default settings
> and about 90 ms to execute with index scan disabled.

[ shrug... ] That's well within my threshold of pain for this. In any
case, it might be possible to buy some/all of that back with minor
optimization effort on the bitmap-scan code paths; nobody's ever really
bothered to profile that AFAIK. There is no real difference in the useful
work (page and tuple fetches) getting done in the two cases, so there's no
reason in principle for bitmap scan to be much slower than indexscan here.
The LIMIT case is the only one I'm aware of where there's a fundamental
reason that bitmap scan should be slower.

regards, tom lane
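[Editor's note: a hedged sketch of why LIMIT is the fundamentally bad case
for bitmap scans. An ordinary index scan can emit its first tuple after
inspecting only the entries up to the first match, while a bitmap scan
must finish building the whole bitmap before anything can be returned. The
work counters below are an illustration, not planner cost arithmetic.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Entries inspected by an index scan to satisfy LIMIT 1: it can stream
 * tuples out, so it stops as soon as the first match is found. */
static int
indexscan_limit1_work(const bool *match, size_t n)
{
    int work = 0;

    for (size_t i = 0; i < n; i++)
    {
        work++;
        if (match[i])
            break;              /* first tuple returned; stop early */
    }
    return work;
}

/* A bitmap scan must visit every entry to build the complete bitmap
 * before the first tuple can be returned, LIMIT or no LIMIT. */
static int
bitmapscan_limit1_work(const bool *match, size_t n)
{
    (void) match;
    return (int) n;             /* full pass to build the bitmap */
}
```

The same shape explains Robert's semi/anti-join observation downthread: a
semi or anti join behaves like an iterated LIMIT 1, so it benefits from
the early stop in the same way.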
Re: [HACKERS] GIN fast insert
On Tue, Feb 24, 2009 at 10:39 AM, Tom Lane t...@sss.pgh.pa.us wrote:
> Robert Haas robertmh...@gmail.com writes:
>> On my system this takes about 45 ms to execute with default settings
>> and about 90 ms to execute with index scan disabled.
> [ shrug... ] That's well within my threshold of pain for this. In any
> case, it might be possible to buy some/all of that back with minor
> optimization effort on the bitmap-scan code paths; nobody's ever really
> bothered to profile that AFAIK. There is no real difference in the
> useful work (page and tuple fetches) getting done in the two cases, so
> there's no reason in principle for bitmap scan to be much slower than
> indexscan here. The LIMIT case is the only one I'm aware of where
> there's a fundamental reason that bitmap scan should be slower.

Uh, a semi or anti join stops as soon as one matching row is found,
doesn't it? ISTM that a semi or anti join is in essence an iterated
LIMIT 1 clause.

...Robert
Re: [HACKERS] GIN fast insert
On Tue, 2009-02-24 at 00:18 -0500, Robert Haas wrote:
> It only took me about 5 minutes to come up with a test case against CVS
> HEAD where disabling index scans resulted in a significant dropoff in
> performance. Here it is:

On the other hand, Teodor showed a typical use case and a very substantial
performance gain:
http://archives.postgresql.org/message-id/497f4606.6070...@sigaev.ru

My impression is that GIN is used almost entirely for full text search.
Even though GIN is actually quite general (as the BTree patch shows), the
current users we have to worry about are a fairly narrow group (correct me
if I'm wrong here).

I wonder how many people really use GIN with non-bitmap scans for some
benefit? And even if the benefit exists, does the planner have a way to
identify those cases reliably, or does it have to be done manually?

Regards, Jeff Davis
Re: [HACKERS] GIN fast insert
Jeff Davis pg...@j-davis.com writes:
> On Tue, 2009-02-24 at 00:18 -0500, Robert Haas wrote:
>> It only took me about 5 minutes to come up with a test case against
>> CVS HEAD where disabling index scans resulted in a significant dropoff
>> in performance. Here it is:
> On the other hand, Teodor showed a typical use case and a very
> substantial performance gain:

Yeah. Whatever we do here is a tradeoff (and whether Robert likes it or
not, reliability and code maintainability weigh heavily in the tradeoff).

> I wonder how many people really use GIN with non-bitmap scans for some
> benefit? And even if the benefit exists, does the planner have a way to
> identify those cases reliably, or does it have to be done manually?

A relevant point there is that most of the estimator functions for
GIN-amenable operators are just smoke and mirrors; so if the planner is
making a good choice between indexscan and bitmapscan at all, it's mostly
luck. This might get better someday, but not in 8.4.

regards, tom lane
Re: [HACKERS] GIN fast insert
On Tue, Feb 24, 2009 at 2:56 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>> On the other hand, Teodor showed a typical use case and a very
>> substantial performance gain:
> Yeah. Whatever we do here is a tradeoff (and whether Robert likes it or
> not, reliability and code maintainability weigh heavily in the
> tradeoff).

I have no problem with reliability or code maintainability, and I'm not
sure what I said that would give that impression. If the consensus of the
group is that the performance loss from dropping index scans is not
important, then I'm fine with that, especially if that consensus is
reached in the context of an educated knowledge of what that performance
loss is likely to be. To me, a 2x slowdown on a two-table anti-join seems
pretty bad, but I just work here.

Perhaps nobody else thinks that a semi-join or anti-join against a GIN
index is a plausible use case (like: find all of the words from the
following list that do not appear in any document)? If everyone agrees
that we don't care about that case (or about ORDER-BY-without-LIMIT, which
is certainly less compelling), then go ahead and remove it. I have no
horse in this race other than having been asked to review the patch, which
I did. On the other hand, if a significant number of people think that it
might be a bad idea to make that case significantly worse, then some
redesign work is called for, and that may mean the patch needs to get
bumped.

My own opinion is that it is better to decide on the right design and then
figure out which release that design can go into than it is to start by
deciding this has to go into 8.4 and then figuring out what can be done in
that period of time. I don't think there is any question that making GIN
continue to support both index scans and bitmap index scans will make the
code more complex, but how bad will it be? So far we've ruled out using
the planner to prevent index scans when the pending list is long (because
it's not reliable) and cleaning up the pending list during insert when
needed (because it won't work with Hot Standby). We haven't decided what
WILL work, apart from ripping out index scans altogether, so to some
degree we're comparing against an unknown.

> I wonder how many people really use GIN with non-bitmap scans for some
> benefit? And even if the benefit exists, does the planner have a way to
> identify those cases reliably, or does it have to be done manually?
> A relevant point there is that most of the estimator functions for
> GIN-amenable operators are just smoke and mirrors; so if the planner is
> making a good choice between indexscan and bitmapscan at all, it's
> mostly luck. This might get better someday, but not in 8.4.

Based on the limited testing I've done thus far, it appears to pick an
index scan for small numbers of rows and a bitmap index scan for larger
numbers of rows. Index scans will have lower startup costs, which can be
valuable if you only need to scan part of the index (as in the semi and
anti join cases). I haven't done enough testing to see if there is any
benefit when scanning the whole index and only returning a few tuples.

...Robert
Re: [HACKERS] GIN fast insert
On Thu, 2009-02-19 at 22:43 -0500, Robert Haas wrote:
>> I don't see a problems here, because indexes in postgres don't depend
>> on any transaction's ids or modes as heap depends. WAL-logger works
>> without that knowledge too. May be I missed something here or don't
>> understand. Although heap's pages could be changed by setting
>> commit-status bits on tuple even in read-only transaction, but that
>> changes are not WAL-logged. That is correct for index's page too.
> It would be helpful if Heikki or Simon could jump in here, but my
> understanding is that cleaning up the pending list is a read-write
> operation. I don't think we can do that on a hot standby server.

From reading the docs with the patch, the pending list is merged into the
main index when a VACUUM is performed. I (think I) can see that additions
to the pending list are WAL-logged, so that will work in Hot Standby. I
also see ginEntryInsert() calls during the move from pending list to main
index, which means that is WAL-logged also. AFAICS this *could* work
during Hot Standby mode. Best check here
http://wiki.postgresql.org/wiki/Hot_Standby#Usage rather than attempt to
read the patch.

Teodor, can you confirm
* we WAL log the insert into the pending list
* we WAL log the move from the pending list to the main index
* that we maintain the pending list correctly during redo so that it can
  be accessed by index scans

The main thing with Hot Standby is that we can't do any writes. So a
pending list cannot change solely because of a gingettuple call on the
*standby*. But that's easy to disable. If all the inserts happened on the
primary node and all the reads happened on the standby, then the pending
list would never be cleaned up if the cleanup is triggered only by read. I
would suggest that we trigger cleanup by read at threshold size X and
trigger cleanup by insert at threshold size 5X. That avoids the strange
case mentioned, but generally ensures only reads trigger cleanup. (But why
do we want that??)

I found many parts of the patch and docs quite confusing because of the
way things are named. For me, this is a deferred or delayed insert
technique to allow batching. I would prefer if everything used one
description, rather than fast, pending, delayed etc.

Personally, I see ginInsertCleanup() as a scheduled task unrelated to
vacuum. Making the deferred tasks happen at vacuum time is just a
convenient way of having a background task occur regularly. That's OK for
now, but I would like to be able to request a background task without
having to hook into AV.

-- 
Simon Riggs    www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Re: [HACKERS] GIN fast insert
On Mon, Feb 23, 2009 at 4:56 AM, Simon Riggs si...@2ndquadrant.com wrote:
>> It would be helpful if Heikki or Simon could jump in here, but my
>> understanding is that cleaning up the pending list is a read-write
>> operation. I don't think we can do that on a hot standby server.
> From reading the docs with the patch the pending list is merged into
> the main index when a VACUUM is performed. I (think I) can see that
> additions to the pending list are WAL logged, so that will work in Hot
> Standby. I also see ginEntryInsert() calls during the move from pending
> list to main index which means that is WAL logged also. AFAICS this
> *could* work during Hot Standby mode. Best check here
> http://wiki.postgresql.org/wiki/Hot_Standby#Usage rather than attempt
> to read the patch.
>
> The main thing with Hot Standby is that we can't do any writes. So a
> pending list cannot change solely because of a gingettuple call on the
> *standby*.

That's what I thought. Thanks for confirming.

> But that's easy to disable. If all the inserts happened on the primary
> node and all the reads happened on the standby, then pending list would
> never be cleaned up if the cleanup is triggered only by read.

No, because the inserts would trigger VACUUM on the primary.

> I would suggest that we trigger cleanup by read at threshold size X and
> trigger cleanup by insert at threshold size 5X. That avoids the strange
> case mentioned, but generally ensures only reads trigger cleanup. (But
> why do we want that??)

I think that's actually not what we want. What we want is for VACUUM to
deal with it. Unfortunately that's hard to guarantee since, for example,
someone might turn autovacuum off. So the issue is what we do when we're
in the midst of an index scan and our TIDBitmap has become lossy. Right
now, the answer is that we clean up the pending list from inside the index
scan and then retry the index scan. I don't think that's going to work.
I'm starting to think that the right thing to do here is to create a
non-lossy option for TIDBitmap. Tom has been advocating just losing the
index scan AM altogether, but that risks losing performance in cases where
a LIMIT would have stopped the scan well prior to completion.

> I found many parts of the patch and docs quite confusing because of the
> way things are named. For me, this is a deferred or delayed insert
> technique to allow batching. I would prefer if everything used one
> description, rather than fast, pending, delayed etc.

I mentioned this in my previous review (perhaps not quite so articulately)
and I completely agree with you. It's clear enough reading the patch,
because you know that all the changes in the patch must be related to each
other, but once it's applied it's going to be tough to figure out.

> Personally, I see ginInsertCleanup() as a scheduled task unrelated to
> vacuum. Making the deferred tasks happen at vacuum time is just a
> convenient way of having a background task occur regularly. That's OK
> for now, but I would like to be able to request a background task
> without having to hook into AV.

This has been discussed previously, and I assume you will be submitting a
patch at some point, since no one else has volunteered to implement it. I
think autovacuum is the right way to handle this particular case, because
it is a cleanup operation that is not dependent on time but on write
activity and hooks into more or less the same stats infrastructure; but I
don't deny the existence of other cases that would benefit from a
scheduler.

...Robert
Re: [HACKERS] GIN fast insert
Robert Haas robertmh...@gmail.com writes:
> I'm starting to think that the right thing to do here is to create a
> non-lossy option for TIDBitmap. Tom has been advocating just losing the
> index scan AM altogether, but that risks losing performance in cases
> where a LIMIT would have stopped the scan well prior to completion.

Actually, I'm going to *insist* that we lose the index AM scan altogether.
There might be a possibility to put it back in 8.5 or later, if anyone
actually makes the case (with some evidence) that it's worth the trouble.
But right now, what this patch needs is to be made to work reliably, and
having a great deal of complexity added by an inessential feature is a
good way to make sure it doesn't go in at all.

regards, tom lane
Re: [HACKERS] GIN fast insert
On Mon, Feb 23, 2009 at 10:05 AM, Tom Lane t...@sss.pgh.pa.us wrote:
> Robert Haas robertmh...@gmail.com writes:
>> I'm starting to think that the right thing to do here is to create a
>> non-lossy option for TIDBitmap. Tom has been advocating just losing
>> the index scan AM altogether, but that risks losing performance in
>> cases where a LIMIT would have stopped the scan well prior to
>> completion.
> Actually, I'm going to *insist* that we lose the index AM scan
> altogether. There might be a possibility to put it back in 8.5 or later
> if anyone actually makes the case (with some evidence) that it's worth
> the trouble. But right now, what this patch needs is to be made to work
> reliably, and having a great deal of complexity added by an inessential
> feature is a good way to make sure it doesn't go in at all.

Except that the inessential feature in question is a feature that
currently WORKS, and I don't believe that the testing you've done is
anywhere near sufficient to show that no one will be upset if it goes
away. Without some convincing evidence to support that proposition, I
think it would be better to postpone the whole patch to 8.5 and use that
time to fix the problem, rather than slam something through now and then
possibly find out that we've introduced a performance regression in cases
that used to work well.

In fact, as far as I can see, the performance testing that has been done
on this patch to date is fairly one-sided. It focuses on the improvement
to index build speed, which is a good thing to test, but I don't believe
anyone has done any testing at all of the cases where the patch might
HURT. Previously, the relevant case was where someone has done a bunch of
fast-inserts but autovacuum has not kicked in yet, or the threshold to
trigger autovacuum has not been reached. Now we're adding to that every
query that currently uses an index scan, which seemed to be a pretty
common plan in the simple cases I tested. Maybe I'm worrying over nothing?

...Robert
Re: [HACKERS] GIN fast insert
Robert Haas robertmh...@gmail.com writes:
> On Mon, Feb 23, 2009 at 10:05 AM, Tom Lane t...@sss.pgh.pa.us wrote:
>> Actually, I'm going to *insist* that we lose the index AM scan
>> altogether.
> Except that the inessential feature in question is a feature that
> currently WORKS, and I don't believe that the testing you've done is
> anywhere near sufficient to show that no one will be upset if it goes
> away.

What feature is that --- the ability to get an undefined subset of rows
quickly by using LIMIT without ORDER BY? Not much of a feature.

> Without some convincing evidence to support that proposition, I think
> it would be better to postpone the whole patch to 8.5 and use that time
> to fix the problem,

Wouldn't bother me any. We are way overdue for 8.4 already.

regards, tom lane
Re: [HACKERS] GIN fast insert
On Mon, Feb 23, 2009 at 1:35 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Mon, Feb 23, 2009 at 10:05 AM, Tom Lane t...@sss.pgh.pa.us wrote: Actually, I'm going to *insist* that we lose the index AM scan altogether. Except that the inessential feature in question is a feature that currently WORKS, and I don't believe that the testing you've done is anywhere near sufficient to show that no one will be upset if it goes away. What feature is that --- the ability to get an undefined subset of rows quickly by using LIMIT without ORDER BY? Not much of a feature. That's a huge over-generalization of the effect of removing index scans, as I am sure you are well aware. It only took me about 5 minutes to come up with a test case against CVS HEAD where disabling index scans resulted in a significant dropoff in performance. Here it is: create table foo (id serial, x int[], primary key (id)); create index foo_gin on foo using gin (x); insert into foo (x) select array[random()*1::integer, random()*1::integer, random()*1::integer, random()*1::integer] from generate_series(1,1); analyze foo; OK, now here's the query: select sum(1) from generate_series(1,1) g left join foo on array[g] @ x where x is null; On my system this takes about 45 ms to execute with default settings and about 90 ms to execute with index scan disabled. Without some convincing evidence to support that proposition, I think it would be better to postpone the whole patch to 8.5 and use that time to fix the problem, Wouldn't bother me any. We are way overdue for 8.4 already. 
I completely agree, but there are four patches on the CommitFest wiki that still need some committer attention before we close up shop: B-Tree Emulation for GIN, Improve Performance of Multi-Batch Hash Join for Skewed Data Sets, Proposal of PITR performance improvement, and SE-PostgreSQL Lite. I have already reviewed the second of these and believe it's in pretty good shape. I may review one or more of the others as time permits, especially if you or one of the other committers think that would be helpful, and especially if you say which of them it would be most helpful for. It is well past time to move Reducing some DDL Locks to ShareLock to committed and leave the unapplied portions for 8.5. As much as I would like to have the feature, it is probably also time to think about punting Hot Standby unless it's going to get committed RSN. At this point, we are definitely holding up both the release of 8.4 and development for 8.5. ...Robert
Re: [HACKERS] GIN fast insert
Right, can't do that on a hot standby server. Is there an applicable hot standby patch anywhere? The last version on the wiki is 9g, and it can't be applied cleanly.
Re: [HACKERS] GIN fast insert
Teodor Sigaev wrote: Right, can't do that on a hot standby server. Is anywhere applicable hot standby patch? Last version on wiki is 9g and it can't be applied cleanly. The latest version is in Simon's git repository at: http://git.postgresql.org/?p=~sriggs/simon.git;a=shortlog;h=refs/heads/dev_hot_standby in the dev_hot_standby branch. I don't think he's posted an updated patch based on that work. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] GIN fast insert
and I still think that's bogus. It seems clear that this is going to change much faster than plans are going to be invalidated, and if autovacuum is doing its job, the pending list won't get long enough to matter much anyway, right? I don't think this patch should be touching gincostestimate at all. Why not? For any estimation function it's possible to invent a workload that changes the cost (or the optimal plan) much faster than plans are invalidated. gincostestimate depends on the table's statistics, not on the real state of the index, and plan invalidation occurs after an analyze run. I am thinking that it may not be necessary to remove the gingettuple interface (as you did in v0.28.2). Forcing a cleanup of the pending list seems like a reasonable workaround. We don't expect this situation to come up frequently, so if the method we use to Agree handle it is not terribly efficient, oh well. The one thing that concerns me is - what will happen in a hot standby environment, when that patch is committed? In that situation, I believe that we can't modify any heap or index pages, so... I don't see a problem here, because indexes in Postgres don't depend on transaction IDs or modes the way the heap does. The WAL logger works without that knowledge too. Maybe I've missed or misunderstood something. Heap pages can be changed by setting commit-status bits on tuples even in a read-only transaction, but those changes are not WAL-logged; the same is true for index pages. 1. The description of the fastupdate reloption should be reworded for consistency with other options: Enables fast update feature for this GIN index Fixed 2. Why is this implemented as a string reloption rather than a boolean reloption? It seems like you want to model this on autovacuum_enabled. Fixed 3. Documentation wordsmithing. You have the following paragraph: Done 4. More wordsmithing. In the following paragraph, you have: Done 5.
In textsearch.sgml, you say that GIN indexes are moderately slower to update, but about 10x slower without fastupdate. Can we provide a real number in place of moderately? I don't know whether to think this means 20% or 2x. I'll make exact measurements a bit later :) 6. One of the comment changes in startScanEntry is simply a correction of a typographical error (deletion for deletition). You might as well commit this change separately and remove it from this patch. Sorry, I can't find what you mean. I found only begining (and fixed it). 7. pg_stat_get_fresh_inserted_tuples. I am not crazy about the fact that we call this the pending list in some places, fast update in some places, and now, here, fresh tuples. Let's just call it fast insert tuples. It really is the count of tuples inserted into the table (fresh tuples) since the last vacuum. This number doesn't depend on the existence of GIN indexes or their modes. 8. tbm_check_tuple. The opening curly brace should be uncuddled. The curly braces around wordnum = bitnum = 0 are superfluous. It's code copied from tbm_add_tuples, and I much prefer if { ... } else { ... } to if .. else { }. 9. gincostestimate. There are a lot of superfluous whitespace changes here, and some debugging code that obviously wasn't fully removed. Cleaned. 10. GinPageOpaqueData. Surely we can come up with a better name than GIN_LIST. This is yet another name for the same concept. Why not call this GIN_FAST_INSERT_LIST? It matches the other defines here: GIN_DATA, not GIN_DATA_PAGE; GIN_LEAF, not GIN_LEAF_TREE_PAGE. 11. ginInsertCleanup. Inserion is a typo. Fixed. Unfortunately, I don't understand all of this patch well enough to give it as thorough a going-over as it deserves, so my apologies for whatever I've missed. Thank you very much for your review.
fast_insert_gin-0.29.1.gz Description: Unix tar archive
Re: [HACKERS] GIN fast insert
On Thu, Feb 19, 2009 at 8:36 AM, Teodor Sigaev teo...@sigaev.ru wrote: and I still think that's bogus. It seems clear that this is going to change much faster than plans are going to be invalidated, and if autovacuum is doing its job, the pending list won't get long enough to matter much anyway, right? I don't think this patch should be touching gincostestimate at all. Why not? For any estimation function it's possible to invent workload which will change cost (or optimal plan) much faster than plan's invalidation. Gincostestimate depends on statistic on table, not on real index's state, and plan's invalidation will occur after analyze run. Hrm, hum, maybe you're right. I am thinking that it is may not be necessary to remove the gingettuple interface (as you did in v0.28.2). Forcing a cleanup of the pending list seems like a reasonable workaround. We don't expect this situation to come up frequently, so if the method we use to Agree handle it is not terribly efficient, oh well. The one thing that concerns me is - what will happen in a hot standby environment, when that patch is committed? In that situation, I believe that we can't call modify any heap or index pages, so... I don't see a problems here, because indexes in postgres don't depend on any transaction's ids or modes as heap depends. WAL-logger works without that knowledge too. May be I missed something here or don't understand. Although heap's pages could be changed by setting commit-status bits on tuple even in read-only transaction, but that changes are not WAL-logged. That is correct for index's page too. It would be helpful if Heikki or Simon could jump in here, but my understanding is that cleaning up the pending list is a read-write operation. I don't think we can do that on a hot standby server. ...Robert
Re: [HACKERS] GIN fast insert
Robert Haas wrote: On Thu, Feb 19, 2009 at 8:36 AM, Teodor Sigaev teo...@sigaev.ru wrote: handle it is not terribly efficient, oh well. The one thing that concerns me is - what will happen in a hot standby environment, when that patch is committed? In that situation, I believe that we can't call modify any heap or index pages, so... I don't see a problems here, because indexes in postgres don't depend on any transaction's ids or modes as heap depends. WAL-logger works without that knowledge too. May be I missed something here or don't understand. Although heap's pages could be changed by setting commit-status bits on tuple even in read-only transaction, but that changes are not WAL-logged. That is correct for index's page too. It would be helpful if Heikki or Simon could jump in here, but my understanding is that cleaning up the pending list is a read-write operation. I don't think we can do that on a hot standby server. Right, can't do that on a hot standby server.
Re: [HACKERS] GIN fast insert
On Tue, Feb 17, 2009 at 2:28 PM, Teodor Sigaev teo...@sigaev.ru wrote: Hi there, we present two variants of GIN fast insert patch, since we're not sure what is the best solution: v0.28.1 - remove disable cost in gincostestimate - per http://archives.postgresql.org/message-id/499466d2.4010...@sigaev.ru gingettuple could force cleanup of pending list if it got a lossy tidbitmap. If such cleanup occurs the gingettuple will rescan pending list again. v0.28.2 - remove disable cost in gincostestimate - per http://archives.postgresql.org/message-id/12795.1234379...@sss.pgh.pa.us AM can now have only one search method: amgettuple or amgetbitmap. - GIN now has only amgetbitmap interface I reviewed v0.28.1. I see that disable_cost is gone from gincostestimate, but you're still using the pending list to set costs, and I still think that's bogus. It seems clear that this is going to change much faster than plans are going to be invalidated, and if autovacuum is doing its job, the pending list won't get long enough to matter much anyway, right? I don't think this patch should be touching gincostestimate at all. I am thinking that it may not be necessary to remove the gingettuple interface (as you did in v0.28.2). Forcing a cleanup of the pending list seems like a reasonable workaround. We don't expect this situation to come up frequently, so if the method we use to handle it is not terribly efficient, oh well. The one thing that concerns me is - what will happen in a hot standby environment, when that patch is committed? In that situation, I believe that we can't modify any heap or index pages, so... Some other assorted minor comments on v0.28.1... 1. The description of the fastupdate reloption should be reworded for consistency with other options: Enables fast update feature for this GIN index 2. Why is this implemented as a string reloption rather than a boolean reloption? It seems like you want to model this on autovacuum_enabled. 3. Documentation wordsmithing.
You have the following paragraph: As of productnamePostgreSQL/productname 8.4, this problem is alleviated by postponing most of the work until the next commandVACUUM/. Newly inserted index entries are temporarily stored in an unsorted list of pending entries. commandVACUUM/ inserts all pending entries into the main acronymGIN/acronym index data structure, using the same bulk insert techniques used during initial index creation. This greatly improves acronymGIN/acronym index update speed, even counting the additional vacuum overhead. Here is my proposed rewrite: As of productnamePostgreSQL/productname 8.4, acronymGIN/ is capable of postponing much of this work by inserting new tuples into a temporary, unsorted list of pending entries. When the table is vacuumed, or in some cases when the pending list becomes too large, the entries are moved to the main acronymGIN/acronym data structure using the same bulk insert techniques used during initial index creation. This greatly improves acronymGIN/acronym index update speed, even counting the additional vacuum overhead. 4. More wordsmithing. In the following paragraph, you have: It's recommended to use properly-configured autovacuum with tables having acronymGIN/acronym indexes, to keep this overhead to reasonable levels. I think it is clearer and more succinct to write simply: Proper use of autovacuum can minimize this problem. 5. In textsearch.sgml, you say that GIN indexes are moderately slower to update, but about 10x slower without fastupdate. Can we provide a real number in place of moderately? I don't know whether to think this means 20% or 2x. 6. One of the comment changes in startScanEntry is simply a correction of a typographical error (deletion for deletition). You might as well commit this change separately and remove it from this patch. 7. pg_stat_get_fresh_inserted_tuples. I am not crazy about the fact that we call this the pending list in some places, fast update in some places, and now, here, fresh tuples.
Let's just call it fast insert tuples. 8. tbm_check_tuple. The opening curly brace should be uncuddled. The curly braces around wordnum = bitnum = 0 are superfluous. 9. gincostestimate. There are a lot of superfluous whitespace changes here, and some debugging code that obviously wasn't fully removed. 10. GinPageOpaqueData. Surely we can come up with a better name than GIN_LIST. This is yet another name for the same concept. Why not call this GIN_FAST_INSERT_LIST? 11. ginInsertCleanup. Inserion is a typo. Unfortunately, I don't understand all of this patch well enough to give it as thorough a going-over as it deserves, so my apologies for whatever I've missed. ...Robert
Re: [HACKERS] GIN fast insert
So? Barring some evidence that there's a significant performance win from a conventional indexscan, this is a weak argument. AFAICS the only significant advantage of the conventional API is to support ordered scans, and GIN doesn't do that anyway. What about SELECT ... AND EXISTS (SELECT ... t @@ query) ? But I don't believe that is a popular use-case. In most cases, GIN is used with a bitmap scan. Your emails are convincing, so I'll remove support for the amgettuple interface in GIN. Do you think we need to add a new pg_am boolean option, like pg_am.amcangettuple or pg_am.amcanpertuplescan?
Re: [HACKERS] GIN fast insert
Teodor Sigaev wrote: So? Barring some evidence that there's a significant performance win from a conventional indexscan, this is a weak argument. AFAICS the only significant advantage of the conventional API is to support ordered scans, and GIN doesn't do that anyway. What about SELECT ... AND EXISTS (SELECT ... t @@ query) ? But I don't believe that is popular use-case. In most cases, GIN is used with bitmap scan. Your emails are so convincing and I'll remove support amgettuple interface in GIN. SELECT * FROM foo WHERE t @@ query LIMIT 100 might be a fairly common use case.
Re: [HACKERS] GIN fast insert
SELECT * FROM foo WHERE t @@ query LIMIT 100 might be a fairly common use case. It seems to me - SELECT * FROM foo WHERE t @@ query *ORDER BY rank* LIMIT 100;
Re: [HACKERS] GIN fast insert
On Fri, Feb 13, 2009 at 8:00 AM, Teodor Sigaev teo...@sigaev.ru wrote: SELECT * FROM foo WHERE t @@ query LIMIT 100 might be a fairly common use case. It seems to me - SELECT * FROM foo WHERE t @@ query *ORDER BY rank* LIMIT 100; I'm not sure you can really assume that the ORDER BY rank will always be in there. ...Robert
Re: [HACKERS] GIN fast insert
Teodor Sigaev teo...@sigaev.ru writes: Do you think we need to add new pg_am boolean option? Like pg_am.amcangettuple or pg_am.amcanpertuplescan I think it's sufficient to mark this by setting amgettuple to zero. We might as well allow amgetbitmap to be zero, too, to mark the case where the AM doesn't want to support bitmap scans. regards, tom lane
Re: [HACKERS] GIN fast insert
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: Teodor Sigaev wrote: But I don't believe that is popular use-case. In most cases, GIN is used with bitmap scan. Your emails are so convincing and I'll remove support amgettuple interface in GIN. SELECT * FROM foo WHERE t @@ query LIMIT 100 might be a fairly common use case. [ shrug... ] The proposed implementation fails to be particularly fast-start anyway, since it will process the entire pending queue before returning anything to the executor. regards, tom lane
Re: [HACKERS] GIN fast insert
[ shrug... ] The proposed implementation fails to be particularly fast-start anyway, since it will process the entire pending queue before returning anything to the executor. That is not true for fastupdate=off. In any case it could be improved, but improvements don't solve the issue with the lossy bitmap.
Re: [HACKERS] GIN fast insert database hang
This freezes the whole system even with autovacuum = off in postgresql.conf. As before, the backends wait on a semop() call. Fixed. There was a deadlock between LockBufferForCleanup and LockBuffer(SHARE). I redesigned that place to downgrade LockBufferForCleanup to LockBuffer(EXCLUSIVE), with a correction of the page locking during the scan of the pending list. I was able to reproduce the recovery failure this way once as well, but that part of the problem seems to be much more erratic. Most of Fixed too. I had missed this comment on XLogInsert: * NB: this routine feels free to scribble on the XLogRecData structs, * though not on the data they reference. This is OK since the XLogRecData * structs are always just temporaries in the calling code. and I was reusing a once-initialized XLogRecData many times in a loop. fast_insert_gin-0.27.gz Description: Unix tar archive
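The XLogInsert comment quoted above names exactly this bug class: a callee that scribbles on the record struct it is handed, combined with a caller that initializes the struct once and reuses it across loop iterations. The following is a hypothetical Python analogue (all names invented; the real XLogRecData handling is C code in PostgreSQL), just to illustrate why only the first iteration behaves correctly:

```python
# Analogue of the XLogRecData reuse bug: the callee mutates its argument,
# so a caller that reuses one record object sees the scribbles next time.

def xlog_insert(rec):
    """Like XLogInsert, this feels free to scribble on the struct."""
    rec["next"] = None            # callee mutates its argument
    rec["len"] = 0
    return rec["data"]

def buggy_caller(pages):
    rec = {"data": None, "len": 8, "next": "chain"}   # initialized once
    logged = []
    for page in pages:
        rec["data"] = page
        # From the second iteration on, rec still carries the callee's
        # scribbles from the previous call (len == 0, next is None).
        logged.append((rec["len"], xlog_insert(rec)))
    return logged

def fixed_caller(pages):
    logged = []
    for page in pages:
        rec = {"data": page, "len": 8, "next": "chain"}  # re-init per call
        logged.append((rec["len"], xlog_insert(rec)))
    return logged

assert buggy_caller(["A", "B"]) == [(8, "A"), (0, "B")]   # corrupted state
assert fixed_caller(["A", "B"]) == [(8, "A"), (8, "B")]
```

The fix mirrors the one described in the message: re-initialize the record structure on every iteration instead of reusing it.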
Re: [HACKERS] GIN fast insert
But the *real* problem is that you simply can not guarantee that someone doesn't increase the size of the pending list between the time If the inserting process has a bigger work_mem. Agreed. What did you think of the idea of simply abandoning support for conventional indexscans in GIN? I agree that we could probably kluge something to make conventional scans still work reliably, but it seems to me that it'd be ugly, fragile, and quite possibly slow enough to not ever beat bitmap scans anyway. I don't like this idea because it forbids conventional indexscans even with fastupdate=off. May a read-only query change the index? The index doesn't use xmin/xmax/cmin/cmax anyhow, so it doesn't depend on transaction state. If so, gingettuple could clean up the pending list if it got a lossy bitmap, and then repeat the search. Although that could be slow, it would never produce failures, it would happen very rarely (and GIN could emit a WARNING/NOTICE/LOG message), and this solution allows removing the disabling of indexscans in gincostestimate.
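Teodor's fallback idea — force a cleanup of the pending list whenever a scan would get a lossy bitmap, then rescan — can be sketched as a retry loop. This is a hypothetical sketch with invented names and a dict standing in for index state, not the actual gingettuple code:

```python
# Sketch of the proposed fallback: if scanning the pending list would
# yield a lossy bitmap, force a cleanup and rescan. All names invented.

class LossyBitmapError(Exception):
    """Raised when the pending-list bitmap would degrade to lossy."""

def scan_with_pending_list(index):
    if index["pending"] > index["non_lossy_limit"]:
        raise LossyBitmapError
    return list(range(index["pending"]))   # stand-in for exact TIDs

def cleanup_pending_list(index):
    index["pending"] = 0    # entries moved into the main GIN structure

def gingettuple_like_scan(index):
    while True:
        try:
            return scan_with_pending_list(index)
        except LossyBitmapError:
            # Rare and possibly slow, but never a hard failure:
            # move pending entries into the main index and retry.
            cleanup_pending_list(index)

idx = {"pending": 500, "non_lossy_limit": 100}
tids = gingettuple_like_scan(idx)
assert idx["pending"] == 0   # cleanup was forced before the rescan
assert tids == []            # pending entries now live in the main index
```

The point of the loop is the guarantee Teodor describes: the scan may pay a one-time cleanup cost, but it can never fail outright because of a lossy bitmap.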
Re: [HACKERS] GIN fast insert
Teodor Sigaev teo...@sigaev.ru writes: What did you think of the idea of simply abandoning support for conventional indexscans in GIN? I don't like this idea because it forbids conventional indexscans even with fastupdate=off. So? Barring some evidence that there's a significant performance win from a conventional indexscan, this is a weak argument. AFAICS the only significant advantage of the conventional API is to support ordered scans, and GIN doesn't do that anyway. regards, tom lane
Re: [HACKERS] GIN fast insert
On Thu, Feb 12, 2009 at 1:42 PM, Tom Lane t...@sss.pgh.pa.us wrote: Teodor Sigaev teo...@sigaev.ru writes: What did you think of the idea of simply abandoning support for conventional indexscans in GIN? I don't like this idea because it forbids conventional indexscans even with fastupdate=off. So? Barring some evidence that there's a significant performance win from a conventional indexscan, this is a weak argument. AFAICS the only significant advantage of the conventional API is to support ordered scans, and GIN doesn't do that anyway. Wouldn't it force you to recheck all tuples on the page, instead of just rechecking the one of interest? ...Robert
Re: [HACKERS] GIN fast insert
Robert Haas robertmh...@gmail.com writes: On Thu, Feb 12, 2009 at 1:42 PM, Tom Lane t...@sss.pgh.pa.us wrote: So? Barring some evidence that there's a significant performance win from a conventional indexscan, this is a weak argument. AFAICS the only significant advantage of the conventional API is to support ordered scans, and GIN doesn't do that anyway. Wouldn't it force you to recheck all tuples on the page, instead of just rechecking the one of interest? In the scenario at hand you'd have to do that anyway. Bear in mind that if the query is predicted to return more than a few rows, the planner is always going to pick bitmap scan anyway. So this whole issue is really only going to arise when you have a very bad rowcount prediction (or a very stale plan), leading to a choice of indexscan plan followed by enough rows actually returned to make the TID bitmap become lossy. That's certainly within the realm of possibility, particularly since we often don't have good estimators for GIN-compatible operators. But I think designing to squeeze every last bit of performance out of the case is a mistake. We should be satisfied to have correctness. In the end this is a tradeoff: how much complexity and loss of maintainability are we willing to accept to squeeze out a bit more performance? I'm leaning to the KISS end of that choice. The tests I did yesterday suggested to me that it would be difficult even to measure a performance gain from supporting conventional indexscan in GIN. IMHO the kinds of schemes that are being tossed around here are not remotely sane to choose if they don't lead to *big* wins. regards, tom lane
Re: [HACKERS] GIN fast insert
What would be wrong with letting it degrade to lossy? I suppose the reason it's trying to avoid that is to avoid having to recheck all the rows on that page when it comes time to do the index insertion; but surely having to do that is better than having arbitrary, unpredictable failure conditions. No, I don't think that's it. See here, beginning with the problem with lossy tbm has two aspects: http://archives.postgresql.org/message-id/4974b002.3040...@sigaev.ru Right. Some comments on those points: - The amgettuple interface has no way to work with a page-wide result instead of an exact ItemPointer; amgettuple cannot return just a block number as amgetbitmap can. It's not so difficult to teach GIN to return ItemPointers one by one from the pending list and eliminate this point: GIN would not collect matched ItemPointers in a tidbitmap but would return them immediately. But: - Because of a concurrent vacuum process, while we scan the pending list its content could be transferred into the regular structure of the index, and then we would find the same tuple twice. Again, amgettuple has no protection from that; only amgetbitmap has it. So, we need to filter results from the regular GIN structure by results from the pending list, and for that filtering we can't use a lossy tbm. Again, we are talking about the amgettuple interface. We need to filter results from the regular GIN structure by results from the pending list. Right now GIN does it by looking up each matched ItemPointer in a tidbitmap constructed from the pending list. We could use an ordered array of ItemPointers instead of a tidbitmap, but I don't believe it would take less memory. It's impossible to rescan the pending list, for two reasons: a) it's too slow, and b) a concurrent cleanup process (vacuum, or now insert) could move tuples into the regular structure. Would it be acceptable to add an option to tidbitmap that forbids it from becoming lossy? That would remove the disabling of index scans in gincostestimate and the check for a non-lossy tidbitmap.
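The reason a lossy tbm cannot be used for this filtering can be shown with a toy model. The class below is an invented, much-simplified stand-in for PostgreSQL's tidbitmap (the real code works with page chunks and WordNum bitmaps; nothing here matches its actual API): once a page degrades to lossy, individual offsets are discarded, so a membership test can only answer "maybe" for that page, and using it to filter main-index results against pending-list results would report false matches for every tuple on the lossy page.

```python
# Toy model of a TID bitmap, illustrating why lossy pages break filtering.
# This is NOT PostgreSQL's tidbitmap.c; names and the memory limit are invented.

class ToyTidBitmap:
    def __init__(self, max_exact_pages=2):
        self.exact = {}          # page -> set of tuple offsets
        self.lossy = set()       # pages whose offsets were discarded
        self.max_exact_pages = max_exact_pages

    def add(self, page, offset):
        if page in self.lossy:
            return
        self.exact.setdefault(page, set()).add(offset)
        # Over the memory budget: degrade the fattest page to lossy.
        while len(self.exact) > self.max_exact_pages:
            victim = max(self.exact, key=lambda p: len(self.exact[p]))
            del self.exact[victim]
            self.lossy.add(victim)

    def contains(self, page, offset):
        if page in self.lossy:
            return True          # can only say "maybe": offsets are gone
        return offset in self.exact.get(page, set())

tbm = ToyTidBitmap(max_exact_pages=2)
for page, off in [(1, 1), (1, 2), (2, 5), (3, 7)]:   # 3 pages > budget of 2
    tbm.add(page, off)

# Page 1 went lossy, so a tuple that was never inserted now "matches":
assert tbm.contains(1, 99)      # false positive -- why exact filtering breaks
assert not tbm.contains(3, 8)   # exact pages still answer precisely
```

This is exactly why the message asks for a tidbitmap option that forbids going lossy: a filter needs exact per-tuple answers, which a lossy page cannot give.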
In the current version, cleanup of the pending list starts much earlier than the non-lossy limit is reached in typical use-cases. An inserting process will start cleanup when at least one trigger fires: - the number of heap tuples in the pending list could produce a lossy tidbitmap; - the total size of the pending list is greater than work_mem. The second trigger was designed to speed up bulk insertion (which is used in cleanup), because cleanup will collect the whole pending list in memory at once. It is also stricter than the first one, because in the typical GIN use-case a heap tuple is rather big. I believe a user could get GIN's error about work_mem only intentionally: - turn off autovacuum - set a big work_mem - populate the table with a GIN index (by the needed number of insertions) - prepare a query which will return a lot of results (possibly with seqscan=off, because the cost of scanning the pending list grows fast) - decrease work_mem by at least ten times - execute the query
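The two cleanup triggers described above can be summarized in a small sketch. The constants and function names below are invented for illustration (the actual thresholds live in the GIN fast-insert code and differ); the point is only that an insert kicks off cleanup when either condition holds:

```python
# Hypothetical sketch of the two pending-list cleanup triggers described
# above (invented names and numbers; not the actual GIN code).

BYTES_PER_EXACT_TUPLE = 6   # assumed per-TID cost of an exact bitmap entry

def should_start_cleanup(pending_tuples, pending_list_bytes, work_mem_bytes):
    """An insert kicks off pending-list cleanup if either trigger fires."""
    # Trigger 1: enough pending tuples that a bitmap built over the
    # pending list might not fit exactly in work_mem -> could go lossy.
    could_go_lossy = pending_tuples * BYTES_PER_EXACT_TUPLE > work_mem_bytes
    # Trigger 2: the pending list itself no longer fits in work_mem,
    # so bulk insertion could not slurp it into memory in one pass.
    too_big_for_bulk = pending_list_bytes > work_mem_bytes
    return could_go_lossy or too_big_for_bulk

work_mem = 1024 * 1024  # 1 MB
assert not should_start_cleanup(1000, 500 * 1024, work_mem)
assert should_start_cleanup(1000, 2 * 1024 * 1024, work_mem)   # trigger 2
assert should_start_cleanup(500_000, 500 * 1024, work_mem)     # trigger 1
```

Because pending heap tuples are usually much larger than a bitmap entry, the size trigger fires long before the lossy trigger in typical use, which is the claim made above.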
Re: [HACKERS] GIN fast insert
I believe that user could get GIN's error about work_mem only intentionally: - turn off autovacuum Meanwhile, in the other thread, we're having a discussion about people wanting to do exactly this on a database-wide basis during peak load hours... - set big work_mem - populate table with GIN index (by needed number of insertion) - prepare query which will return a lot of results (possibly, with seqscan=off because cost of scan of pending list grows fast) - decrease work_mem for at least ten times - execute query Why would the new work_mem need to be 10x smaller than the old work_mem? ...Robert
Re: [HACKERS] GIN fast insert
Robert Haas wrote: I believe that user could get GIN's error about work_mem only intentionally: - turn off autovacuum Meanwhile, in the other thread, we're having a discussion about people wanting to do exactly this on a database-wide basis during peak load hours... - decrease work_mem for at least ten times - execute query Why would the new work_mem need to be 10x smaller than the old work mem? That is the way to get GIN's error emitted: work_mem should be decreased to have a chance of getting a lossy tidbitmap.
Re: [HACKERS] GIN fast insert
Teodor Sigaev teo...@sigaev.ru writes: Robert Haas wrote: Why would the new work_mem need to be 10x smaller than the old work mem? That is is way to get GIN's error emitted. Work_mem should be decreased to catch a chance to get lossy tidbitmap. But it only has to be marginally lower, not 10x lower. And there are plenty of scenarios where different backends might be running with different work_mem settings. But the *real* problem is that you simply can not guarantee that someone doesn't increase the size of the pending list between the time that someone else commits to an indexscan plan and the time that they execute that plan. This scheme will result in random failures for concurrent inserts/selects, and that's not acceptable. What did you think of the idea of simply abandoning support for conventional indexscans in GIN? I agree that we could probably kluge something to make conventional scans still work reliably, but it seems to me that it'd be ugly, fragile, and quite possibly slow enough to not ever beat bitmap scans anyway. regards, tom lane
Re: [HACKERS] GIN fast insert database hang
On Wed, Feb 11, 2009 at 10:03 PM, Robert Haas robertmh...@gmail.com wrote: I'm going to try to reproduce this, but here's approximately what I did. OK, I've managed to build a reproducible test case. Initial setup is just as I had before: create table foo (id serial, x int[], primary key (id)); create index foo_gin on foo using gin (x); Then just start these two commands running in different windows and wait: while true; do psql -c explain analyze select sum(1) from foo where array[1] @ x; done while true; do psql -c insert into foo (x) select array[(random() * 100)::int, (random() * 90)::int, (random()*80)::int] from generate_series(1,10);; done I did this four times, sometimes with the variant of adding set enable_bitmapscan to false; before the explain analyze. In three cases the database recovered successfully after the immediate shutdown; in the other case, it crapped out much as described in my original email. ...Robert
Re: [HACKERS] GIN fast insert database hang
I did this four times, sometimes with the variant of adding set enable_bitmapscan to false; before the explain analyze. In three cases the database recovered successfully after the immediate shutdown; in the other case, it crapped out much as described in my original email. Sorry to keep replying to myself, but I've figured out that autovacuum is not required to trigger this bug. In fact, I can reliably trigger it much more quickly just by starting two concurrent copies of: psql -c insert into foo (x) select array[(random() * 100)::int, (random() * 90)::int, (random()*80)::int] from generate_series(1,10); This freezes the whole system even with autovacuum = off in postgresql.conf. As before, the backends wait on a semop() call. I was able to reproduce the recovery failure this way once as well, but that part of the problem seems to be much more erratic. Most of the time after an immediate shutdown, pg_ctl start triggers recovery followed by normal startup. ...Robert
Re: [HACKERS] GIN fast insert
Robert Haas robertmh...@gmail.com writes:
> I think this is related to the problems with gincostestimate() that Tom Lane was complaining about here:
> http://archives.postgresql.org/message-id/20441.1234209...@sss.pgh.pa.us
> I am not 100% sure I'm understanding this correctly, but I think the reason why gincostestimate() is so desperate to avoid index scans when the pending list is long is because it knows that scanFastInsert() will blow up if an index scan is actually attempted, because of the aforementioned TIDBitmap problem. This seems unacceptably fragile.

Yipes. If that's really the reason then I agree, it's a nonstarter.

> I think this code needs to be somehow rewritten to make things degrade gracefully when the pending list is long - I'm not sure what the best way to do that is. Inventing a new data structure to store TIDs that is never lossy seems like it might work, but you'd have to think about what to do if it got too big.

What would be wrong with letting it degrade to lossy? I suppose the reason it's trying to avoid that is to avoid having to recheck all the rows on that page when it comes time to do the index insertion; but surely having to do that is better than having arbitrary, unpredictable failure conditions.

It strikes me that part of the issue here is that the behavior of this code is much better adapted to the bitmap-scan API than the traditional indexscan API. Since GIN doesn't support ordered scan anyway, I wonder whether it wouldn't be more sensible to simply allow it to not offer the traditional API. It should be easy to make the planner ignore plain indexscan plans for an AM that didn't support them.

regards, tom lane
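[Editor's note: for readers following along, here is a toy Python model of what "degrade to lossy" means for a TID bitmap. This is not PostgreSQL's actual C code from tidbitmap.c; the class and method names are hypothetical, and the eviction policy is simplified. The point it illustrates is the one under discussion: once the bitmap exceeds its memory budget it stops remembering individual tuple offsets for some pages and records only "this page has at least one match", which is exactly what forces the per-page recheck Tom mentions.]

```python
# Toy model (hypothetical names, not PostgreSQL's tidbitmap.c) of a TID
# bitmap that degrades pages to "lossy" under a memory budget.

class TidBitmap:
    def __init__(self, max_exact_pages):
        self.max_exact_pages = max_exact_pages
        self.exact = {}      # page number -> set of exact tuple offsets
        self.lossy = set()   # pages where only "something matched" is known

    def add_tuple(self, page, offset):
        if page in self.lossy:
            return           # page already lossy; offsets are lost anyway
        self.exact.setdefault(page, set()).add(offset)
        if len(self.exact) > self.max_exact_pages:
            self._lossify_one()

    def _lossify_one(self):
        # Over budget: degrade one exact page to a lossy marker.
        page, _offsets = self.exact.popitem()
        self.lossy.add(page)

    def needs_recheck(self, page):
        # A lossy page forces the executor to recheck every tuple on it.
        return page in self.lossy

bm = TidBitmap(max_exact_pages=2)
for page, off in [(1, 10), (1, 11), (2, 5), (3, 7)]:
    bm.add_tuple(page, off)

# Adding a third page exceeded the budget, so one page went lossy:
assert len(bm.exact) == 2 and len(bm.lossy) == 1
```

The trade-off in the thread follows directly: lossy pages bound memory use gracefully, but a consumer that needs exact TIDs (as the pending-list scan apparently did) cannot tolerate them, which is the fragility being objected to.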
Re: [HACKERS] GIN fast insert
On Tue, Feb 10, 2009 at 10:38 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>> I think this code needs to be somehow rewritten to make things degrade gracefully when the pending list is long - I'm not sure what the best way to do that is. Inventing a new data structure to store TIDs that is never lossy seems like it might work, but you'd have to think about what to do if it got too big.
> What would be wrong with letting it degrade to lossy? I suppose the reason it's trying to avoid that is to avoid having to recheck all the rows on that page when it comes time to do the index insertion; but surely having to do that is better than having arbitrary, unpredictable failure conditions.

No, I don't think that's it. See here, beginning with "the problem with lossy tbm has two aspects":

http://archives.postgresql.org/message-id/4974b002.3040...@sigaev.ru

> It strikes me that part of the issue here is that the behavior of this code is much better adapted to the bitmap-scan API than the traditional indexscan API. Since GIN doesn't support ordered scan anyway, I wonder whether it wouldn't be more sensible to simply allow it to not offer the traditional API. It should be easy to make the planner ignore plain indexscan plans for an AM that didn't support them.

If that doesn't lead to a performance degradation, I think it would be a good idea. It certainly seems like it would allow this patch to be a LOT simpler, cleaner, and more robust.

...Robert
Re: [HACKERS] GIN fast insert
Robert Haas robertmh...@gmail.com writes:
> On Tue, Feb 10, 2009 at 10:38 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>> It strikes me that part of the issue here is that the behavior of this code is much better adapted to the bitmap-scan API than the traditional indexscan API. Since GIN doesn't support ordered scan anyway, I wonder whether it wouldn't be more sensible to simply allow it to not offer the traditional API. It should be easy to make the planner ignore plain indexscan plans for an AM that didn't support them.
> If that doesn't lead to a performance degradation, I think it would be a good idea.

For queries that select only a single index entry, there might be some speed degradation, but a quick test suggests it's in the single-digit-percentage range if everything's cached; and of course if you have to go to disk then the extra CPU cycles to push a bitmap around are lost in the noise.

In any case, as a wise man once said, "I can make this code arbitrarily fast, if it doesn't have to give the right answer." Blowing up in easily foreseeable circumstances isn't my idea of giving the right answer.

regards, tom lane
Re: [HACKERS] GIN fast insert
On Tue, Feb 10, 2009 at 11:18 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> For queries that select only a single index entry, there might be some speed degradation, but a quick test suggests it's in the single-digit-percentage range if everything's cached; and of course if you have to go to disk then the extra CPU cycles to push a bitmap around are lost in the noise. In any case, as a wise man once said, "I can make this code arbitrarily fast, if it doesn't have to give the right answer." Blowing up in easily foreseeable circumstances isn't my idea of giving the right answer.

Sure, but dropping index-scan support is not the only way of fixing that problem. It has a certain elegance to it, but if it produces a perceptible slowdown, it may not be the best approach.

...Robert
Re: [HACKERS] gin fast insert performance
> Here is a test of the fast insert patch. The patch has gone through some changes, so this set of tests is to see the current performance impact compared with HEAD. The test is simple: inserting a bunch of integer arrays into a table with a GIN index on the array column. I'm testing with small work_mem and large work_mem because the smaller work_mem should be periodically flushing the pending list throughout a large insert, while large work_mem should allow larger batches to build up before flushing the pending list. For HEAD, larger work_mem should have no effect.

You didn't provide distributions of array elements, number of unique elements, and so on. So I made a simple test script which generates data rather close to a typical tsearch installation (see tst.sql). Results on my notebook (in seconds):

               fastupdate |  insert  | vacuum
    ----------+-----------+----------+--------
     10 rows  |    off    |  316.147 |  0.770
              |    on     |   65.461 | 12.795
     100 rows |    off    | 16 hours |   -
              |    on     | 6612.595 | 12.795

I stopped the test with fastupdate=off and one million rows - it ran too long :). Changes in postgresql.conf:

    shared_buffers=128MB
    temp_buffers=16MB
    work_mem=16MB
    maintenance_work_mem=256MB
    effective_cache_size=1024MB
    autovacuum=off

The fastest way is to create the table, fill it, create the index, and vacuum it (for 10 records):

    17 secs to insert
    27 secs to create an index
     1 sec  to vacuum

So, in summary, it takes 45 secs that way, versus 78 secs with fast update and 317 secs without fast update. I think it's a win in performance.

> With the fast insert patch, the total time for insert + vacuum isn't much different with increased work_mem, but increased work_mem clearly defers a lot of the work to VACUUM.

Increased work_mem clearly *may* defer a lot of the work to VACUUM - because in the real world it's impossible to predict when clearing of the pending list will be started, and autovacuum will usually fire the clearing earlier than the pending list reaches the limit.
> So, unfortunately, the fast insert patch does not appear to bring the overall time anywhere close to building the index from scratch. When work_mem is set to 1GB, the VACUUM took about twice as long to run as the entire index build. Teodor, can you explain why that might be?

Yeah, creation of an index is much more optimizable than sequential insertions. With fast update enabled there is the overhead of writing the pending pages (and WAL too), and those pages must then be read back to collect the data into memory. Next, the clearing process uses work_mem instead of maintenance_work_mem, which is usually greater. Also, the bulk-insertion algorithm (used in index creation and in pending-list cleanup too) likes tids at the end of the table: if the lowest tid to insert is greater than the latest tid in the current tree, the algorithm can insert more than one tid at once.

> It does show improvement, and I think my test case might just be too small.

If the dataset is bigger, the improvement is better :).

> It seems like a lot of time is used inserting into the pending list, but it seems like it should be a simple operation. Maybe that can be optimized?

As you can see (ginfast.c), insertion into the pending list can leave pages not fully filled; in the worst case the pending list will contain about 50% unused space (if every indexed value takes GIN_PAGE_FREESIZE+1 bytes, then each value takes two pages). This is the price of keeping concurrency at a high level :(. If you have an idea how to do compact, fast and concurrent insertion into the pending list (or another structure) while keeping search time reasonable, please don't be quiet :)

--
Teodor Sigaev   E-mail: teo...@sigaev.ru
WWW: http://www.sigaev.ru/
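[Editor's note: Teodor's worst-case figure - each value just over a page, so each value claims two pages and roughly half the space is wasted - can be checked with simple arithmetic. Below is a toy Python model; the page size and the assumption that an oversized value's pages are not shared with the next value are hypothetical simplifications, not the real constants or layout from PostgreSQL's ginblock.h.]

```python
# Toy model of pending-list page filling (hypothetical sizes, not the
# real GIN_PAGE_FREESIZE from PostgreSQL).  In this worst-case model a
# value occupies whole pages and cannot share its last partial page
# with the next value.
import math

PAGE_FREESIZE = 8000  # usable bytes per pending-list page (assumed)

def pages_for_value(size):
    # Number of pending-list pages a single value of `size` bytes claims.
    return math.ceil(size / PAGE_FREESIZE)

# The worst case from the thread: every value is one byte over a page.
size = PAGE_FREESIZE + 1
pages = pages_for_value(size)                  # two pages per value
utilization = size / (pages * PAGE_FREESIZE)   # fraction actually used

assert pages == 2
assert 0.50 < utilization < 0.51  # i.e. about 50% of the space is unused
```

Which is the ~50% waste Teodor describes: the space overhead is bounded, but it is real, and it is accepted deliberately to keep pending-list insertion highly concurrent.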
Re: [HACKERS] gin fast insert performance
Sorry, I lost the test script - here it is. BTW, is btree_gin ready to commit in your opinion?

--
Teodor Sigaev   E-mail: teo...@sigaev.ru
WWW: http://www.sigaev.ru/

    CREATE OR REPLACE FUNCTION gena() RETURNS _int4 AS $$
        SELECT array(
            SELECT (10*random())::int
            FROM generate_series( 0, 2 + (100*random())::int )
        );
    $$ LANGUAGE SQL VOLATILE;

    \echo FU = off =
    DROP TABLE IF EXISTS ta;
    CREATE TABLE ta ( a int[] );
    CREATE INDEX taidx ON ta USING gin (a) WITH (fastupdate=off);
    INSERT INTO ta (SELECT gena() FROM generate_series(1,10));
    VACUUM ANALYZE ta;

    \echo FU = on =
    DROP TABLE IF EXISTS ta;
    CREATE TABLE ta ( a int[] );
    CREATE INDEX taidx ON ta USING gin (a) WITH (fastupdate=on);
    INSERT INTO ta (SELECT gena() FROM generate_series(1,10));
    VACUUM ANALYZE ta;
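[Editor's note: a rough Python model of the data that the script's gena() function produces may help readers see why it resembles tsearch-style data: arrays of roughly 3 to 103 elements drawn from only 11 distinct values, so a few "lexemes" recur in almost every row. This is an approximation of the SQL, not part of the original script; PostgreSQL's ::int cast rounds, so both endpoints of the value range are reachable.]

```python
# Rough Python model of the test script's gena() SQL function:
# each row gets an array of small integers in 0..10, with array
# length around 3..103 (from generate_series(0, 2 + (100*random())::int)).
import random

def gena():
    length = 3 + int(100 * random.random())       # ~3..103 elements
    return [round(10 * random.random()) for _ in range(length)]

rows = [gena() for _ in range(1000)]
assert all(3 <= len(r) <= 103 for r in rows)      # lengths in range
assert all(0 <= v <= 10 for r in rows for v in r) # only 11 distinct values
```

Data skewed this way (long arrays, few frequent elements) exercises GIN posting lists quite differently from Jeff's test of identical arrays, which may account for the differing results in this thread.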
Re: [HACKERS] gin fast insert performance
On Tue, 2009-01-27 at 20:36 +0300, Teodor Sigaev wrote:
> You didn't provide distributions of array elements, number of unique elements, and so on. So I made a simple test script which generates data rather close to a typical tsearch installation (see tst.sql).

The arrays I was inserting were actually all identical. In the case of a 1000-element array inserted 1 times, it was just ARRAY[1, 2, ..., 1000]. My test case must have been much too simple, but I expected that it would still benefit from fast insert.

> Increased work_mem clearly *may* defer a lot of the work to VACUUM - because in the real world it's impossible to predict when clearing of the pending list will be started, and autovacuum will usually fire the clearing earlier than the pending list reaches the limit.

Yes, that is the expected result and part of the design. It was just an observation, not a criticism. I will try with a better test case.

Regards,
Jeff Davis