But the real bottom line is: if autovacuum is working properly, it
should clean up the index before the list ever gets to the point where
it'd be sane to turn off indexscans. So I don't see why we need to hack
the planner for this at all. If any hacking is needed, it should be
in the direction
Jeff Davis pg...@j-davis.com writes:
On Wed, 2009-02-04 at 14:40 -0500, Robert Haas wrote:
Well, there's nothing to force that plan to be invalidated when the
state of the pending list changes, is there?
Would it be unreasonable to invalidate cached plans during the pending
list cleanup?
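Something along these lines might be enough, if a relcache inval on the index is sufficient to make the plan cache replan -- a sketch only, not tested, and the assumption that the plan cache tracks index OIDs is mine:

    /* Hypothetical sketch: fire a relcache invalidation on the index when
     * gininsertcleanup flushes the pending list, so cached plans built
     * against the old pending-list state get rebuilt on next use. */
    CacheInvalidateRelcache(indexRelation);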
I looked at this a little bit --- it needs proofreading (VACUUME?).
Sorry, VACUUME fixed
Do we really need an additional column in pgstat table entries in
order to store something that looks like it can be derived from the
other columns? The stats tables are way too big already.
It's not
On Mon, 2009-02-02 at 20:38 -0500, Tom Lane wrote:
Also, I really think it's a pretty bad idea to make index cost
estimation depend on the current state of the index's pending list
--- that state seems far too transient to base plan choices on.
I'm confused by this. Don't we want to base the
On Wed, 2009-02-04 at 14:40 -0500, Robert Haas wrote:
Well, there's nothing to force that plan to be invalidated when the
state of the pending list changes, is there?
Would it be unreasonable to invalidate cached plans during the pending
list cleanup?
Anyway, it just strikes me as strange to
Teodor Sigaev teo...@sigaev.ru writes:
I'm very sorry, but v0.24 has a silly bug with an uninitialized value :(.
New version is attached
Attachment: fast_insert_gin-0.25.gz
BTW, gincostestimate could use that information for cost estimation, but is opening the index and reading the metapage in amcostestimate acceptable?
That sounds reasonable to me. I think that's what the index-specific
cost estimators are for.
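For what it's worth, a minimal sketch of what that could look like inside gincostestimate (treat it as an assumption about the shape, not the patch's actual code):

    /* Sketch: read pending-list statistics from the GIN metapage inside
     * the cost estimator. Assumes AccessShareLock is enough here, since
     * the planner already has the index open for planning purposes. */
    Relation        indexRel = index_open(index->indexoid, AccessShareLock);
    Buffer          metabuffer = ReadBuffer(indexRel, GIN_METAPAGE_BLKNO);
    GinMetaPageData *metadata;
    BlockNumber     nPendingPages;

    LockBuffer(metabuffer, GIN_SHARE);
    metadata = GinPageGetMeta(BufferGetPage(metabuffer));
    nPendingPages = metadata->nPendingPages;
    UnlockReleaseBuffer(metabuffer);
    index_close(indexRel, AccessShareLock);
    /* then fold nPendingPages into the estimated scan cost */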
Done.
Do you expect a performance impact?
I'm afraid
- after the limit is reached, force a cleanup of the pending list by calling
gininsertcleanup. Not very good, because users will sometimes see a huge
execution time for a simple insert. Although users who run a huge update should be
satisfied.
I'm having difficulty choosing an approach. It seems to me the
On Wed, 2009-01-21 at 15:06 +0300, Teodor Sigaev wrote:
Done. Now GIN counts the number of pending tuples and pages and stores them on the
metapage. Index cleanup can start during normal insertion in two cases:
- the number of pending tuples is too high to keep the tidbitmap guaranteed non-lossy
- pending
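In other words, roughly this shape (the names here are illustrative assumptions, not the patch's identifiers):

    /* Sketch of the two cleanup triggers described above. The metapage
     * fields are as described; the limit variables are placeholders. */
    static bool
    needPendingCleanup(GinMetaPageData *metadata,
                       long maxNonLossyTuples,      /* tidbitmap limit */
                       BlockNumber maxPendingPages) /* pending-list limit */
    {
        /* tidbitmap could no longer be kept guaranteed non-lossy */
        if (metadata->nPendingHeapTuples > maxNonLossyTuples)
            return true;
        /* pending list has outgrown its allowed size */
        if (metadata->nPendingPages > maxPendingPages)
            return true;
        return false;
    }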
On Mon, 2009-01-19 at 19:53 +0300, Teodor Sigaev wrote:
I see only two guaranteed solutions to the problem:
- after the limit is reached, force normal index inserts. One of the motivations for the
patch was a frequent question from users: why is an update of a whole table with a GIN
index so slow? So this
Changes:
Results of the pending list's scan are now placed directly into the resulting
tidbitmap. This saves the cycles spent filtering results and reduces memory usage.
It also avoids having to check the lossiness of the tbm.
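That is, each match from the pending-list scan can go straight into the caller's bitmap; something like this one-liner, assuming the amgetbitmap path (a sketch, not the patch's exact call):

    /* Add a matching heap TID directly to the result bitmap; recheck is
     * set so the executor re-verifies the qual against the heap tuple. */
    tbm_add_tuples(tbm, &itup->t_tid, 1, true);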
Is this a 100% bulletproof solution, or is it still possible for a query
to fail due to
Teodor Sigaev wrote:
New version. Changes:
- synced with current CVS
I notice you added a fillfactor reloption in ginoptions, but does it
really make sense? I recall removing it because the original code
contained a comment saying it was there because default_reloptions
wants it, but it has no effect.
I didn't change the recognition of the fillfactor value, although GIN doesn't use it for now.
Alvaro Herrera alvhe...@commandprompt.com writes:
I suggest you take StdRdOptions out of the GinOptions struct, and leave
fillfactor out of ginoptions. I don't think there's much point in
supporting options that don't actually do anything. If the user tries
to set fillfactor for a gin index, he will get an error. Which is a
good thing IMHO.
On Fri, 2009-01-16 at 15:39 +0300, Teodor Sigaev wrote:
START_CRIT_SECTION();
...
l = PageAddItem(...);
if (l == InvalidOffsetNumber)
    elog(ERROR, "failed to add item to index page in \"%s\"",
         RelationGetRelationName(index));
It's no use using ERROR here, because inside a critical section it will turn into a PANIC.
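The usual convention elsewhere in the backend is to escalate deliberately, since nothing inside a critical section is recoverable -- a hedged sketch of that pattern, assuming the free space was verified before entering the section:

    /* Sketch: inside a critical section, failure to add the item must be
     * treated as unrecoverable, so report it at PANIC level explicitly. */
    START_CRIT_SECTION();
    l = PageAddItem(page, (Item) itup, IndexTupleSize(itup),
                    InvalidOffsetNumber, false, false);
    if (l == InvalidOffsetNumber)
        elog(PANIC, "failed to add item to index page in \"%s\"",
             RelationGetRelationName(index));
    END_CRIT_SECTION();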
New version. Changes:
- synced with current CVS
- added all your changes
- autovacuum will run if fast update mode is turned on and
the fresh-tuple trigger fires
- gincostestimate now tries to calculate the cost of scanning the pending pages.
gincostestimate sets disable_cost if it believes that
Changes:
- added vacuum_delay_point() in gininsertcleanup
- added a trigger to run vacuum based on the number of tuples inserted
since the last vacuum. That number represents the tuples actually
inserted into the index, and is calculated as
tuples_inserted + tuples_updated -
On 3 Dec 2008, at 06:57 AM, Heikki Linnakangas [EMAIL PROTECTED]
wrote:
Tom Lane wrote:
Heikki Linnakangas [EMAIL PROTECTED] writes:
If *that* is a use case we're interested in, the incoming tuples
could be accumulated in backend-private memory, and inserted into
the index at commit.
Greg Stark [EMAIL PROTECTED] writes:
If we do this though it would be really nice to do it at a higher
level than the indexam. If we could do it for any indexam that
provides a kind of bulk insert method that would be great.
I'm just not sure how to support all the indexable operators for
Teodor Sigaev wrote:
- Falling back to regular insert will take a long time for an update of the whole
table - and that was one of the reasons for the patch. Users forget to drop the
GIN index before a global update, and the query runs forever.
If *that* is a use case we're interested in, the incoming tuples could
be accumulated in backend-private memory, and inserted into the index at commit.
grovel through. The situation will only be rectified at the next vacuum,
but if there's no deletes or updates on the table, just inserts,
autovacuum won't happen until the next anti-wraparound vacuum.
There is no agreement here, see
http://archives.postgresql.org/message-id/[EMAIL PROTECTED]
Tom Lane wrote:
Heikki Linnakangas [EMAIL PROTECTED] writes:
If *that* is a use case we're interested in, the incoming tuples could
be accumulated in backend-private memory, and inserted into the index at
commit. That would be a lot simpler, with no need to worry about
concurrent inserts or
There's a pretty fundamental issue with this patch, which is that while
buffering the inserts in the list pages makes the inserts fast, all
subsequent queries become slower until the tuples have been properly
inserted into the index. I'm sure it's a good tradeoff in many cases,
but there has
Heikki Linnakangas [EMAIL PROTECTED] writes:
I think we need a hard limit on the number of list pages before we can
consider accepting this patch. Once the limit is reached, the next inserter can
flush the list, inserting the tuples in the list into the tree, or just fall
back to regular,
Reworked version of fast insertion patch for GIN.
* shiftList does LockBufferForCleanup, which means that it can be blocked
for an indefinitely long time by a concurrent scan, and since it's holding
an exclusive lock on the metapage, no new scans or insertions can start
meanwhile. This is not only
(a) that's not back-patchable and (b) it'll create a merge conflict with
your patch, if you're still going to add a new AM function column.
I think that aminsertcleanup per se isn't needed, but if we want an
amanalyze there'd still be a conflict. Where are we on that?
I'll revert
* shiftList() holds an exclusive lock on metapage throughout its run,
which means that it's impossible for two of them to run concurrently.
So why bother with concurrent deletion detection?
Because the metapage is locked immediately before the shiftList call; while the
metapage is unlocked, another process
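To spell out the pattern being debated (an assumed shape based on the descriptions above, not the actual patch code):

    /* Sketch: shiftList serializes on the metapage lock, then waits for a
     * cleanup lock on the head pending page. A long-lived scan pin on
     * that page can therefore stall every new scan and insert. */
    LockBuffer(metabuffer, GIN_EXCLUSIVE);   /* only one shiftList at a time */
    buffer = ReadBuffer(index, metadata->head);
    LockBufferForCleanup(buffer);            /* may wait on a concurrent scan */
    /* unlink pending pages, update the metapage, write WAL */
    UnlockReleaseBuffer(buffer);
    UnlockReleaseBuffer(metabuffer);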
Teodor Sigaev [EMAIL PROTECTED] writes:
* There is a bigger race condition, which is that after a scan has
returned a tuple from a pending page, vacuum could move the index entry
into the main index structure, and then that same scan could return that
same index entry a second time. This is a
I wrote:
Really? Then GiST needs to be fixed too. Otherwise you risk having
queries return the same row twice. A bitmap indexscan plan would mask
such an index bug ... but a plain indexscan won't.
BTW, there's another issue I forgot about yesterday, which is that
the planner assumes that
operations such as page splits. Do we need to change the planner to
assume that this only works nicely for btree?
It seems to me that direction (backward or forward) has meaning only for indexes
with amcanorder = true. With amcanorder = false, the result order is essentially
arbitrary in either direction.
Fuh. :(. Well. Will fix.
GiST:
- GiST already supports both scan directions in theory, but a page split may
change the order between forward and backward scans (a user-defined pageSplit
doesn't preserve the order of tuples). Holding the split until the end of the
scan would produce unacceptable
once, for regular VACUUM I think you really have to call it within
each bulkdelete operation.
Exactly what I did in the last patch.
There's probably no point in optimizing
it away in VACUUM FULL either, since surely it'll be fast to call
index_cleanup when there's nothing in the pending list?
Teodor Sigaev [EMAIL PROTECTED] writes:
So, maybe we just move the insertcleanup call to ginbulkdelete/ginvacuumcleanup
but leave the aminsertcleanup field in pg_proc for the future.
I'd be inclined not to add the extra AM call if we aren't going to use
it now. There's no very good reason to think
I wrote:
Yeah, I was going to complain about that next :-). Autovacuum isn't
going to trigger as a result of INSERT operations; somehow we have
to teach it what to do for GIN indexes. I remember we discussed this
at PGCon but I don't think we decided exactly what to do...
One simple idea is
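One plausible shape for that check, using the existing pgstat counters (the threshold formula is an assumption, not a decided design):

    /* Hypothetical sketch: count tuples inserted since the last vacuum as
     * an autovacuum trigger. HOT updates are subtracted because they
     * don't insert into indexes. */
    float4 inserted_since_vacuum =
        tabentry->tuples_inserted + tabentry->tuples_updated
        - tabentry->tuples_hot_updated;

    if (inserted_since_vacuum > vac_base_thresh + vac_scale_factor * reltuples)
        dovacuum = true;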
Teodor Sigaev [EMAIL PROTECTED] writes:
Updated: http://www.sigaev.ru/misc/fast_insert_gin-0.9.gz
Here is the GIN fast-insert patch back again. Changes:
* Sync with CVS HEAD
* Clean up documentation and some of the code comments
* Fix up custom reloptions code
* Suppress some compiler warnings
Tom Lane wrote:
Teodor Sigaev [EMAIL PROTECTED] writes:
I didn't get much further than that because I got discouraged after
looking at the locking issues around the pending-insertions list.
It's a mess:
These are rather severe problems. Maybe there's a better solution, but
perhaps it would be good enough to lock out concurrent access to the
index while the bulkinsert procedure is working.
Alvaro Herrera [EMAIL PROTECTED] writes:
Ugh...
The idea I was toying
Teodor Sigaev [EMAIL PROTECTED] writes:
Updated: http://www.sigaev.ru/misc/fast_insert_gin-0.9.gz
I'm still not clear on whether the suggested aminsertcleanup call is
acceptable.
I started to look at this. I don't understand why VACUUM does an insert
cleanup before starting to vacuum, but VACUUM FULL doesn't?
Hmm. Maybe I missed something, but I don't understand where and what... I tried
to track all the places where ambulkdelete is called. aminsertcleanup should be called
Teodor Sigaev [EMAIL PROTECTED] writes:
It's close to trivial to revert this piece and add the cleanup call to
ginbulkdelete/ginvacuumcleanup. Early versions did it that way.
Yeah, I think we should do it that way.
On reflection I don't think you even need the amvacuumstartup call,
because it
Well, if that is required to be true then this whole design is pretty
broken, because VACUUM doesn't hold any lock that would guarantee that
no insert happens between the two calls. If we fold the two AM calls
into one call then it'd be okay for the index AM to take such a lock
transiently
Teodor Sigaev [EMAIL PROTECTED] writes:
http://www.sigaev.ru/misc/fast_insert_gin-0.7.gz
http://www.sigaev.ru/misc/multicolumn_gin-0.3.gz
I've committed the multicolumn one with minor revisions (fix some poor
English in docs and comments, add regression-test coverage). Do you
need more review of fast_insert yet? It looked like a number of people
commented on it already.
Thank you very much.
I should modify it to
On Tue, 2008-07-08 at 14:51 -0400, Tom Lane wrote:
I'd still like to take a look.
I was tasked with reviewing this for the current commit fest, although
so far I've just been working on grokking the rest of the GIN code. But
if you'd like to review it instead, that's fine with me.
-Neil
Neil,
I have plenty of other stuff I could assign you if you're not needed on
GIN.
I looked this over and it looks good in general.
May I take it that the patch has passed review, and commit it?
Teodor Sigaev wrote:
Synced with current CVS HEAD, and posted to -hackers too because -patches is
close to closing.
http://www.sigaev.ru/misc/multicolumn_gin-0.3.gz
I looked this over and it looks good in general. I was only wondering
about single-column indexes -- we're storing attribute numbers there too,
right?
No, the GinState->oneCol field tells GinFormTuple and
gin_index_getattr/gintuple_get_attrnum about the actual storage.
A single-column index is binary
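Roughly, the distinction is (a sketch inferred from the function names above, not copied from the patch):

    /* Sketch: in a multicolumn GIN index each entry tuple stores the
     * originating attribute number as its first column, with the key as
     * the second; a single-column index stores just the key datum. */
    if (ginstate->oneCol)
        attnum = FirstOffsetNumber;     /* only one possible attribute */
    else
        attnum = DatumGetUInt16(index_getattr(tuple, FirstOffsetNumber,
                                              ginstate->tupdesc[0], &isnull));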