On Tue, Mar 24, 2009 at 04:55:47PM -0700, Josh Berkus wrote:
> On 3/24/09 1:18 PM, Tom Lane wrote:
>> I've committed this patch with additional editorialization --- mostly
>> cosmetic changes, except for removing the stats-driven cleanup in favor
>> of letting cleanups occur during auto-ANALYZE, as
On 3/24/09 1:18 PM, Tom Lane wrote:
I've committed this patch with additional editorialization --- mostly
cosmetic changes, except for removing the stats-driven cleanup in favor
of letting cleanups occur during auto-ANALYZE, as per my suggestion
yesterday.
By my count, this was the last patch l
I've committed this patch with additional editorialization --- mostly
cosmetic changes, except for removing the stats-driven cleanup in favor
of letting cleanups occur during auto-ANALYZE, as per my suggestion
yesterday.
regards, tom lane
--
Sent via pgsql-hackers mailing
ginInsertCleanup(); is it accurate? (If it isn't, I think
Exactly, that is right.
* This can be called concurrently by multiple backends, so it must cope.
* On first glance it looks completely not concurrent-safe and not crash-safe
* either. The reason it's okay is that multiple inserti
Teodor Sigaev writes:
> Oops, I was wrong, I supposed that all pages in chunk should be lossy, but
> it's
> true only for chunk page. So, tbm_add_page() should only call
> tbm_mark_page_lossy()...
OK, thanks, that's what I thought. I've changed it in the copy I'm
editing here.
I have another
Why is tbm_add_page() not coded as a simple wrapper around
tbm_mark_page_lossy()? As coded, it sometimes forces a whole bunch of
pages *other than* the target page to become lossy too, which I cannot
see a reason for it to be doing.
[after digging in tidbitmap]
Oops, I was wrong, I supposed tha
Jeff Davis writes:
> On Mon, 2009-03-23 at 15:23 -0400, Tom Lane wrote:
>> There is no need for any such infrastructure if we just drive it off a
>> post-ANALYZE callback.
> That sounds reasonable, although it does seem a little strange for
> analyze to actually perform cleanup.
My thought was t
On Mon, 2009-03-23 at 15:23 -0400, Tom Lane wrote:
> There is no need for any such infrastructure if we just drive it off a
> post-ANALYZE callback.
That sounds reasonable, although it does seem a little strange for
analyze to actually perform cleanup.
Now that we have FSM, the cost of VACUUMing
Alvaro Herrera writes:
> Tom Lane wrote:
>> On top of those issues, there are implementation problems in the
>> proposed relation_has_pending_indexes() check:
> I wonder if it's workable to have GIN send pgstats a message with number
> of fast-inserted tuples, and have autovacuum check that numbe
While I'm looking at this thing...
Why is tbm_add_page() not coded as a simple wrapper around
tbm_mark_page_lossy()? As coded, it sometimes forces a whole bunch of
pages *other than* the target page to become lossy too, which I cannot
see a reason for it to be doing.
rega
Tom Lane wrote:
> On top of those issues, there are implementation problems in the
> proposed relation_has_pending_indexes() check: it has hard-wired
> knowledge about GIN indexes, which means the feature cannot be
> extended to add-on index AMs; and it's examining indexes without any
> lock whats
I'm looking again at the fast-insert patch, and I find myself still
desperately unhappy about the mechanism for scheduling autovacuum
cleanup of pending insertions. I complained about that before, but
I think I only cited a worry about adding overhead to statistics
tracking in order to have the "r
On Thu, Feb 26, 2009 at 11:41 AM, Teodor Sigaev wrote:
>> it be? So far we've ruled out using the planner to prevent index
>> scans when the pending list is long (because it's not reliable) and
>> cleaning up the pending list during insert when needed (because it
>> won't work with Hot Standby).
it be? So far we've ruled out using the planner to prevent index
scans when the pending list is long (because it's not reliable) and
cleaning up the pending list during insert when needed (because it
won't work with Hot Standby). We haven't decided what WILL work,
During insert it will work wi
Teodor, can you confirm
* we WAL log the insert into the pending list
* we WAL log the move from the pending list to the main index
Yes, I can and I confirm
* that we maintain the pending list correctly during redo so that it can
be accessed by index scans
Not sure about correct locking in gi
On Tue, Feb 24, 2009 at 2:56 PM, Tom Lane wrote:
>> On the other hand, Teodor showed a typical use case and a very
>> substantial performance gain:
>
> Yeah. Whatever we do here is a tradeoff (and whether Robert likes it
> or not, reliability and code maintainability weigh heavily in the
> tradeo
Jeff Davis writes:
> On Tue, 2009-02-24 at 00:18 -0500, Robert Haas wrote:
>> It only took me about 5 minutes to come up with a test case against CVS
>> HEAD where disabling index scans resulted in a significant dropoff in
>> performance. Here it is:
> On the other hand, Teodor showed a typical
On Tue, 2009-02-24 at 00:18 -0500, Robert Haas wrote:
> It only took me about 5 minutes to come up with a test case against CVS
> HEAD where disabling index scans resulted in a significant dropoff in
> performance. Here it is:
On the other hand, Teodor showed a typical use case and a very
subs
On Tue, Feb 24, 2009 at 10:39 AM, Tom Lane wrote:
> Robert Haas writes:
>> On my system this takes about 45 ms to execute with default settings
>> and about 90 ms to execute with index scan disabled.
>
> [ shrug... ] That's well within my threshold of pain for this.
> In any case, it might be po
Robert Haas writes:
> On my system this takes about 45 ms to execute with default settings
> and about 90 ms to execute with index scan disabled.
[ shrug... ] That's well within my threshold of pain for this.
In any case, it might be possible to buy some/all of that back with
minor optimization
On Mon, Feb 23, 2009 at 1:35 PM, Tom Lane wrote:
> Robert Haas writes:
>> On Mon, Feb 23, 2009 at 10:05 AM, Tom Lane wrote:
>>> Actually, I'm going to *insist* that we lose the index AM scan
>>> altogether.
>
>> Except that the "inessential" feature in question is a feature that
>> currently WOR
Robert Haas writes:
> On Mon, Feb 23, 2009 at 10:05 AM, Tom Lane wrote:
>> Actually, I'm going to *insist* that we lose the index AM scan
>> altogether.
> Except that the "inessential" feature in question is a feature that
> currently WORKS, and I don't believe that the testing you've done is
>
On Mon, Feb 23, 2009 at 10:05 AM, Tom Lane wrote:
> Robert Haas writes:
>> I'm starting to think that the right thing to do here is to create a
>> non-lossy option for TIDBitmap. Tom has been advocating just losing
>> the index scan AM altogether, but that risks losing performance in
>> cases wh
Robert Haas writes:
> I'm starting to think that the right thing to do here is to create a
> non-lossy option for TIDBitmap. Tom has been advocating just losing
> the index scan AM altogether, but that risks losing performance in
> cases where a LIMIT would have stopped the scan well prior to
> c
On Mon, Feb 23, 2009 at 4:56 AM, Simon Riggs wrote:
>> It would be helpful if Heikki or Simon could jump in here, but my
>> understanding is that cleaning up the pending list is a read-write
>> operation. I don't think we can do that on a hot standby server.
>
> >From reading the docs with the pa
On Thu, 2009-02-19 at 22:43 -0500, Robert Haas wrote:
> > I don't see a problems here, because indexes in postgres don't
> depend on any
> > transaction's ids or modes as heap depends. WAL-logger works without
> that
> > knowledge too. May be I missed something here or don't understand.
> >
> > Al
Teodor Sigaev wrote:
Right, can't do that on a hot standby server.
Is anywhere applicable hot standby patch? Last version on wiki is 9g and
it can't be applied cleanly.
The latest version is in Simon's git repository at:
http://git.postgresql.org/?p=~sriggs/simon.git;a=shortlog;h=refs/heads
Right, can't do that on a hot standby server.
Is anywhere applicable hot standby patch? Last version on wiki is 9g and it
can't be applied cleanly.
--
Teodor Sigaev E-mail: teo...@sigaev.ru
WWW: http://www.s
Robert Haas wrote:
On Thu, Feb 19, 2009 at 8:36 AM, Teodor Sigaev wrote:
handle it is not terribly efficient, oh well. The one thing that
concerns me is - what will happen in a hot standby environment, when
that patch is committed? In that situation, I believe that we can't
call modify any he
On Thu, Feb 19, 2009 at 8:36 AM, Teodor Sigaev wrote:
>> and I still think that's bogus. It seems clear that this is going to
>> change much faster than plans are going to be invalidated, and if
>> autovacuum is doing its job, the pending list won't get long enough to
>> matter much anyway, right
and I still think that's bogus. It seems clear that this is going to
change much faster than plans are going to be invalidated, and if
autovacuum is doing its job, the pending list won't get long enough to
matter much anyway, right? I don't think this patch should be
touching gincostestimate at
On Tue, Feb 17, 2009 at 2:28 PM, Teodor Sigaev wrote:
> Hi there,
>
> we present two variants of GIN fast insert patch, since we're not sure
> what is a the best solution:
>
> v0.28.1
> - remove disable cost in gincostestimate
> - per http://archives.postgresql.org/message-id/499466d2.4010...@siga
[ shrug... ] The proposed implementation fails to be particularly
fast-start anyway, since it will process the entire pending queue
before returning anything to the executor.
That is not true for fastupdate=off. But in any case it could be improved, but
improvements doesn't solve the issue wit
Heikki Linnakangas writes:
> Teodor Sigaev wrote:
>> But I don't believe that is popular use-case. In most cases, GIN is used
>> with bitmap scan. Your emails are so convincing and I'll remove support
>> amgettuple interface in GIN.
> SELECT * FROM foo WHERE t @@ query LIMIT 100
> might be a fa
Teodor Sigaev writes:
> Do you think we need to add new pg_am boolean option? Like
> pg_am.amcangettuple
> or pg_am.amcanpertuplescan
I think it's sufficient to mark this by setting amgettuple to zero.
We might as well allow amgetbitmap to be zero, too, to mark the case
where the AM doesn't wa
On Fri, Feb 13, 2009 at 8:00 AM, Teodor Sigaev wrote:
>> SELECT * FROM foo WHERE t @@ query LIMIT 100
>> might be a fairly common use case.
>
> It seems to me -
> SELECT * FROM foo WHERE t @@ query *ORDER BY rank* LIMIT 100;
I'm not sure you can really assume that the ORDER BY rank will always
be
SELECT * FROM foo WHERE t @@ query LIMIT 100
might be a fairly common use case.
It seems to me -
SELECT * FROM foo WHERE t @@ query *ORDER BY rank* LIMIT 100;
--
Teodor Sigaev E-mail: teo...@sigaev.ru
WWW: ht
Teodor Sigaev wrote:
So? Barring some evidence that there's a significant performance win
from a conventional indexscan, this is a weak argument. AFAICS the only
significant advantage of the conventional API is to support ordered
scans, and GIN doesn't do that anyway.
What about SELECT ... AN
So? Barring some evidence that there's a significant performance win
from a conventional indexscan, this is a weak argument. AFAICS the only
significant advantage of the conventional API is to support ordered
scans, and GIN doesn't do that anyway.
What about SELECT ... AND EXISTS (SELECT ... t
Robert Haas writes:
> On Thu, Feb 12, 2009 at 1:42 PM, Tom Lane wrote:
>> So? Barring some evidence that there's a significant performance win
>> from a conventional indexscan, this is a weak argument. AFAICS the only
>> significant advantage of the conventional API is to support ordered
>> sca
On Thu, Feb 12, 2009 at 1:42 PM, Tom Lane wrote:
> Teodor Sigaev writes:
>>> What did you think of the idea of simply abandoning support for
>>> conventional indexscans in GIN?
>
>> I don't like this idea because it forbids conventional indexscans even with
>> fastupdate=off.
>
> So? Barring som
Teodor Sigaev writes:
>> What did you think of the idea of simply abandoning support for
>> conventional indexscans in GIN?
> I don't like this idea because it forbids conventional indexscans even with
> fastupdate=off.
So? Barring some evidence that there's a significant performance win
from
But the *real* problem is that you simply can not guarantee that
someone doesn't increase the size of the pending list between the time
If insertion process has bigger work_mem. Agree.
What did you think of the idea of simply abandoning support for
conventional indexscans in GIN? I agree that
This freezes the whole system even with autovacuum = off in
postgresql.conf. As before, the backends wait on a semop() call.
Fixed. There was a deadlock of LockBufferForCleanup and LockBuffer(SHARE).
Redesign that place to downgrade LockBufferForCleanup to LockBuffer(EXCLUSIVE)
with correction
> I did this four times, sometimes with the variant of adding "set
> enable_bitmapscan to false;" before the explain analyze. In three
> cases the database recovered succesfully after the immediate shutdown;
> in the other case, it crapped out much as described in my original
> email.
Sorry to ke
On Wed, Feb 11, 2009 at 10:03 PM, Robert Haas wrote:
> I'm going to try to reproduce this, but here's approximately what I did.
OK, I've managed to build a reproducible test case. Initial setup is
just as I had before:
> create table foo (id serial, x int[], primary key (id));
> create index fo
While fooling around with the GIN fast insert patch tonight, I managed
to hang my test database. :-(
I'm going to try to reproduce this, but here's approximately what I did.
create table foo (id serial, x int[], primary key (id));
create index foo_gin on foo using gin (x);
insert into foo (x) se
Teodor Sigaev writes:
> Robert Haas wrote:
>> Why would the new work_mem need to be 10x smaller than the old work mem?
> That is is way to get GIN's error emitted. Work_mem should be decreased
> to catch a chance to get lossy tidbitmap.
But it only has to be marginally lower, not 10x lower. An
Robert Haas wrote:
I believe that user could get GIN's error about work_mem only intentionally:
- turn off autovacuum
Meanwhile, in the other thread, we're having a discussion about people
wanting to do exactly this on a database-wide basis during peak load
hours...
- decrease work_mem for at l
> I believe that user could get GIN's error about work_mem only intentionally:
> - turn off autovacuum
Meanwhile, in the other thread, we're having a discussion about people
wanting to do exactly this on a database-wide basis during peak load
hours...
> - set big work_mem
> - populate table with
What would be wrong with letting it degrade to lossy? I suppose the
reason it's trying to avoid that is to avoid having to recheck all the
rows on that page when it comes time to do the index insertion; but
surely having to do that is better than having arbitrary, unpredictable
failure conditions
On Tue, Feb 10, 2009 at 11:18 PM, Tom Lane wrote:
> For queries that select only a single index entry, there might be some
> speed degradation, but a quick test suggests it's in the
> single-digit-percentage range if everything's cached; and of course if
> you have to go to disk then the extra CPU
Robert Haas writes:
> On Tue, Feb 10, 2009 at 10:38 PM, Tom Lane wrote:
>> It strikes me that part of the issue here is that the behavior of this
>> code is much better adapted to the bitmap-scan API than the traditional
>> indexscan API. Since GIN doesn't support ordered scan anyway, I wonder
>
On Tue, Feb 10, 2009 at 10:38 PM, Tom Lane wrote:
>> I think this code needs to be somehow rewritten to make things degrade
>> gracefully when the pending list is long - I'm not sure what the best
>> way to do that is. Inventing a new data structure to store TIDs that
>> is never lossy seems like
Robert Haas writes:
> I think this is related to the problems with gincostestimate() that
> Tom Lane was complaining about here:
> http://archives.postgresql.org/message-id/20441.1234209...@sss.pgh.pa.us
> I am not 100% sure I'm understanding this correctly, but I think the
> reason why gincostes
Jeff Davis asked me if I'd be willing to do a review of the GIN fast
insert patch about two weeks ago, but I haven't actually had a chance
to read through it in detail until tonight. I can't say I really know
anything about GIN (though I did take this opportunity to RTM), so
apologies in advance i
On Tue, 2009-01-27 at 20:36 +0300, Teodor Sigaev wrote:
> You didn't provide distributions of array's element, number of unique element
> and so on. And I make simple test script, which generates data rather close
> to
> typical tsearch installation (see tst.sql).
The arrays I was inserting we
Sorry, lost test sript
BTW, is btree_gin ready to commit by your opinion?
--
Teodor Sigaev E-mail: teo...@sigaev.ru
WWW: http://www.sigaev.ru/
CREATE OR REPLACE FUNCTION gena()
RETURNS _int4 AS
$$
SELECT array(
Here is a test of the fast insert patch. The patch has gone through some
changes, so this set of tests is to see the current performance impact
compared with HEAD.
The test is simple: inserting a bunch of integer arrays into a table
with a GIN index on the array column.
I'm testing with small wo
Here is a test of the fast insert patch. The patch has gone through some
changes, so this set of tests is to see the current performance impact
compared with HEAD.
The test is simple: inserting a bunch of integer arrays into a table
with a GIN index on the array column.
I'm testing with small wo
60 matches
Mail list logo