Re: [PATCHES] [BUGS] BUG #3975: tsearch2 index should not bomb out of 1Mb limit

2008-05-06 Thread Bruce Momjian
Added to TODO: Consider changing error to warning for strings larger than one megabyte. http://archives.postgresql.org/pgsql-bugs/2008-02/msg00190.php http://archives.postgresql.org/pgsql-patches/2008-03/msg00062.php

Re: [PATCHES] [BUGS] BUG #3975: tsearch2 index should not bomb out of 1Mb limit

2008-03-09 Thread Euler Taveira de Oliveira
Tom Lane wrote: Well, there is exactly zero chance of that happening in 8.3.x, because the bit allocations for on-disk tsvector representation are already determined. It's fairly hard to see a way of doing it in future releases that would have acceptable costs, either. I think you missed my

Re: [PATCHES] [BUGS] BUG #3975: tsearch2 index should not bomb out of 1Mb limit

2008-03-07 Thread Bruce Momjian
Tom Lane wrote: Euler Taveira de Oliveira [EMAIL PROTECTED] writes: Edwin Groothuis wrote: Is it possible to make it a WARNING instead of an ERROR? Right now I get: No. All of the other types emit an ERROR if you're trying an out of range value. I don't think that follows. A

Re: [PATCHES] [BUGS] BUG #3975: tsearch2 index should not bomb out of 1Mb limit

2008-03-07 Thread Bruce Momjian
Edwin Groothuis wrote: On Thu, Mar 06, 2008 at 08:19:35PM -0300, Euler Taveira de Oliveira wrote: Edwin Groothuis wrote: Is it possible to make it a WARNING instead of an ERROR? Right now I get: No. All of the other types emit an ERROR if you're trying an out of range value.

Re: [PATCHES] [BUGS] BUG #3975: tsearch2 index should not bomb out of 1Mb limit

2008-03-07 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: I don't think that follows. A tsearch index is lossy anyway, so there's Uh, the index is lossy but I thought it was lossy in a way that just required additional heap accesses, not lossy in that it doesn't index everything. Sure it's

Re: [PATCHES] [BUGS] BUG #3975: tsearch2 index should not bomb out of 1Mb limit

2008-03-07 Thread Bruce Momjian
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: I don't think that follows. A tsearch index is lossy anyway, so there's Uh, the index is lossy but I thought it was lossy in a way that just required additional heap accesses, not lossy in that it doesn't index

Re: [PATCHES] [BUGS] BUG #3975: tsearch2 index should not bomb out of 1Mb limit

2008-03-07 Thread Teodor Sigaev
To be precise about tsvector: 1) A GiST index is lossy for any kind of tsearch query; a GIN index is not lossy for the @@ operation, but is lossy for @@@. 2) The number of positions per word is limited to 256; a larger number of positions is not helpful for ranking, but produces a big tsvector. If
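As an illustration of point 2 above, the per-word position cap can be sketched in Python. This is not PostgreSQL's implementation: the function name `build_positions` and the whitespace-split word list are assumptions made for the example; only the limit of 256 comes from the message.

```python
from collections import defaultdict

# Limit quoted in the message above; everything else here is hypothetical.
MAX_POSITIONS_PER_WORD = 256


def build_positions(words, max_positions=MAX_POSITIONS_PER_WORD):
    """Map each word to the 1-based positions where it occurs,
    keeping at most max_positions entries per word."""
    positions = defaultdict(list)
    for pos, word in enumerate(words, start=1):
        # Once a word has reached the cap, later occurrences are dropped.
        if len(positions[word]) < max_positions:
            positions[word].append(pos)
    return dict(positions)


if __name__ == "__main__":
    doc = ("spam " * 300 + "eggs").split()
    index = build_positions(doc)
    print(len(index["spam"]), index["eggs"])
```

The word is still recorded once the cap is reached; only its extra positions are discarded, which is why the cap bounds the size of the result without removing any word from it.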

Re: [PATCHES] [BUGS] BUG #3975: tsearch2 index should not bomb out of 1Mb limit

2008-03-06 Thread Euler Taveira de Oliveira
Edwin Groothuis wrote: Is it possible to make it a WARNING instead of an ERROR? Right now I get: No. All of the other types emit an ERROR if you're trying an out of range value. -- Euler Taveira de Oliveira http://www.timbira.com/

Re: [PATCHES] [BUGS] BUG #3975: tsearch2 index should not bomb out of 1Mb limit

2008-03-06 Thread Edwin Groothuis
On Thu, Mar 06, 2008 at 08:19:35PM -0300, Euler Taveira de Oliveira wrote: Edwin Groothuis wrote: Is it possible to make it a WARNING instead of an ERROR? Right now I get: No. All of the other types emit an ERROR if you're trying an out of range value. Does that then mean that, because

Re: [PATCHES] [BUGS] BUG #3975: tsearch2 index should not bomb out of 1Mb limit

2008-03-06 Thread Tom Lane
Euler Taveira de Oliveira [EMAIL PROTECTED] writes: Edwin Groothuis wrote: Is it possible to make it a WARNING instead of an ERROR? Right now I get: No. All of the other types emit an ERROR if you're trying an out of range value. I don't think that follows. A tsearch index is lossy

Re: [PATCHES] [BUGS] BUG #3975: tsearch2 index should not bomb out of 1Mb limit

2008-03-06 Thread Euler Taveira de Oliveira
Tom Lane wrote: I don't think that follows. A tsearch index is lossy anyway, so there's no hard and fast reason why it should reject entries that it can't index completely. I think it would be more useful to index whatever it can (probably just the words in the first N bytes of the document)
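The idea of indexing only the first N bytes and warning instead of failing can be sketched as follows. This is a minimal illustration under stated assumptions, not the patch discussed in the thread: `truncate_for_indexing` and `ASSUMED_LIMIT_BYTES` are hypothetical names, the one-megabyte figure is taken from the bug title, and the cut is applied to the raw UTF-8 text rather than to the tsvector representation.

```python
import warnings

# Assumed cap, taken from the "1Mb limit" in the bug title.
ASSUMED_LIMIT_BYTES = 1024 * 1024


def truncate_for_indexing(document: str, limit: int = ASSUMED_LIMIT_BYTES) -> str:
    """Return the part of document that fits in limit bytes of UTF-8.

    Oversized input produces a warning instead of an exception, so the
    caller can still index the leading part of the text.
    """
    raw = document.encode("utf-8")
    if len(raw) <= limit:
        return document
    warnings.warn(
        f"document is {len(raw)} bytes; only the first {limit} bytes are indexed"
    )
    # errors="ignore" drops a multibyte character cut in half by the slice.
    return raw[:limit].decode("utf-8", errors="ignore")


if __name__ == "__main__":
    print(truncate_for_indexing("short text"))
    print(len(truncate_for_indexing("x" * 2_000_000)))
```

Cutting at a byte offset is the simplest policy and matches the "first N bytes" wording; it makes no attempt to decide which words are more important.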

Re: [PATCHES] [BUGS] BUG #3975: tsearch2 index should not bomb out of 1Mb limit

2008-03-06 Thread Tom Lane
Euler Taveira de Oliveira [EMAIL PROTECTED] writes: The problem with this approach is how to select the part of the document to index. How will you ensure you're not ignoring the more important words of the document? That's *always* a risk, anytime you do any sort of processing or

Re: [PATCHES] [BUGS] BUG #3975: tsearch2 index should not bomb out of 1Mb limit

2008-03-05 Thread Edwin Groothuis
On Wed, Mar 05, 2008 at 10:53:38AM -0500, Bruce Momjian wrote: Euler Taveira de Oliveira wrote: Edwin Groothuis wrote: Ouch. But... since very long words are already not indexed (is the length configurable anywhere because I don't mind setting it to 50 characters), I don't think