Re: [HACKERS] Google Summer of Code 2008

Oleg Bartunov Sat, 08 Mar 2008 18:41:08 -0800

On Sat, 8 Mar 2008, Jan Urbański wrote:

Unfortunately, selectivity estimation for a query is much more difficult than just estimating the frequency of an individual word.
Sure, given something like 'cats & dogs'::tsquery, the frequencies of 'cat' and 'dog' won't suffice. But at least it's a starting point, and if we estimate that 80% of the documents have 'dog' and 70% have 'cat', then we can tell for sure that at least 50% have both, and that's a lot better than the 0.1% that's being returned now.


Certainly yes, and given that the most popular queries are single-word
queries, this would be very helpful in most cases.
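
The "at least 50%" figure quoted above follows from a simple bound: if
fractions p1..pn of the documents contain each term, then at least
max(0, p1 + ... + pn - (n - 1)) of them contain all terms, and at most
min(p1..pn) do. A minimal sketch in Python, only to illustrate the
arithmetic; the function name is hypothetical and this is not how the
planner computes anything:

```python
def and_selectivity_bounds(freqs):
    """Return (lower, upper) bounds on the fraction of documents
    that contain every term, given each term's document frequency.

    Hypothetical helper for illustration; freqs are fractions in [0, 1].
    """
    if not freqs:
        raise ValueError("need at least one term frequency")
    # Lower bound: the documents missing term i make up (1 - p_i) of the
    # collection, so at most sum(1 - p_i) documents miss some term.
    lower = max(0.0, sum(freqs) - (len(freqs) - 1))
    # Upper bound: a conjunction cannot match more documents than its
    # rarest term does.
    upper = min(freqs)
    return lower, upper


# 'dog' in 80% of documents, 'cat' in 70%
lower, upper = and_selectivity_bounds([0.8, 0.7])
```

For 0.8 and 0.7 the bounds are 0.5 and 0.7. For rare terms the lower
bound collapses to 0, so it is only informative for frequent words.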

The reason I thought about improving ts_stat() is that we could use its
statistics for the incomplete search feature people have requested, where
an AND query like ( a & b & c ) is rewritten to a set of AND|OR queries
depending on the terms' occurrence.
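
One way such a rewrite could look, purely as an illustration: keep the
rare (selective) terms required and let the common terms become
alternatives. The threshold, the function name and the rewrite rule
itself are assumptions made for this sketch, not a description of any
existing or planned implementation:

```python
def relax_and_query(term_freqs, common_threshold=0.5):
    """Rewrite an all-AND query into an AND|OR query string.

    term_freqs maps each term to its document frequency (0..1), e.g.
    as could be derived from ts_stat() output. Terms below the
    (hypothetical) threshold stay required; the rest are ORed together.
    """
    if not term_freqs:
        raise ValueError("need at least one term")
    # Rarest terms first, ties broken by name for a stable result.
    ordered = sorted(term_freqs, key=lambda t: (term_freqs[t], t))
    rare = [t for t in ordered if term_freqs[t] < common_threshold]
    common = [t for t in ordered if term_freqs[t] >= common_threshold]

    if not rare:
        # Nothing selective to require: fall back to a plain OR.
        return " | ".join(common)
    parts = list(rare)
    if len(common) == 1:
        parts.append(common[0])
    elif common:
        parts.append("(" + " | ".join(common) + ")")
    return " & ".join(parts)


query = relax_and_query({"a": 0.01, "b": 0.6, "c": 0.7})
# query is "a & (b | c)"
```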

        Regards,
                Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
