Re: [GENERAL] FTS phrase searches

2010-12-20 Thread Oleg Bartunov

On Sun, 19 Dec 2010, Glenn Maynard wrote:


> 2010/12/19 Oleg Bartunov o...@sai.msu.su:
>
>> You might be interested in http://www.sai.msu.su/~megera/wiki/2009-08-12
>
> Thanks, that looks pretty much like what I had in mind.  Hopefully
> that'll get merged for 9.0+1; phrases are a major part of all text
> searches.


Several companies are interested in phrase search, but we have received
no support for it so far, so we have postponed the work.



Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] FTS phrase searches

2010-12-19 Thread Glenn Maynard
I guess no response means it's not possible.  I ended up doing a
manual substring match for quoted strings, but that's a poor hack.
Maybe I'll take a poke at implementing something like
tsvector_contains_phrase; it seems like a natural extension of what's
in there now.


-- 
Glenn Maynard



Re: [GENERAL] FTS phrase searches

2010-12-19 Thread Oleg Bartunov

You might be interested in http://www.sai.msu.su/~megera/wiki/2009-08-12

Oleg


Re: [GENERAL] FTS phrase searches

2010-12-19 Thread Glenn Maynard
2010/12/19 Oleg Bartunov o...@sai.msu.su:
> You might be interested in http://www.sai.msu.su/~megera/wiki/2009-08-12

Thanks, that looks pretty much like what I had in mind.  Hopefully
that'll get merged for 9.0+1; phrases are a major part of all text
searches.

-- 
Glenn Maynard



[GENERAL] FTS phrase searches

2010-11-01 Thread Glenn Maynard
How are adjacent word searches handled with FTS?  tsquery doesn't do
this, so I assume this has to be done as a separate filter step, e.g.:

  -- "large house" sales
  SELECT * FROM data WHERE fts @@ to_tsquery('large & house & sales')
    AND tsvector_contains_phrase(fts, to_tsvector('large house'));

to do an indexed search for large & house & sales and then to narrow
the results to where "large house" actually appears as a phrase (e.g.
adjacent positions at the same weight).  I can't find any function to
do that, though.  (Presumably, it would return true if all of the
words in the second tsvector exist in the first, with the same
positions relative to each other.)
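
To make the rule in the previous paragraph concrete, here is a minimal sketch of the check in Python rather than as a PostgreSQL function. Everything in it is an assumption made for illustration: the name contains_phrase, the representation of a tsvector as a mapping from lexeme to a set of positions, and the sample lexemes. Weights are ignored, so the "same weight" condition is not modelled.

```python
from typing import Dict, Set

# Hypothetical stand-in for a tsvector: lexeme -> set of word positions.
TsVector = Dict[str, Set[int]]


def contains_phrase(doc: TsVector, phrase: TsVector) -> bool:
    """Return True if every lexeme of `phrase` occurs in `doc` with the
    same positions relative to each other (weights are ignored here)."""
    # Flatten the phrase into (position, lexeme) pairs, ordered by position.
    wanted = sorted(
        (pos, lex) for lex, positions in phrase.items() for pos in positions
    )
    if not wanted:
        return True  # an empty phrase matches trivially
    base_pos, base_lex = wanted[0]
    # Try every occurrence of the first phrase lexeme as the anchor point.
    for start in doc.get(base_lex, ()):
        shift = start - base_pos
        if all(pos + shift in doc.get(lex, ()) for pos, lex in wanted):
            return True
    return False


# Illustrative, made-up lexemes and positions.
doc = {"larg": {1, 7}, "hous": {2, 5}, "sale": {9}}
print(contains_phrase(doc, {"larg": {1}, "hous": {2}}))  # adjacent in doc
print(contains_phrase(doc, {"hous": {1}, "sale": {2}}))  # never adjacent
```

Because the comparison runs on lexemes and positions rather than raw text, stemming and tokenization are respected in a way that a LIKE match would not be.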

tsvector @ tsvector seems logical, but isn't supported.

This isn't as simple as using LIKE, since that'll ignore stemming,
tokenization rules, etc.  If the language rules allow this to match
"larger house" or "large-house", then a phrase restriction should,
too.  It's also painful when the FTS column is an aggregate of several
other columns (e.g. title and body), since a LIKE match needs to know
that and check all of them separately.

Any hints?  This is pretty important to even simpler search systems.

-- 
Glenn Maynard
