Odisseu21 wrote on 2/3/12 8:56 AM:
> I am new to Lucy and am looking for a fast and elegant search solution that
> is able to:
>
> - return an excerpt, HTML highlighted, around the MASTER_KEY_WORD
> - MASTER_KEY_WORD can be matched partially or in full
> - it must be possible to define the size of the excerpt (before and after the
> MASTER_KEY_WORD, maybe in terms of number of words or lines)
> - optional keywords, called INC_KEY_WORD, must be present, inside the
> excerpt, no matter the order
> - optional keywords, called EXC_KEY_WORD, must not be present, inside the
> excerpt, no matter the order
> - combinations of INC_KEY_WORD and EXC_KEY_WORD are possible
>
> Example:
> apple (partial) -> MASTER_KEY_WORD
> + (bag + blue, girl) -> INC_KEY_WORD combo
> - (black+ man, orange) -> EXC_KEY_WORD combo
>
> must return excerpts in which the string 'apple' occurs (apple, apples,
> applebees, ...)
> and ('bag' AND 'blue') or 'girl'
> but not ('black' AND 'man') or 'orange' surrounding the master keyword 'apple'
>
> Today we are using Postgres queries and some Perl code to do that across
> millions of docs. Performance is good, for now.
>
> Is it possible to build such an algorithm using Lucy? Fast and easy, in one
> step? Or maybe Lucy would be used just to retrieve the excerpt surrounding the
> master keyword, with subsequent Perl code to apply the rest?
You can do most of the above with Lucy, though not in one step. Some
post-processing for the INC_ and EXC_ keywords would be necessary.
I use Search::Tools plus Lucy for this kind of thing, since Search::Tools will
let me highlight and excerpt from the original document as well.
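
To make the post-processing step concrete, here is a rough sketch of the
filtering logic, written in Python only for brevity (it ports directly to
Perl). It assumes Lucy has already returned the candidate documents for the
master keyword. The function `excerpts` and all of its parameters are
hypothetical names for illustration, not part of Lucy or Search::Tools. It
also assumes INC_/EXC_ keywords match whole words while the master keyword
matches as a substring.

```python
import html
import re


def excerpts(text, master, include=(), exclude=(), before=5, after=5):
    """Yield HTML-highlighted excerpts around partial matches of `master`.

    `include` and `exclude` are lists of keyword groups. A group matches
    when all of its words occur in the excerpt; the list matches when any
    one of its groups does. Excerpt size is given in words.
    """
    words = text.split()
    # Normalized copy used only for matching: punctuation stripped, lowercased.
    norm = [re.sub(r"\W+", "", w).lower() for w in words]

    def any_group_matches(groups, window):
        return any(all(k.lower() in window for k in group) for group in groups)

    for i, token in enumerate(norm):
        if master.lower() not in token:  # partial (substring) match
            continue
        start = max(0, i - before)
        end = min(len(words), i + after + 1)
        window = set(norm[start:end])
        # Keep the excerpt only if some INC group is fully present...
        if include and not any_group_matches(include, window):
            continue
        # ...and no EXC group is fully present.
        if any_group_matches(exclude, window):
            continue
        parts = [
            "<b>%s</b>" % html.escape(words[j]) if j == i else html.escape(words[j])
            for j in range(start, end)
        ]
        yield " ".join(parts)
```

With the example from the original question, the call would look like
`excerpts(doc, "apple", include=[["bag", "blue"], ["girl"]],
exclude=[["black", "man"], ["orange"]])`, run over each document Lucy returns.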
--
Peter Karman . http://peknet.com/ . [email protected]