Odisseu21 wrote on 2/3/12 8:56 AM:
> I am new to Lucy and am looking for a fast and elegant search solution that is 
> able to:
> 
> - return an excerpt, HTML highlighted, around the MASTER_KEY_WORD
> - MASTER_KEY_WORD could be matched partial or not
> - it must be possible to define the size of the excerpt (before and after the 
> MASTER_KEY_WORD, maybe in terms of number of words or lines)
> - optional keywords, called INC_KEY_WORD, must be present, inside the 
> excerpt, no matter the order
> - optional keywords, called EXC_KEY_WORD, must not be present, inside the 
> excerpt, no matter the order
> - combinations of INC_KEY_WORD and EXC_KEY_WORD are possible
> 
> Example: 
>               apple (partial)                -> MASTER_KEY_WORD
>               + (bag + blue, girl)         -> INC_KEY_WORD combo
>               -  (black+ man, orange)  -> EXC_KEY_WORD combo
> 
> must return excerpts in which the string 'apple' occurs (apple, apples, 
> applebees, ...)
> and ('bag' AND 'blue') or 'girl'
> but not ('black' AND 'man') or 'orange' surrounding the master keyword 'apple'
> 
> Today we are using Postgres queries and some Perl code to do that in millions 
> of docs. We have a good performance, for now.
> 
> Is it possible to build such an algorithm using Lucy? Fast and easy, in one step?
> Or maybe Lucy will be used just to retrieve the excerpt surrounding the master 
> key word, with subsequent Perl code to apply the rest?

You can do most of the above with Lucy, though not in one step. Some
post-processing for the INC_ and EXC_ key words would be necessary.
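
As a rough illustration of what that post-processing step might look like,
here is a minimal sketch in Python. It is not Lucy or Search::Tools API: the
function name find_excerpts, its parameters, and the word-based window are all
assumptions made for the example. It takes the text of a document the search
engine has already returned, finds partial matches of the master keyword,
and keeps only the excerpts whose window satisfies the include/exclude groups.

```python
import html
import re


def _any_group_matches(groups, tokens):
    # A group is a list of terms that must ALL be present (AND);
    # it is enough for ANY one group to match (OR).
    return any(all(term.lower() in tokens for term in group) for group in groups)


def find_excerpts(text, master, include=(), exclude=(), before=5, after=5):
    """Return HTML-highlighted excerpts around partial matches of `master`.

    Hypothetical helper: `before` / `after` are window sizes in words,
    `include` / `exclude` are lists of AND-groups combined with OR.
    """
    words = text.split()
    results = []
    for i, word in enumerate(words):
        # Partial match: 'apple' matches 'apples', 'applebees', ...
        if master.lower() not in word.lower():
            continue
        start = max(0, i - before)
        window = words[start:i + after + 1]
        # Normalize window words (strip punctuation, lowercase) for matching.
        tokens = {re.sub(r"\W+", "", w).lower() for w in window}
        if include and not _any_group_matches(include, tokens):
            continue
        if exclude and _any_group_matches(exclude, tokens):
            continue
        # Escape everything, then wrap only the matched word in <b> tags.
        pieces = [html.escape(w) for w in window]
        pieces[i - start] = "<b>" + pieces[i - start] + "</b>"
        results.append(" ".join(pieces))
    return results
```

With the example from the question, the call would be something like
find_excerpts(doc, "apple", include=[["bag", "blue"], ["girl"]],
exclude=[["black", "man"], ["orange"]]). Note that include/exclude terms are
matched as whole words here; only the master keyword is matched partially.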

I use Search::Tools plus Lucy for this kind of thing, since Search::Tools will
let me highlight and excerpt from the original document as well.


-- 
Peter Karman  .  http://peknet.com/  .  [email protected]
