Sorry if this is a duplicate post. I wasn't subscribed when I sent the first 
one, so I'm not sure whether it was allowed through or blocked by spam filters.  
This is my first "post", so I hope I'm doing this correctly.  I've only used 
Google Groups or online forums in the past, never email-based lists.

I'm using CLucene to index a custom ISAM legacy database of mine.  The database 
holds emails and other messages.  I've got the indexing working, and also the 
searching.  However, our current (very slow) search method provides a "summary" 
for each search result: basically, fragments of the text around the words that 
were found.

I have to read the records from my database for each result found by CLucene 
anyway, because I don't want to store all that data in Lucene, so I was 
thinking I would just scan my text manually to provide the same "summary" I'm 
currently providing.  However, I now realize that this isn't as simple as it 
was before, because I don't always know which words/phrases (I think Lucene 
calls them "terms") were found, for example when the user does a fuzzy or 
wildcard search.  Without recreating the CLucene logic, how can I do this?
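
To make the question concrete, here is a rough sketch of the manual scan I have in mind. It uses only the standard library and no CLucene calls; the function name buildSummary and its parameters are made up for illustration, and it assumes ASCII text and plain substring matching. It only works when I already know the matched terms, which is exactly the part I'm missing for fuzzy and wildcard queries.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Lower-case copy for case-insensitive matching (ASCII only, an assumption).
static std::string toLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// Hypothetical helper, not part of CLucene: returns up to maxFragments
// snippets of `text`, each with `context` characters on either side of a
// matched term. Matching is by substring, not by word boundary.
std::vector<std::string> buildSummary(const std::string& text,
                                      const std::vector<std::string>& terms,
                                      std::size_t context = 30,
                                      std::size_t maxFragments = 3) {
    const std::string haystack = toLower(text);

    // Collect (position, length) for every occurrence of every term.
    std::vector<std::pair<std::size_t, std::size_t>> hits;
    for (const auto& term : terms) {
        if (term.empty()) continue;
        const std::string needle = toLower(term);
        for (std::size_t pos = haystack.find(needle);
             pos != std::string::npos;
             pos = haystack.find(needle, pos + needle.size())) {
            hits.emplace_back(pos, needle.size());
        }
    }
    std::sort(hits.begin(), hits.end());  // document order

    std::vector<std::string> fragments;
    std::size_t coveredUpTo = 0;  // end offset of the last emitted fragment
    for (const auto& hit : hits) {
        if (fragments.size() >= maxFragments) break;
        if (hit.first < coveredUpTo) continue;  // already shown in a fragment
        const std::size_t begin = hit.first > context ? hit.first - context : 0;
        const std::size_t end =
            std::min(text.size(), hit.first + hit.second + context);
        fragments.push_back(text.substr(begin, end - begin));
        coveredUpTo = end;
    }
    return fragments;
}
```

I would call this with the record text read from my own database plus whatever list of matched terms I can get out of CLucene for that document.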

I understand that there are some additional libraries (maybe only for the Java 
version?) that do "highlighting", which is similar to what I want to do.  I 
haven't found any source code for them yet, and even if I did, I'm not sure I 
could figure out how to extract it and adapt it for my purposes (they may 
assume the text is stored in Lucene, or that I want the results in HTML 
format).  My scenario seems to be unique, because I'm not working with a 
website, or indexing basic files.

If I could find the section of code in CLucene where it "finds" results for my 
query, then I would know what words it found for each document.  I'm not too 
worried about showing the "most relevant" fragments at this point, although 
that would be a nice feature down the road.

P.S. I'm loving CLucene so far, and if I can get past this last hurdle, I 
should be set =]

-Eric Selk


_______________________________________________
CLucene-developers mailing list
CLucene-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/clucene-developers