Jeff,
I've spent some time (several hours, in fact -- I'm a little slow)
poking about on the Lucene site and its links in an attempt to
understand more about the search process. I even downloaded and tried
to read the two sample chapters from "Lucene in Action." That book
certainly wasn't written for someone like me!
I also tried reading through some of the items on the Lucene FAQ page
http://wiki.apache.org/jakarta-lucene/LuceneFAQ
which left me overwhelmed. I'm sure there's much great stuff at that
page, but I don't know a lot of the terms used. The answers were
written for someone with more background than I currently possess.
I have many questions, but will start off with just a few. My
questions will reveal how little I know about programming at this
level.
1) It seems as though the Lucene search engine (LSE) deals with
indexes, rather than the email messages. Is that correct?
2) LSE searches fields, and the default field is text. Correct?
3) Thinking of the Sundial archive, how do I find out what fields are
available? "date" seems to be a field, as do "title" and "text."
What other fields are there? Is it possible for me to actually look
at some typical indexes?
4) One FAQ dealt with searching within results:
http://wiki.apache.org/jakarta-lucene/LuceneFAQ#head-f70612c6e4670e7fa2d5aeef4710effc522d85e0
** begin quote **
Can Lucene do a "search within search", so that the second search is
constrained by the results of the first query?
Yes. There are two primary options:
* Use QueryFilter with the previous query as the filter. (you can
  search the mailing list archives for QueryFilter and Doug Cutting's
  recommendations against using it for this purpose)
* Combine the previous query with the current query using
  BooleanQuery, using the previous query as required.
The BooleanQuery approach is the recommended one.
** end quote **
Can you give me a couple of simple examples of how this might work?
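
To make the idea in the quoted FAQ concrete, here is a minimal sketch in
Python. It is a toy model, not Lucene's API and not how the archive's
search is actually built: the TinyIndex class, the require_all helper,
and the three sample messages are all made up for illustration. It shows
two things the questions above touch on: a search runs against an index
of (field, term) entries rather than against the messages themselves, and
a "search within search" amounts to requiring both the earlier query and
the new query to match, which is roughly what marking both clauses of a
BooleanQuery as required does.

```python
from collections import defaultdict


class TinyIndex:
    """Toy inverted index: maps (field, term) -> set of message ids.

    Hypothetical illustration only; Lucene's real index is far richer.
    """

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, fields):
        # fields is a dict such as {"title": "...", "text": "..."}
        for field, value in fields.items():
            for term in value.lower().split():
                self.postings[(field, term)].add(doc_id)

    def term(self, field, word):
        # Look up one word in one field; returns a copy of the matching ids
        return set(self.postings.get((field, word.lower()), set()))


def require_all(*result_sets):
    # Every clause is required, so a message must appear in all result sets
    return set.intersection(*result_sets)


# Three made-up messages standing in for an archive
index = TinyIndex()
index.add(1, {"title": "Analemma question",
              "text": "how do I draw an analemma on a horizontal dial"})
index.add(2, {"title": "Gnomon angle",
              "text": "the gnomon angle equals the latitude for a horizontal dial"})
index.add(3, {"title": "Vertical dial",
              "text": "a vertical dial needs a different gnomon angle"})

# First search: messages whose text mentions "dial"
first = index.term("text", "dial")

# Second search, constrained by the first: also require "gnomon" in the text
within = require_all(first, index.term("text", "gnomon"))

# The same idea across fields: "dial" in the text AND "gnomon" in the title
narrower = require_all(first, index.term("title", "gnomon"))

print(sorted(first), sorted(within), sorted(narrower))
```

The second search never rescans the messages. It only combines two sets
of ids pulled from the index, which is why narrowing an earlier result
is cheap in this model.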
Thanks for any help.
Mac
I'd be happy to help. If you can put up with my cranky comments, I
can put up with your bumbling efforts to patch the search engine.
;-)
Works for me.
So I tracked down the bug that was keeping most of the sundial
messages out of the search index. One particular message
with a very funny date was gumming up the works. Now that
the problem is resolved, searches should find everything.
_______________________________________________
Discussion list for The Mail Archive
Gossip@jab.org
http://jab.org/cgi-bin/mailman/listinfo/gossip