Re: indexing and searching the document title question

Erick Erickson Tue, 27 Feb 2007 10:14:11 -0800

You've probably got it right. But I'd add a couple of things....


1> by using the correct analyzer at index and query time, the
casing will be taken care of for you.

2> you don't want UN_TOKENIZED for fields you search on
in general because there's no parsing. So if you indexed
"This is a String", searching on "This" or "this" wouldn't match.

3> In your code fragment, you didn't show what Analyzer you
use. This is way more important than you think.

4> get a copy of Luke (google lucene luke). It'll let you examine
your index and save you a world of hurt. There have been some
very nice improvements lately along with 2.1 compatability.

5> If you want searches and indexing to use different analyzers
on different fields, see PerFieldAnalyzerWrapper.

6> You'll probably find yourself storing the same data multiple
times, once for searching and once for displaying. So you'll search
on the lowercased, indexed field and display the UN_TOKENIZED
version since it'll retain the capitalization.

7> I think your underlying problem is that the syntax of the search
isn't correct. You're really searching on
NAME:color
defaultfield:me
defaultfield:mine

You want something like +NAME:color +NAME:me +NAME:mine

Best
Erick

On 2/27/07, Phillip Rhodes <[EMAIL PROTECTED]> wrote:


Hi,
According to the FAQ, by indexing the title of the document and performing
a search against the shorter field will automatically give it a higher
weight than matches against the document content.  That is what I am trying
to accomplish with a "NAME" field.  If someone enters a close match of the
name of a document (example Names: "Color Me Mine" ,"Pittsburgh and Its
Countryside"), I want that document to get a hit.  The search is user
entered, so I want it to be case-insensitive.  I also don't want it to have
to be an exact match.  Search terms such as "Pittsburgh Countryside" should
match up against a name of "Pittsburgh and Its Countryside".


Here I am adding the name field to my document:
String value= "Color Me Mine";
document.add(new Field("NAME", value, Field.Store.YES,
                                Field.Index.TOKENIZED));

Performing a search:
NAME:color me mine ->returns no results
NAME:color -> returns the document

I tried indexing the document without the value tokenized:
document.add(new Field("NAME", value, Field.Store.YES,
                                Field.Index.UN_TOKENIZED));

This caused the search to be case sensitive.

I am about to modify my indexing/searching code to use a secondary field,
"name_lowercase", this field would of course contain the name of the object
in lowercase and I would lowercase my search terms in I construct my
TermQuery for this field.

Is this a valid approach, or am I missing something?

Thanks!




---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: indexing and searching the document title question

Reply via email to