Re: indexing and searching the document title question

Phillip Rhodes Tue, 27 Feb 2007 14:08:14 -0800

I am confused.  I am following the FAQ, which says that indexing and searching 
the title of a document will cause it to be ranked higher.
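
For context, the FAQ's claim most likely rests on field length normalization: as far as I know, classic Lucene's default similarity scales a field's contribution by 1 / sqrt(number of terms in that field), so a match in a short field such as NAME weighs more than the same match in a long CONTENTS field. A minimal sketch of that arithmetic follows. The class name and the 300-term CONTENTS length are hypothetical, and this is plain Java, not a call into Lucene:

```java
public class LengthNormSketch {

    // Assumption: this mirrors the formula of classic Lucene's default
    // similarity, 1 / sqrt(numTerms). It does not call Lucene itself.
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    public static void main(String[] args) {
        int nameTerms = 3;        // "Color Me Mine" tokenizes to three terms
        int contentsTerms = 300;  // hypothetical length of a CONTENTS field

        // The shorter field gets the larger normalization factor.
        System.out.println("NAME norm:     " + lengthNorm(nameTerms));
        System.out.println("CONTENTS norm: " + lengthNorm(contentsTerms));
    }
}
```

The factor is only one term of the score, so several matches in CONTENTS across many documents can still outweigh a single NAME match.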

When I do a search on the title of my document (name in my case), the document 
is being returned.  But it does not get ranked higher, in fact, it gets buried 
in the results.

I am using the StandardAnalyzer for both indexing and searching.
I added the NAME to my document as a tokenized field.
                document.add(new Field("NAME", "Color Me Mine", Field.Store.YES,
                                Field.Index.TOKENIZED));
My queryparser uses the StandardAnalyzer and it is building the query on the 
correct field.  Running this query will return 127 results with the matching 
document name at #40 in the list.
NAME:"color me mine" (CONTENTS:color CONTENTS:me CONTENTS:mine)

If I run this query (NAME:"color me mine"), I will get my one match, but 
nothing else since I am not searching the contents of the document, so I know 
the "name" query is returning a record.

Can anyone think of anything else I can do to boost the results of a match on 
the NAME field? I tried setting a boost on the "name" query, but it didn't 
work.  The documents were not returned in any different order.
The query toString method returned:
NAME:"color me mine"^2.0 (CONTENTS:color CONTENTS:me CONTENTS:mine)

Thanks.

I am installing LUKE now...

The Name is definitely on my document:
Document<stored/uncompressed,indexed<OBJECT_ID:238173> 
stored/uncompressed<SNIPPET:Color Me Mine is a friendly place where you can 
create your own artwork on the pottery piece of your ...> 
stored/uncompressed,indexed<ATTRACTION_ID:238173> 
stored/uncompressed,indexed,tokenized<NAME:Color Me Mine> 
stored/uncompressed,indexed<OBJECT_TYPE:com.reffects.dmi.dbom.Attraction> 
stored/uncompressed,indexed,tokenized<CATEGORY_NAME:Arts & Entertainment> 
stored/uncompressed,indexed<CATEGORY_ID:29> 
stored/uncompressed,indexed<OBJECT_TYPE:ACTIVITY> 
stored/uncompressed,indexed<ATTRACTION_TYPE:A> 
stored/uncompressed,indexed<BLUE_RIBBON:false> 
stored/uncompressed,indexed<CABIN_FEVER:false> 
stored/uncompressed,indexed<APPALACIAN:false> 
stored/uncompressed,indexed<WILD:false> 
stored/uncompressed,indexed,tokenized<REGION_NAME:Pittsburgh and Its Countryside> 
stored/uncompressed,indexed<REGION_ID:4> 
stored/uncompressed,indexed,tokenized<CITY:PITTSBURGH> 
stored/uncompressed,indexed<ZIP_CODE:15217> 
stored/uncompressed,indexed<LATITUDE:1040.438167> 
stored/uncompressed,indexed<LONGITUDE:920.078858> 
stored/uncompressed,indexed<HANDICAP_ACCESS:N> 
stored/uncompressed,indexed<SITE_ID:16651> 
stored/uncompressed,indexed<SITE_ID:16501>>

----- Original Message -----
From: "Erick Erickson" <[EMAIL PROTECTED]>
To: [email protected]
Sent: Tuesday, February 27, 2007 1:13:45 PM (GMT-0500) America/New_York
Subject: Re: indexing and searching the document title question

You've probably got it right. But I'd add a couple of things....

1> by using the correct analyzer at index and query time, the
casing will be taken care of for you.

2> you don't want UN_TOKENIZED for fields you search on
in general because there's no parsing. So if you indexed
"This is a String", searching on "This" or "this" wouldn't match.

3> In your code fragment, you didn't show what Analyzer you
use. This is way more important than you think.

4> get a copy of Luke (google lucene luke). It'll let you examine
your index and save you a world of hurt. There have been some
very nice improvements lately along with 2.1 compatibility.

5> If you want searches and indexing to use different analyzers
on different fields, see PerFieldAnalyzerWrapper.

6> You'll probably find yourself storing the same data multiple
times, once for searching and once for displaying. So you'll search
on the lowercased, indexed field and display the UN_TOKENIZED
version since it'll retain the capitalization.

7> I think your underlying problem is that the syntax of the search
isn't correct. You're really searching on
NAME:color
defaultfield:me
defaultfield:mine

You want something like +NAME:color +NAME:me +NAME:mine
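
To illustrate point 7, here is a toy sketch of how a field prefix attaches only to the term it is written on, while bare terms fall back to the default field. This is not Lucene's QueryParser. The class and method names are hypothetical, and CONTENTS is assumed to be the default field:

```java
import java.util.ArrayList;
import java.util.List;

public class FieldPrefixSketch {

    // Toy illustration only: a "FIELD:" prefix applies to the single term
    // it is attached to; every bare term is assigned the default field.
    static List<String> assignFields(String query, String defaultField) {
        List<String> clauses = new ArrayList<String>();
        for (String token : query.trim().split("\\s+")) {
            if (token.indexOf(':') > 0) {
                clauses.add(token);                      // already field-qualified
            } else {
                clauses.add(defaultField + ":" + token); // falls back to default
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        // prints [NAME:color, CONTENTS:me, CONTENTS:mine]
        System.out.println(assignFields("NAME:color me mine", "CONTENTS"));
        // prints [+NAME:color, +NAME:me, +NAME:mine]
        System.out.println(assignFields("+NAME:color +NAME:me +NAME:mine", "CONTENTS"));
    }
}
```

Qualifying every term, or quoting the phrase as NAME:"color me mine", keeps the whole search on the NAME field.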

Best
Erick

On 2/27/07, Phillip Rhodes <[EMAIL PROTECTED]> wrote:
>
> Hi,
> According to the FAQ, indexing the title of the document and performing
> a search against the shorter field will automatically give it a higher
> weight than matches against the document content.  That is what I am trying
> to accomplish with a "NAME" field.  If someone enters a close match of the
> name of a document (example Names: "Color Me Mine" ,"Pittsburgh and Its
> Countryside"), I want that document to get a hit.  The search is user
> entered, so I want it to be case-insensitive.  I also don't want it to have
> to be an exact match.  Search terms such as "Pittsburgh Countryside" should
> match up against a name of "Pittsburgh and Its Countryside".
>
>
> Here I am adding the name field to my document:
> String value= "Color Me Mine";
> document.add(new Field("NAME", value, Field.Store.YES,
>                                 Field.Index.TOKENIZED));
>
> Performing a search:
> NAME:color me mine ->returns no results
> NAME:color -> returns the document
>
> I tried indexing the document without the value tokenized:
> document.add(new Field("NAME", value, Field.Store.YES,
>                                 Field.Index.UN_TOKENIZED));
>
> This caused the search to be case sensitive.
>
> I am about to modify my indexing/searching code to use a secondary field,
> "name_lowercase", this field would of course contain the name of the object
> in lowercase and I would lowercase my search terms when I construct my
> TermQuery for this field.
>
> Is this a valid approach, or am I missing something?
>
> Thanks!
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>

