I am confused. I am following the FAQ, which says that indexing and searching
the title of a document will cause it to be ranked higher.
When I do a search on the title of my document (name in my case), the document
is returned, but it does not get ranked higher; in fact, it gets buried
in the results.
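
For context on why the FAQ expects a short field to score higher: as far as I understand it, the default similarity in Lucene 2.x scales a field's contribution by 1/sqrt(number of terms in the field). The sketch below only reproduces that arithmetic in plain Java. The class and method names are made up for illustration, and the 400-term CONTENTS length is an assumed figure, not something taken from my index.

```java
// Illustrative sketch only: reproduces the 1/sqrt(numTerms) length
// normalization that Lucene 2.x's default similarity is understood to use.
// LengthNormSketch and lengthNorm are hypothetical names, not Lucene classes.
class LengthNormSketch {

    // Shorter fields get a larger normalization factor.
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    public static void main(String[] args) {
        // "Color Me Mine" tokenizes to 3 terms.
        System.out.println("NAME (3 terms): " + lengthNorm(3));
        // 400 is an assumed length for a CONTENTS field.
        System.out.println("CONTENTS (400 terms, assumed): " + lengthNorm(400));
    }
}
```

This factor is only one part of the score. A document that matches several CONTENTS terms many times can still outrank a single NAME match.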
I am using the StandardAnalyzer for both indexing and searching.
I added the NAME to my document as a tokenized field:
document.add(new Field("NAME", "Color Me Mine", Field.Store.YES,
Field.Index.TOKENIZED));
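
To make the tokenized versus un-tokenized difference concrete, here is a toy sketch in plain Java. It is not StandardAnalyzer: it only splits on non-alphanumeric characters and lowercases, with no stop-word handling, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Toy stand-in for analysis, NOT Lucene's StandardAnalyzer.
// TokenizeSketch and its methods are hypothetical names for illustration.
class TokenizeSketch {

    // Tokenized: split on non-alphanumerics and lowercase each piece.
    static List<String> tokenizedTerms(String value) {
        List<String> terms = new ArrayList<>();
        for (String piece : value.split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty()) {
                terms.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    // Un-tokenized: the whole value is kept as one term, case preserved.
    static List<String> untokenizedTerms(String value) {
        List<String> terms = new ArrayList<>();
        terms.add(value);
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(tokenizedTerms("Color Me Mine"));
        System.out.println(untokenizedTerms("Color Me Mine"));
    }
}
```

With the tokenized form, a lowercased search term such as color can match. With the un-tokenized form, only the exact string with its original casing would.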
My QueryParser uses the StandardAnalyzer and it is building the query on the
correct field. Running this query returns 127 results, with the matching
document name at #40 in the list:
NAME:"color me mine" (CONTENTS:color CONTENTS:me CONTENTS:mine)
If I run this query (NAME:"color me mine"), I will get my one match, but
nothing else since I am not searching the contents of the document, so I know
the "name" query is returning a record.
Can anyone think of anything else I can do to boost the results of a match on
the NAME field? I tried setting a boost on the "name" query, but it didn't
work. The documents were not returned in any different order.
The query toString method returned:
NAME:"color me mine"^2.0 (CONTENTS:color CONTENTS:me CONTENTS:mine)
Thanks.
I am installing Luke now...
The Name is definitely on my document:
Document<stored/uncompressed,indexed<OBJECT_ID:238173>
stored/uncompressed<SNIPPET:Color Me Mine is a friendly place where you can
create your own artwork on the pottery piece of your ...>
stored/uncompressed,indexed<ATTRACTION_ID:238173>
stored/uncompressed,indexed,tokenized<NAME:Color Me Mine>
stored/uncompressed,indexed<OBJECT_TYPE:com.reffects.dmi.dbom.Attraction>
stored/uncompressed,indexed,tokenized<CATEGORY_NAME:Arts & Entertainment>
stored/uncompressed,indexed<CATEGORY_ID:29>
stored/uncompressed,indexed<OBJECT_TYPE:ACTIVITY>
stored/uncompressed,indexed<ATTRACTION_TYPE:A>
stored/uncompressed,indexed<BLUE_RIBBON:false>
stored/uncompressed,indexed<CABIN_FEVER:false>
stored/uncompressed,indexed<APPALACIAN:false>
stored/uncompressed,indexed<WILD:false>
stored/uncompressed,indexed,tokenized<REGION_NAME:Pittsburgh and Its Countryside>
stored/uncompressed,indexed<REGION_ID:4>
stored/uncompressed,indexed,tokenized<CITY:PITTSBURGH>
stored/uncompressed,indexed<ZIP_CODE:15217>
stored/uncompressed,indexed<LATITUDE:1040.438167>
stored/uncompressed,indexed<LONGITUDE:920.078858>
stored/uncompressed,indexed<HANDICAP_ACCESS:N>
stored/uncompressed,indexed<SITE_ID:16651>
stored/uncompressed,indexed<SITE_ID:16501>>
----- Original Message -----
From: "Erick Erickson" <[EMAIL PROTECTED]>
To: [email protected]
Sent: Tuesday, February 27, 2007 1:13:45 PM (GMT-0500) America/New_York
Subject: Re: indexing and searching the document title question
You've probably got it right. But I'd add a couple of things....
1> by using the correct analyzer at index and query time, the
casing will be taken care of for you.
2> you don't want UN_TOKENIZED for fields you search on
in general because the value isn't analyzed; it is indexed as a
single term. So if you indexed "This is a String", searching on
"This" or "this" wouldn't match.
3> In your code fragment, you didn't show what Analyzer you
use. This is way more important than you think.
4> get a copy of Luke (google lucene luke). It'll let you examine
your index and save you a world of hurt. There have been some
very nice improvements lately along with 2.1 compatibility.
5> If you want searches and indexing to use different analyzers
on different fields, see PerFieldAnalyzerWrapper.
6> You'll probably find yourself storing the same data multiple
times, once for searching and once for displaying. So you'll search
on the lowercased, indexed field and display the UN_TOKENIZED
version since it'll retain the capitalization.
7> I think your underlying problem is that the syntax of the search
isn't correct. You're really searching on
NAME:color
defaultfield:me
defaultfield:mine
You want something like +NAME:color +NAME:me +NAME:mine
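
A toy sketch of the field binding described in point 7, in plain Java. This is not Lucene's QueryParser: it only splits on whitespace and ignores quoted phrases, operators and boosts. The class and method names are hypothetical, and CONTENTS is assumed to be the default field.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of how a "FIELD:" prefix applies only to the term it is
// attached to. NOT Lucene's QueryParser; no phrases, operators or boosts.
// FieldBindingSketch and bindFields are hypothetical names.
class FieldBindingSketch {

    static List<String> bindFields(String query, String defaultField) {
        List<String> clauses = new ArrayList<>();
        for (String token : query.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            if (token.indexOf(':') > 0) {
                // Explicit field prefix: keep the term as written.
                clauses.add(token);
            } else {
                // No prefix: the term falls back to the default field.
                clauses.add(defaultField + ":" + token);
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        // CONTENTS as the default field is an assumption for this example.
        System.out.println(bindFields("NAME:color me mine", "CONTENTS"));
        System.out.println(bindFields("+NAME:color +NAME:me +NAME:mine", "CONTENTS"));
    }
}
```

Only the first term of NAME:color me mine stays on NAME in this model. Repeating the prefix on every term keeps all three on that field.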
Best
Erick
On 2/27/07, Phillip Rhodes <[EMAIL PROTECTED]> wrote:
>
> Hi,
> According to the FAQ, indexing the title of the document in its own field
> and searching against that shorter field will automatically give it a higher
> weight than matches against the document content. That is what I am trying
> to accomplish with a "NAME" field. If someone enters a close match of the
> name of a document (example Names: "Color Me Mine" ,"Pittsburgh and Its
> Countryside"), I want that document to get a hit. The search is user
> entered, so I want it to be case-insensitive. I also don't want it to have
> to be an exact match. Search terms such as "Pittsburgh Countryside" should
> match up against a name of "Pittsburgh and Its Countryside".
>
>
> Here I am adding the name field to my document:
> String value= "Color Me Mine";
> document.add(new Field("NAME", value, Field.Store.YES,
> Field.Index.TOKENIZED));
>
> Performing a search:
> NAME:color me mine ->returns no results
> NAME:color -> returns the document
>
> I tried indexing the document without the value tokenized:
> document.add(new Field("NAME", value, Field.Store.YES,
> Field.Index.UN_TOKENIZED));
>
> This caused the search to be case sensitive.
>
> I am about to modify my indexing/searching code to use a secondary field,
> "name_lowercase". This field would of course contain the name of the object
> in lowercase, and I would lowercase my search terms when I construct my
> TermQuery for this field.
>
> Is this a valid approach, or am I missing something?
>
> Thanks!
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>