At 11:23 PM -0400 9/25/01, Geoff Hutchison wrote:
>> my search log analyses, I find that most of the multiword searches
>> are for reasonable phrases, and that pages matching those phrases
>
>I'm assuming you're talking about some sort of proximity ranking. In other
>words, if you performed a regular query and the queried words fell close
>together on the page, it would score higher.
>
>Yes, this is certainly considered. The catch is coming up with a way to
>score this quickly. It seems like mathematically you want to compute
>something like the minimum distance between all words in the query. But
>this seems a bit costly. Certainly if you know of references on computing
>this proximity quickly, I'd be interested to read them.

I mean that if the words are next to each other, they are related; it
doesn't have to be more complex proximity matching than that.  If
you store the word offsets in the index entries, you can simply check
whether one query word is at offset n and the next is at n+1.  If so,
you've got a really fine hit and should weight it very heavily.

Avi
-- 
_________________________________________________
Complete Guide to Search Engines for Web Sites, Intranets, 
   and Portals: <http://www.searchtools.com>

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
