On Sun, Aug 21, 2011 at 5:29 PM, Marvin Humphrey <[email protected]> wrote:
> On Sun, Aug 21, 2011 at 09:34:58PM +0200, Moritz Lenz wrote:
>> One of my conclusions is that Google & co. usually treat the order of
>> search terms as an important indicator for relevance, while most other
>> search engines don't.
>
> Yep.  A technique was described in section 4.5.1 of the seminal Brin/Page 1998
> paper, "The Anatomy of a Large-Scale Hypertextual Web Search Engine":

If you're willing to do some spelunking, this thread where Marvin quoted the
same excerpt might be relevant:

http://www.rectangular.com/pipermail/kinosearch/2006-May/005012.html

Personally, while proximity search is one approach, I don't see it as
essential.  I think that weighting of full and partial phrase searches is the
more important part, with proximity being just one of these weighted
variables.

For example, I would want a search for "one two three four" to rank
"one two [unrelated text] three four" much higher than
"four three two one [unrelated text]", even if the unrelated text is quite
long.

My thoughts are still close to these:

http://www.rectangular.com/pipermail/kinosearch/2007-June/004204.html

Nathan Kurz
[email protected]
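
P.S. To make the "one two three four" example concrete, here is a rough
sketch of partial-phrase weighting in Python.  It is only an illustration and
not how KinoSearch or any other engine scores: the function names, the
whitespace tokenizer, and the n*n weight for a matched run of n query terms
are all assumptions of mine.  Every contiguous run of query terms that also
appears contiguously and in order in the document adds to the score, and the
length of the unrelated text plays no part.

```python
def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def contains_phrase(doc_tokens, phrase):
    """Return True if `phrase` occurs as a contiguous, ordered run in the doc."""
    n = len(phrase)
    return any(doc_tokens[i:i + n] == phrase
               for i in range(len(doc_tokens) - n + 1))


def phrase_score(query, document):
    """Sum a weight for every contiguous query sub-phrase found in the document.

    A matched run of n terms contributes n * n, so longer in-order runs
    dominate single-term matches.  The n * n weight is an arbitrary,
    hypothetical choice made for this sketch.
    """
    q = tokenize(query)
    d = tokenize(document)
    score = 0
    for n in range(1, len(q) + 1):              # sub-phrase length
        for start in range(len(q) - n + 1):     # sub-phrase position in query
            if contains_phrase(d, q[start:start + n]):
                score += n * n
    return score


query = "one two three four"
in_order = "one two lorem ipsum dolor three four"
reversed_terms = "four three two one lorem ipsum dolor"

print(phrase_score(query, in_order))        # 12: four terms plus two bigrams
print(phrase_score(query, reversed_terms))  # 4: four terms, no ordered runs
```

Both documents contain all four terms, but only the first keeps "one two" and
"three four" intact, so it scores three times higher under these weights.  A
proximity factor could be multiplied in as one more weighted variable without
changing the overall shape.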
