As I remember, Lucene doesn't do anything for proximity.

On 7/14/05, Rajesh Munavalli <[EMAIL PROTECTED]> wrote:
> Consider a document with the following contents
> "Levenshtein distance is named after the Russian scientist Vladimir
> Levenshtein and is also called edit distance"
>
> Possible bi-grams are (after removing the stop words in the beginning
> and end)
> "Levenshtein distance", "named after", "Russian scientist", "scientist
> Vladimir", "Vladimir Levenshtein", "called edit", "edit distance"
>
> If my query term is "Vladimir levenshtein distance", how does Lucene
> compute the similarity to the indexed terms? Are query terms appearing
> together given more importance? How does it account for gaps (caused by
> stop word removal) while matching a multiword query?
>
> thanks,
>
> Rajesh Munavalli
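
For reference, the bi-gram list in the quoted message can be reproduced with a short sketch. This is plain Python, not Lucene code: the `bigrams` helper and the stop-word list are assumptions chosen so the output matches the example, and the overlap at the end is only a naive comparison, not Lucene's similarity computation.

```python
# Assumed stop-word list, picked so the result matches the bi-grams in the message.
STOP_WORDS = {"is", "the", "and", "also"}


def bigrams(text, stop_words=STOP_WORDS):
    """Return adjacent word pairs in which neither word is a stop word.

    Pairs are taken from the original token sequence, so a removed stop
    word leaves a gap and no bi-gram is formed across it.
    """
    tokens = text.split()
    return [
        f"{a} {b}"
        for a, b in zip(tokens, tokens[1:])
        if a.lower() not in stop_words and b.lower() not in stop_words
    ]


doc = (
    "Levenshtein distance is named after the Russian scientist Vladimir "
    "Levenshtein and is also called edit distance"
)
doc_bigrams = bigrams(doc)
query_bigrams = bigrams("Vladimir levenshtein distance")

# Naive, case-insensitive overlap between query and document bi-grams.
shared = {b.lower() for b in doc_bigrams} & {b.lower() for b in query_bigrams}

print(doc_bigrams)
print(sorted(shared))
```

Because pairs are built before filtering, "distance named" and "after Russian" never appear, which is the gap the question asks about.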
--
Thanks!
yours,
WeiZhu Chen
