Quoting Rajesh Munavalli <[EMAIL PROTECTED]>:
> Let me explain a scenario where I would need to add the n-grams at
> indexing time.
I see your point and I do agree. As it stands, Lucene does not natively support n-gram indexing. However, it is not impossible to adapt Lucene to serve as an n-gram index. The method I will describe can be adapted to any search engine, not just Lucene. But before I go on, I must warn you that the end result will use a lot of disk space and will also result in longer search times (by a factor of roughly N).

What's the method? Use multiple separate, mutually incompatible indexes, N indexes to be exact. You can write an Analyzer that churns out bigrams and use it to create an index of bigrams. Likewise, you can write an Analyzer that churns out trigrams and use it to create an index of trigrams. It's a tedious and disk-space-wasting method of n-gram indexing, but it can be done.

For N = 3, you can then search all three indexes separately (the unigram index, the bigram index and the trigram index) to generate three separate scores for every document, then combine the three scores using weights.
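To make the idea concrete, here is a rough sketch of the multi-index approach in plain Python rather than Lucene. It assumes word-level n-grams, uses an in-memory dictionary as a toy stand-in for each index, and scores by raw n-gram counts. The function names (word_ngrams, build_index, search, combined_search) and the weights are made up for illustration and are not part of any Lucene API.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the word n-grams of text, each joined by a single space."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_index(docs, n):
    """Map each doc id to a Counter of its n-grams (toy stand-in for one index)."""
    return {doc_id: Counter(word_ngrams(text, n)) for doc_id, text in docs.items()}


def search(index, query, n):
    """Score each document by how often the query's n-grams occur in it."""
    grams = word_ngrams(query, n)
    return {doc_id: sum(counts[g] for g in grams) for doc_id, counts in index.items()}


def combined_search(docs, query, weights):
    """Search one index per n-gram size and sum the weighted scores."""
    totals = {doc_id: 0.0 for doc_id in docs}
    for n, weight in weights.items():
        # A real system would build each index once, at indexing time.
        index = build_index(docs, n)
        for doc_id, score in search(index, query, n).items():
            totals[doc_id] += weight * score
    return totals


# Hypothetical documents and weights, chosen only for illustration.
docs = {"a": "the quick brown fox", "b": "brown quick the fox"}
weights = {1: 1.0, 2: 2.0, 3: 3.0}
scores = combined_search(docs, "quick brown fox", weights)
```

Both documents contain the same words, so the unigram index alone cannot tell them apart. Document "a" scores higher overall because it also matches in the bigram and trigram indexes. The query is analyzed and searched once per index, which is where the extra search cost comes from.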