Hello, we are using Lucene in one of our applications for full-text search, which works very well.
I am now interested in similarity search over whole paragraphs. For example, there are 1000 textual items in the database, each containing perhaps more than 100 words on average. Given a set of 10 textual items, I would like to know which of the 1000 textual items are similar to those 10 (within a certain tolerance). Is this possible with Lucene?

Thanks in advance,
Mark
