I noticed a post discussing term vectors on the lucene-dev list. It is great that people are working on adding term vectors to Lucene. Stored term vectors per field (or document) are a great way to get lucene into classification of documents. There are great many text classification algorithms that have primarily two inputs: term vector for the whole collection of documents and term vector per each document. A nice feature to add using the vectors would be 'like-documents': given an indexed document, what other documents are there in the index that are like it. Another feature can be clustering of documents into categories based on the vectors.
I am curious when the term vector support that you worked on will be added to the main build branch. I couldn't find any information on that. -Emile -- To unsubscribe, e-mail: <mailto:[EMAIL PROTECTED]> For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>
