Lucene supports term vectors, which are a good fit for LSA-style analysis. For a given field of a document, you can get back all the terms in it, along with each term's frequency and positions. To enable this, set the appropriate term vector option on the field at indexing time.
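
As a rough sketch of the next step, here is how per-document term frequencies of the kind a term vector gives you could be assembled into a "term by document" matrix. This is plain Java with no Lucene calls: the class name TermDocumentMatrix and the List<Map<String, Integer>> input are assumptions made for illustration, and you would fill those maps from whatever your Lucene version returns for each document's term vector.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper for illustration; it is not part of Lucene.
// Input: one map per document, term -> frequency within that document.
final class TermDocumentMatrix {
    // Row labels: every distinct term across all documents, sorted.
    public final List<String> terms;
    // counts[t][d] = frequency of terms.get(t) in document d.
    public final int[][] counts;

    TermDocumentMatrix(List<Map<String, Integer>> docTermFreqs) {
        // Collect the vocabulary in sorted order so row indices are stable.
        TreeSet<String> vocabulary = new TreeSet<>();
        for (Map<String, Integer> doc : docTermFreqs) {
            vocabulary.addAll(doc.keySet());
        }
        terms = new ArrayList<>(vocabulary);

        // Fill the matrix; a term absent from a document counts as 0.
        counts = new int[terms.size()][docTermFreqs.size()];
        for (int d = 0; d < docTermFreqs.size(); d++) {
            Map<String, Integer> doc = docTermFreqs.get(d);
            for (int t = 0; t < terms.size(); t++) {
                counts[t][d] = doc.getOrDefault(terms.get(t), 0);
            }
        }
    }
}
```

The dense int[][] is only meant to show the shape of the data. For a real corpus the matrix is very sparse, so a sparse representation (and some weighting such as tf-idf before the SVD step) would probably be the better choice.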

Hope that helps.

    Erik


On Aug 18, 2005, at 10:42 AM, Sebastian Menge wrote:

Hi

I want to build a search engine based on LSA (latent semantic analysis).

How much of Lucene's functionality could be reused? Could I use Lucene's
index to build up the "term by document" matrix? And of course, why?

TIA, Sebastian

