PS. I've already been reading through SparseVectorsFromSequenceFiles.java; I'm just trying to figure out whether I can get there faster by asking for more starting points to look at.
Thanks in advance.
-Dmitriy

On Thu, Dec 30, 2010 at 12:56 PM, Dmitriy Lyubimov <[email protected]> wrote:
> Hi,
>
> I would like to try LSI processing of results produced by seq2sparse.
>
> What's more, I need to be able to fold-in a bunch of new documents
> afterwards.
>
> Is there any support for fold-in indexing in Mahout?
>
> if not, is there a quick way for me to gain the understanding of seq2sparse
> output?
> In particular, if i wanted to add fold-in indexing, i need to be able to
> produce TF or TF-IDF of the new document on the fly using pre-existing
> dictionary and word counts. What's the api for this dictionary?
>
> Thank you.
> -Dmitriy
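
To make the question above concrete, here is a rough sketch of the weighting step that fold-in would need, in plain Java. It does not use any Mahout API: the class name, the method, the in-memory maps standing in for the dictionary and document-frequency counts, and the tf * ln(N/df) weighting are all assumptions for illustration, and whatever seq2sparse actually computes may differ.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Mahout: weights one new document
// against a dictionary and document-frequency table built earlier.
public class FoldInTfIdf {

    /**
     * @param tokens     tokens of the new document (already analyzed)
     * @param dictionary term -> term id, assumed loaded from the existing dictionary
     * @param docFreq    term id -> number of training documents containing the term
     * @param numDocs    number of documents in the original corpus
     * @return term id -> TF-IDF weight for the new document
     */
    public static Map<Integer, Double> tfidf(List<String> tokens,
                                             Map<String, Integer> dictionary,
                                             Map<Integer, Long> docFreq,
                                             long numDocs) {
        // Count term frequencies, skipping terms missing from the dictionary:
        // fold-in cannot add new dimensions to an existing term space.
        Map<Integer, Integer> tf = new HashMap<>();
        for (String token : tokens) {
            Integer id = dictionary.get(token);
            if (id != null) {
                tf.merge(id, 1, Integer::sum);
            }
        }

        // Assumed weighting: tf * ln(N / df). Other variants exist.
        Map<Integer, Double> weights = new HashMap<>();
        for (Map.Entry<Integer, Integer> e : tf.entrySet()) {
            long df = docFreq.getOrDefault(e.getKey(), 0L);
            if (df == 0) {
                continue; // no corpus statistics for this term id
            }
            double idf = Math.log((double) numDocs / df);
            weights.put(e.getKey(), e.getValue() * idf);
        }
        return weights;
    }

    public static void main(String[] args) {
        Map<String, Integer> dictionary = new HashMap<>();
        dictionary.put("lsi", 0);
        dictionary.put("mahout", 1);

        Map<Integer, Long> docFreq = new HashMap<>();
        docFreq.put(0, 1L);
        docFreq.put(1, 10L);

        System.out.println(
            tfidf(List.of("lsi", "lsi", "mahout", "unknown"), dictionary, docFreq, 10L));
    }
}
```

The missing piece is how to load the real dictionary and counts into those two maps, which is what I am asking about.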
