Thanks, Ted, for your quick response. I am concerned about having different sets of vectors because I want to run LDA over the vector outputs, and from looking at the LDA code I'm not sure whether I can update this incrementally. I'll have to take another look, but I wanted to clarify this before running a large indexing job over my data.
On Thu, Jul 16, 2009 at 10:42 AM, Ted Dunning <[email protected]> wrote:
> My first inclination is to think that it will just work if you generate
> vectors from each index independently.
>
> The only failure mode that I can think of would be due to the same term
> labels in vectors possibly being assigned different indexes.
>
> Jeff, Grant, will the labels take precedence over the index when loading a
> vector?
>
> On Thu, Jul 16, 2009 at 9:15 AM, Florian Leibert <[email protected]> wrote:
>
> > Hi,
> > I have a sharded Lucene index that spans about 400 GB and am wondering if I
> > can create the vectors (via the patch specified in MAHOUT-126) on this
> > sharded index?
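
To make the failure mode discussed above concrete (the same term label ending up with a different index in each shard), here is a minimal sketch in plain Python. It is not Mahout's API and not what the MAHOUT-126 patch does; the names build_dictionary and vectorize and the toy shards are hypothetical. It assumes one term-to-index dictionary is built across all shards first, and that every shard is then vectorized against that shared dictionary.

```python
from collections import Counter

def build_dictionary(shards):
    """Assign one global index per term, across all shards (hypothetical helper)."""
    terms = sorted({term for shard in shards for doc in shard for term in doc})
    return {term: i for i, term in enumerate(terms)}

def vectorize(doc, dictionary):
    """Build a sparse term-frequency vector keyed by the global term index."""
    counts = Counter(doc)
    return {dictionary[term]: count for term, count in counts.items()}

# Toy stand-ins for two index shards, each holding one tokenized document.
shard_a = [["lucene", "index", "index"]]
shard_b = [["mahout", "index"]]

dictionary = build_dictionary([shard_a, shard_b])
vec_a = vectorize(shard_a[0], dictionary)
vec_b = vectorize(shard_b[0], dictionary)

# "index" maps to the same position in both vectors, so the two sets of
# vectors can be combined before running something like LDA over them.
print(dictionary)
print(vec_a, vec_b)
```

If each shard built its own dictionary instead, "index" could land at a different position in each shard, and the resulting vectors would not be comparable.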
