Great question, let me check on that. Unfortunately I don't have direct control over the indexing process, so I can't confirm right away, but I'll post an update in the morning.
Thanks for the tip.

Chris

On Mon, May 2, 2011 at 6:36 PM, Jake Mannix <[email protected]> wrote:
> Were your lucene indexes created with term vectors enabled?
>
> On May 2, 2011 3:05 PM, "Chris McConnell" <[email protected]> wrote:
>
> Hello all,
>
> We are looking at utilizing LDA for some topic trending off some
> pre-built Lucene indexes. I've put the command(s) and output below.
> While searching, it seems a lot of people are unable to get this to
> work properly. Most answers tell the user to review the example
> "build-reuters.sh" but that doesn't utilize a Lucene index for the
> input.
>
> The dictionary is created (on local disk) and an attempt at vector
> creation is done on HDFS, however no vectors are written out. I'm
> interested to know if anyone has actually gotten this to work on
> Mahout 0.4. I have (just for testing purposes) then tried to run the
> actual LDA on the created directories, however I wouldn't expect it to
> work since there are no vectors created.
>
> Thanks,
> Chris
>
> bin/mahout lucene.vector --dir /home/index_for_mahout/ --output /user/vectored_lucene_index --dictOut /home/vectored_lucene_index/dict.out --weight TF --field content
> 11/05/02 17:23:57 INFO lucene.Driver: Output File: /user/vectored_lucene_index
> 11/05/02 17:23:57 INFO util.NativeCodeLoader: Loaded the native-hadoop library
> 11/05/02 17:23:57 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
> 11/05/02 17:23:57 INFO compress.CodecPool: Got brand-new compressor
> 11/05/02 17:23:58 INFO lucene.Driver: Wrote: 0 vectors
> 11/05/02 17:23:58 INFO lucene.Driver: Dictionary Output file: /home/vectored_lucene_index/dict.out
> 11/05/02 17:23:58 INFO driver.MahoutDriver: Program took 578 ms
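
Since the driver exits normally even when nothing is written, it may help to check the vector count in a script before moving on to LDA. The following is only a rough sketch: it assumes the log line keeps the "lucene.Driver: Wrote: N vectors" format shown in the quoted output, and the function names `vectors_written` and `check_vectors` are made up for illustration (they are not part of Mahout).

```shell
#!/bin/sh
# Hypothetical helper (not part of Mahout): read driver log text on stdin
# and print the number from the last "lucene.Driver: Wrote: N vectors" line.
vectors_written() {
  sed -n 's/.*lucene\.Driver: Wrote: \([0-9][0-9]*\) vectors.*/\1/p' | tail -n 1
}

# Hypothetical helper: return 0 only if at least one vector was written.
# Returns 2 if no matching log line is found, 1 if the count is zero.
check_vectors() {
  count=$(vectors_written)
  if [ -z "$count" ]; then
    echo "no 'Wrote: N vectors' line found in the log" >&2
    return 2
  fi
  if [ "$count" -eq 0 ]; then
    # Term vectors are one thing to check, per the question raised in this thread.
    echo "0 vectors written; check whether the field was indexed with term vectors" >&2
    return 1
  fi
  echo "$count vectors written"
}
```

To use it, pipe the output of the same `bin/mahout lucene.vector` command (with `2>&1` appended so the log lines are captured) into `check_vectors`, and only start the LDA step if it returns 0.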
