Hi Abdelhakim,

Thanks for the link! I ended up saving the "Dataset" to a file after loading the training data, and it works perfectly :)
Yuhan

On Tue, Aug 14, 2012 at 12:44 PM, Abdelhakim Deneche <[email protected]> wrote:

> Hi Yuhan,
>
> If you follow the example in the wiki page:
>
> https://cwiki.apache.org/MAHOUT/partial-implementation.html
>
> you should have generated a descriptor file for your training data using
> the "Describe" tool. Because the test data must have similar attributes,
> you can just load that descriptor using
>
> Dataset.load()
>
> On 14 August 2012, at 19:39, Yuhan Zhang <[email protected]> wrote:
>
> > Hi Deneche and Chyi-Kwei,
> >
> > Thanks for the suggestion. I'm reading the "TestForest" class (
> > https://github.com/apache/mahout/blob/trunk/core/src/main/java/org/apache/mahout/classifier/df/mapreduce/Classifier.java
> > )
> >
> > Looks like I'm using a different method to load the dataset:
> > Dataset dataset = DataLoader.generateDataset(pattern, false, trainDataArray);
> >
> > If I use a similar way to load the dataset for testing, it gives me
> > "unknown" as a result (but works correctly if I use the trainData as
> > dataset):
> > Dataset dataset = DataLoader.generateDataset(pattern, false, testData);
> >
> > while the example code is loading using:
> > Dataset.load(conf, datasetPath)
> >
> > Looks like it has to do with the way I'm generating the test dataset.
> > Thanks for the help. Will look more into it.
> >
> > Yuhan
> >
> > On Mon, Aug 13, 2012 at 9:28 PM, deneche abdelhakim <[email protected]> wrote:
> >
> >> Last time I checked it was still working in the latest version
> >>
> >> On Tue, Aug 14, 2012 at 4:26 AM, chyi-kwei yau <[email protected]> wrote:
> >>
> >>> Hi Yuhan,
> >>>
> >>> You can run "BuildForest" on your train data and "TestForest" on your
> >>> testing data.
> >>>
> >>> You can check the example here:
> >>> https://cwiki.apache.org/MAHOUT/partial-implementation.html
> >>>
> >>> I use Mahout 0.6 and it works for me, but I'm not sure it will work in
> >>> other Mahout versions.
> >>>
> >>> Best,
> >>> Chyi-Kwei Yau
> >>>
> >>> On Mon, Aug 13, 2012 at 8:11 PM, Yuhan Zhang <[email protected]> wrote:
> >>>> some typo in the last email:
> >>>> name of the method is decisionForest.classify(dataset, random, Instance)
> >>>>
> >>>> if a dataset other than the training dataset is given, it will result in
> >>>> IllegalArgumentException: values not found for attribute 1.
> >>>>
> >>>> need some help here.
> >>>>
> >>>> Yuhan
> >>>>
> >>>> On Mon, Aug 13, 2012 at 5:03 PM, Yuhan Zhang <[email protected]> wrote:
> >>>>
> >>>>> Hi all,
> >>>>>
> >>>>> I'm trying to train a decision forest, save it to file, and use it
> >>>>> later. I have managed to write a trained decision forest to file using
> >>>>> "DecisionForest.write( dataOutPut ) ";
> >>>>>
> >>>>> but when I load a saved decision tree from file to classify, I realized
> >>>>> the method
> >>>>> DecisionForest.classifier(Dataset, random, Instance) is expecting the
> >>>>> original training Dataset.
> >>>>>
> >>>>> Is there a way to avoid loading the training Dataset? It is kind of
> >>>>> large, and I'd like to avoid loading it.
> >>>>>
> >>>>> Thank you
> >>>>>
> >>>>> Yuhan
> >>>>
> >>>> --
> >>>> Yuhan Zhang
> >>>> Senior Software Engineer
> >>>> OneScreen Inc.
> >>>> [email protected] <[email protected]>
> >>>> www.onescreen.com
> >>>> (949) 525-4825 Ext: 177
> >>>>
> >>>> The information contained in this e-mail is for the exclusive use of the
> >>>> intended recipient(s) and may be confidential, proprietary, and/or legally
> >>>> privileged. Inadvertent disclosure of this message does not constitute a
> >>>> waiver of any privilege. If you receive this message in error, please do
> >>>> not directly or indirectly print, copy, retransmit, disseminate, or
> >>>> otherwise use the information. In addition, please delete this e-mail and
> >>>> all copies and notify the sender.
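For anyone finding this thread later, here is a minimal sketch of why reusing the saved descriptor helps. It does not use Mahout's classes: `DescriptorSketch`, `Descriptor`, `fromColumn`, `codeOf`, `save` and `load` are hypothetical stand-ins made up for illustration, and the assumption (not confirmed in this thread) is that a dataset descriptor records the known values of each categorical attribute, so a descriptor rebuilt from test data alone can assign different codes than the one the forest was trained with.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class DescriptorSketch {

    /** Hypothetical stand-in for a descriptor: the known values of one categorical attribute. */
    static final class Descriptor {
        final List<String> values;

        Descriptor(List<String> values) {
            this.values = new ArrayList<>(values);
        }

        /** Collects the distinct values of a column in order of first appearance. */
        static Descriptor fromColumn(String... column) {
            List<String> seen = new ArrayList<>();
            for (String v : column) {
                if (!seen.contains(v)) {
                    seen.add(v);
                }
            }
            return new Descriptor(seen);
        }

        /** Maps a raw value to its integer code; fails for values the descriptor has never seen. */
        int codeOf(String value) {
            int code = values.indexOf(value);
            if (code < 0) {
                throw new IllegalArgumentException("value not found for attribute: " + value);
            }
            return code;
        }

        /** Writes one value per line (values must not contain line breaks). */
        void save(Path file) {
            try {
                Files.write(file, values);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        /** Reads the values back in the same order, so the codes are preserved. */
        static Descriptor load(Path file) {
            try {
                return new Descriptor(Files.readAllLines(file));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Build the descriptor once, from the training column, and persist it.
        Descriptor fromTraining = Descriptor.fromColumn("sunny", "rainy", "overcast", "rainy");
        Path file = Files.createTempFile("descriptor", ".txt");
        fromTraining.save(file);

        // At classification time only the small descriptor file is needed.
        Descriptor reloaded = Descriptor.load(file);
        System.out.println(reloaded.codeOf("overcast")); // prints 2, the code used in training

        // A descriptor rebuilt from test data alone assigns a different code.
        Descriptor fromTestOnly = Descriptor.fromColumn("overcast");
        System.out.println(fromTestOnly.codeOf("overcast")); // prints 0

        Files.delete(file);
    }
}
```

In this toy version the saved file is tiny compared with the training data, which matches the original goal of not reloading the full training set just to classify new instances.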
