Hi Yuhan,

If you followed the example on the wiki page:

https://cwiki.apache.org/MAHOUT/partial-implementation.html

you should already have generated a descriptor file for your training data
with the "Describe" tool. Because the test data must have the same attributes
as the training data, you can just load that descriptor using

Dataset.load()
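
For intuition on why the descriptor has to come from the training data, here
is a toy sketch in plain Java. It is not Mahout code: the Descriptor class and
its fromRows/codeOf/save/load methods are made up for illustration, and it
assumes the descriptor records the categorical values seen when it was built.
Under that assumption, a descriptor rebuilt from the test data can give the
same label a different code, or not know the label at all:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy illustration only; none of these names exist in Mahout. */
public class DescriptorSketch {

  /** Hypothetical descriptor: the known values of one categorical attribute. */
  static final class Descriptor {
    private final List<String> values;

    Descriptor(List<String> values) {
      this.values = new ArrayList<>(values);
    }

    /** Builds a descriptor by scanning the data, keeping first-seen order. */
    static Descriptor fromRows(List<String> rows) {
      List<String> seen = new ArrayList<>();
      for (String row : rows) {
        if (!seen.contains(row)) {
          seen.add(row);
        }
      }
      return new Descriptor(seen);
    }

    /** Returns the integer code of a label, or -1 if the label is unknown. */
    int codeOf(String label) {
      return values.indexOf(label);
    }

    /** Serializes to one line (labels must not contain commas in this toy). */
    String save() {
      return String.join(",", values);
    }

    /** Restores a descriptor from the line produced by save(). */
    static Descriptor load(String line) {
      return new Descriptor(Arrays.asList(line.split(",")));
    }
  }

  public static void main(String[] args) {
    List<String> train = Arrays.asList("red", "green", "blue", "green");
    List<String> test = Arrays.asList("blue", "red");

    // Describe the training data once and keep only the small descriptor.
    String saved = Descriptor.fromRows(train).save();

    Descriptor reloaded = Descriptor.load(saved);      // reuse at test time
    Descriptor fromTest = Descriptor.fromRows(test);   // rebuilt from test data

    // The model was trained against the first encoding, not the second.
    System.out.println("blue via training descriptor: " + reloaded.codeOf("blue"));
    System.out.println("blue via test descriptor:     " + fromTest.codeOf("blue"));
    System.out.println("green via test descriptor:    " + fromTest.codeOf("green"));
  }
}
```

The saved descriptor should be much smaller than the training set, so
reloading it avoids reading the training data again.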




On 14 Aug 2012, at 19:39, Yuhan Zhang <[email protected]> wrote:

> Hi Deneche and Chyi-Kwei,
> 
> Thanks for the suggestion. I'm reading the "TestForest" class (
> https://github.com/apache/mahout/blob/trunk/core/src/main/java/org/apache/mahout/classifier/df/mapreduce/Classifier.java
> )
> 
> 
> Looks like I'm using a different method to load the dataset
>    Dataset dataset = DataLoader.generateDataset(pattern, false, trainDataArray);
> 
> If I load the dataset for testing in a similar way, it gives me
> "unknown" as a result (but it works correctly if I use the trainData as
> the dataset):
>   Dataset dataset = DataLoader.generateDataset(pattern, false, testData );
> 
> while the example code is loading using:
>    Dataset.load(conf, datasetPath)
> 
> Looks like it has to do with the way I'm generating the test dataset.
> Thanks for the help. I will look more into it.
> 
> Yuhan
> 
> On Mon, Aug 13, 2012 at 9:28 PM, deneche abdelhakim <[email protected]> wrote:
> 
>> Last time I checked it was still working in the latest version
>> 
>> On Tue, Aug 14, 2012 at 4:26 AM, chyi-kwei yau <[email protected]> wrote:
>> 
>>> Hi Yuhan,
>>> 
>>> You can run "BuildForest" on your train data and "TestForest" on your
>>> testing data.
>>> 
>>> You can check the example here:
>>> https://cwiki.apache.org/MAHOUT/partial-implementation.html
>>> 
>>> I use Mahout 0.6 and it works for me, but I'm not sure it will work in
>>> other Mahout versions.
>>> 
>>> Best,
>>> Chyi-Kwei Yau
>>> 
>>> On Mon, Aug 13, 2012 at 8:11 PM, Yuhan Zhang <[email protected]> wrote:
>>>> A typo in my last email: the name of the method is
>>>> decisionForest.classify(dataset, random, Instance)
>>>> 
>>>> If a dataset other than the training dataset is given, it results in
>>>> IllegalArgumentException: values not found for attribute 1.
>>>> 
>>>> I need some help here.
>>>> 
>>>> Yuhan
>>>> 
>>>> On Mon, Aug 13, 2012 at 5:03 PM, Yuhan Zhang <[email protected]> wrote:
>>>> 
>>>>> Hi all,
>>>>> 
>>>>> I'm trying to train a decision forest, save it to a file, and use it
>>>>> later.
>>>>> I have managed to write a trained decision forest to file using
>>>>> "DecisionForest.write( dataOutPut ) ";
>>>>> 
>>>>> but when I load a saved decision tree from file to classify, I realized
>>>>> that the method
>>>>> DecisionForest.classifier(Dataset, random, Instance) is expecting the
>>>>> original training Dataset.
>>>>> 
>>>>> Is there a way to avoid loading the training Dataset? It is kind of
>>>>> large, and I'd like to avoid loading it.
>>>>> 
>>>>> 
>>>>> Thank you
>>>>> 
>>>>> Yuhan
>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Yuhan Zhang
>>>> Senior Software Engineer
>>>> OneScreen Inc.
>>>> [email protected] <[email protected]>
>>>> www.onescreen.com
>>>> (949) 525-4825 Ext: 177
>>>> 
>>>> 
>>>> The information contained in this e-mail is for the exclusive use of the
>>>> intended recipient(s) and may be confidential, proprietary, and/or legally
>>>> privileged. Inadvertent disclosure of this message does not constitute a
>>>> waiver of any privilege.  If you receive this message in error, please do
>>>> not directly or indirectly print, copy, retransmit, disseminate, or
>>>> otherwise use the information. In addition, please delete this e-mail and
>>>> all copies and notify the sender.
>>> 
>> 
> 
> 
> 
