[ 
https://issues.apache.org/jira/browse/MAHOUT-602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12989968#comment-12989968
 ] 

Lance Norskog commented on MAHOUT-602:
--------------------------------------

There is an off-by-one error somewhere. The code writes two files, each 
containing the requested number of trees, instead of one file. 

To make this easier to inspect, I requested 10 trees instead of 100. Two files 
of trees are still created instead of one. The patch prints the hashCode() of 
each tree's toString() output, which shows that the two files contain different 
trees. The hash value for each tree is in the attached log 10_hashCode.log, and 
10_toString.log has the actual string dump of each tree. 
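
The comparison described above can be sketched in a few lines of Java. This is 
only an illustration, not the code in the attached patch: the class name 
TreeDumpCompare, both method names, and the sample tree strings are 
hypothetical, and it assumes each tree's toString() dump is already available 
as a String. Differing hash codes prove the dumps differ; equal hash codes only 
suggest they match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Mahout or of the attached patch.
public class TreeDumpCompare {

    // Hash each tree's string dump, one hash per tree, in file order.
    public static List<Integer> hashes(List<String> treeDumps) {
        List<Integer> result = new ArrayList<>();
        for (String dump : treeDumps) {
            result.add(dump.hashCode());
        }
        return result;
    }

    // Count the distinct hash values that occur in both lists.
    public static int sharedCount(List<Integer> a, List<Integer> b) {
        Set<Integer> common = new HashSet<>(a);
        common.retainAll(new HashSet<>(b));
        return common.size();
    }

    public static void main(String[] args) {
        // Made-up dumps standing in for the trees read from the two output files.
        List<String> fileA = Arrays.asList("tree-a1", "tree-a2");
        List<String> fileB = Arrays.asList("tree-b1", "tree-b2");

        List<Integer> hashesA = hashes(fileA);
        List<Integer> hashesB = hashes(fileB);

        System.out.println("file A hashes: " + hashesA);
        System.out.println("file B hashes: " + hashesB);
        System.out.println("shared hashes: " + sharedCount(hashesA, hashesB));
    }
}
```

If the shared count is 0, the two files hold disjoint sets of trees rather than 
two copies of the same forest.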

To reproduce the experiment, apply the attached patch 
PartialImplementationBug1.patch. Try different numbers of trees: it always 
writes two files of N trees instead of one.

This was the command line, as per the wiki:
$HADOOP_HOME/bin/hadoop jar \
  /Users/lancenorskog/Documents/open/mahout/examples/target/mahout-examples-0.5-SNAPSHOT-job.jar \
  org.apache.mahout.df.mapreduce.BuildForest -Dmapred.max.split.size=1874231 \
  -oob -d ../../datasets/KDDTrain/KDDTrain+_20Percent.arff \
  -ds ../../datasets/KDDTrain/KDDTrain+_20Percent.info -sl 5 -p -t 10 -o nsl-forest


> "Partial Implementation" throws exceptions
> ------------------------------------------
>
>                 Key: MAHOUT-602
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-602
>             Project: Mahout
>          Issue Type: Bug
>         Environment: Mac OS X
> java version "1.6.0_22"
> Java(TM) SE Runtime Environment (build 1.6.0_22-b04-307-10M3261)
> Java HotSpot(TM) 64-Bit Server VM (build 17.1-b03-307, mixed mode)
>            Reporter: Lance Norskog
>         Attachments: partialImp_fullKDD_errors.log
>
>
> The "Partial Implementation" described on the wiki page [Partial 
> Implementation|https://cwiki.apache.org/confluence/display/MAHOUT/Partial+Implementation]
>  fails with the given dataset and operations.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
