Hey, thanks for the quick response. If I understand you correctly, "every tree grown is trained on the whole dataset" means that all the features/variables are used for building the trees, whereas in the partial implementation we take a subset of the features/variables? Kindly correct me if I am wrong.
Thanks again.

Regards,
Akshay Nowal

-----Original Message-----
From: deneche abdelhakim [mailto:[email protected]]
Sent: Thursday, July 05, 2012 11:23 AM
To: [email protected]
Subject: Re: Difference when we don't use partial implementation

Hi Akshay,

when you don't use the "-p" parameter, the builder loads the whole dataset in memory in every computing node, so every tree grown is trained on the whole dataset (of course using bagging to select a subset of it).

When using "-p", every computing node loads a part of the dataset (thus the name "partial") so the trees are trained on parts of the dataset.

The training algorithm is the same in both implementations, and the partial implementation is used when the dataset is too big to fit in memory.

On Thu, Jul 5, 2012 at 4:38 AM, Nowal, Akshay <[email protected]> wrote:
> Hi All,
>
> I am running Decision Forest in Mahout. Below are the commands that I
> have used to run the algorithm:
>
> Info file:
>
> mahout org.apache.mahout.df.tools.Describe -p
> /user/an32665/KDD/KDDTrain+.arff -f /user/an32665/KDD/KDDTrain+.info -d
> N 3 C 2 N C 4 N C 8 N 2 C 19 N L
>
> Building Forest:
>
> mahout org.apache.mahout.df.mapreduce.BuildForest
> -Dmapred.max.split.size=1874231 -oob -d /user/an32665/KDD/KDDTrain+.arff
> -ds /user/an32665/KDD/KDDTrain+.info -sl 5 -p -t 100 -o nsl-forest
>
> Testing Forest:
>
> mahout org.apache.mahout.df.mapreduce.TestForest -i
> /user/an32665/KDD/KDDTest+.arff -ds /user/an32665/KDD/KDDTrain+.info -m
> nsl-forest -a -mr -o predictions
>
> So while building the forest we use "-p" to select the partial
> implementation. I just wanted to know the difference in the algorithm
> when we use "-p" and when we don't use "-p".
>
> Regards,
> Akshay Nowal
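
For anyone reading this thread later, here is a rough Python sketch of the distinction described in the reply, as I read it. It is not Mahout code: the function names (`bootstrap`, `in_memory_samples`, `partial_samples`) and the toy row layout are made up for illustration, and the even split into partitions is an assumption (in the real job the split sizes come from Hadoop). The point it tries to show is that "-p" changes which rows (instances) a tree can draw from, not which features: every row keeps all of its attributes in both cases. As far as I understand, the random choice of attributes at each tree node is controlled separately by "-sl".

```python
import random


def bootstrap(rows, rng):
    """Bagging: draw len(rows) rows with replacement from the given rows."""
    return [rng.choice(rows) for _ in rows]


def in_memory_samples(dataset, n_trees, seed=0):
    """Without -p (sketch): each tree's bag is drawn from the whole dataset."""
    rng = random.Random(seed)
    return [bootstrap(dataset, rng) for _ in range(n_trees)]


def partial_samples(dataset, n_partitions, trees_per_partition, seed=0):
    """With -p (sketch): each tree's bag is drawn from one partition only.

    Assumption: the dataset is cut into contiguous, roughly equal chunks
    (at most n_partitions of them).
    """
    rng = random.Random(seed)
    size = -(-len(dataset) // n_partitions)  # ceiling division
    partitions = [dataset[i:i + size] for i in range(0, len(dataset), size)]
    return [
        bootstrap(part, rng)
        for part in partitions
        for _ in range(trees_per_partition)
    ]


if __name__ == "__main__":
    # Toy dataset: 12 rows, each a tuple of (row_id, feature_a, feature_b).
    data = [(i, i * 2, i % 3) for i in range(12)]

    full = in_memory_samples(data, n_trees=2)
    part = partial_samples(data, n_partitions=3, trees_per_partition=1)

    print("row ids seen by an in-memory tree:", sorted({r[0] for r in full[0]}))
    print("row ids seen by a partial tree:   ", sorted({r[0] for r in part[0]}))
```

In the sketch, a tree built in partial mode can only ever see row ids from its own chunk, while a tree built in memory can see any row, but in both modes each row still carries all three columns.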
