Re: [R] rpart.object help
On Mon, 2010-12-13 at 01:55 -0800, jagdeesh_mn wrote:
> Prof Brian Ripley wrote:
> > <snip />
>
> Thanks Mr. Brian. That kind of answers my query. On the same note I
> would like to ask a few other questions. Sorry if you find them naive;
> I am a novice in this subject and am trying to get a grip on things.
>
> 1. I am using the rpart package in my code and the fitted object looks
> like this:
>
> The Model representation :
> n= 60
>
> node), split, n, deviance, yval
>       * denotes terminal node
>
> 1) root 60 983551500 12615.670
>   2) dataFrame[, 6]='Small' 13 21804710 7682.385 *
>   3) dataFrame[, 6]='Compact','Large','Medium','Sporty','Van' 47 557851600 13980.190
>     6) dataFrame[, 3]='Japan/USA','Korea','USA' 29 13105 12673.030
>       12) dataFrame[, 6]='Compact','Sporty' 14 11426050 11055.570 *
>       13) dataFrame[, 6]='Large','Medium','Van' 15 48812470 14182.670 *
>     7) dataFrame[, 3]='France','Germany','Japan','Sweden' 18 297418200 16086.170 *
>
> What does the term deviance here stand for?

At this point, go and read up on the theory of classification and
regression trees. Depending on how you fitted your tree (which options
you used, what type of response you modelled), the deviance is computed
in different ways. In short, it is a measure of how impure each node of
the tree is.

See the References section of ?rpart

HTH

G

> 2. Could you also suggest some readings on the topic of CnR trees
> specific to R, with case studies?
>
> Regards,
> Jagdeesh

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
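
To make the "deviance" answer above a little more concrete, here is a minimal sketch for the regression case (a tree whose header reads "deviance, yval", i.e. method = "anova"). In that case the deviance of a node is the sum of squared differences between the responses in the node and the node mean. The data set car.test.frame (shipped with rpart) and the formula are assumptions chosen for illustration; they are not taken from the poster's code.

```r
library(rpart)

# Assumption: car.test.frame stands in for the poster's data frame.
fit <- rpart(Price ~ Country + Type, data = car.test.frame, method = "anova")

y <- car.test.frame$Price

# Root node: deviance is the total sum of squares about the overall mean.
root_ss <- sum((y - mean(y))^2)

# Terminal nodes: fit$where holds, for each observation, the row of
# fit$frame of the leaf it falls into, so we can recompute each leaf's
# sum of squares by hand.
leaf_ss  <- tapply(y, fit$where, function(v) sum((v - mean(v))^2))
leaf_row <- as.integer(names(leaf_ss))

# Side-by-side comparison of rpart's stored deviance and the hand computation.
cbind(rpart = fit$frame$dev[leaf_row], by_hand = as.vector(leaf_ss))
```

For a classification tree (header "loss, yval, (yprob)") the same column of fit$frame holds the number of misclassified observations in the node instead, which is why the two printouts in this thread use different column names.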
Re: [R] rpart.object help
On Sun, 12 Dec 2010, jagdeesh_mn wrote:
> Hi,
>
> Suppose I have generated an object using the following:
>
> fit <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis)
>
> And when I print fit, I get the following:
>
> n= 81
>
> node), split, n, loss, yval, (yprob)
>       * denotes terminal node
>
>  1) root 81 17 absent (0.7901235 0.2098765)
>    2) Start>=8.5 62 6 absent (0.9032258 0.0967742)
>      4) Start>=14.5 29 0 absent (1.000 0.000) *
>      5) Start< 14.5 33 6 absent (0.8181818 0.1818182)
>       10) Age< 55 12 0 absent (1.000 0.000) *
>       11) Age>=55 21 6 absent (0.7142857 0.2857143)
>         22) Age>=111 14 2 absent (0.8571429 0.1428571) *
>         23) Age< 111 7 3 present (0.4285714 0.5714286) *
>    3) Start< 8.5 19 8 present (0.4210526 0.5789474) *
>
> Is it possible to extract the splits alone as a matrix using
> rpart.object? If so, how?
>
> Regards,
> Jagdeesh

The best description of the rpart object is obtained with
help(rpart.object). Each row of $frame describes one node of the tree,
including its primary split. More detailed descriptions of the
(1 + ncompete + nsurrogate) split variables for the node are found in
the $splits and $csplit components. You would need to look at
summary.rpart to see how that is all indexed. I would suggest grabbing a
copy of the source code, since that contains comments, which are
stripped out when you print the R internal version.

Terry Therneau
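
The indexing described above can be sketched as follows. This is an illustration under assumptions, not code from the rpart sources: it assumes that the rows of fit$splits are stored node by node in the same order as the rows of fit$frame, with the primary split first, followed by the competitor and surrogate splits, and that terminal nodes contribute no rows. The variable names (is_leaf, n_rows, first_row, primary_splits) are made up for this example.

```r
library(rpart)

fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)
ff  <- fit$frame

# Terminal nodes are marked "<leaf>" in the var column of the frame.
is_leaf <- ff$var == "<leaf>"

# Number of rows each node occupies in fit$splits:
# 1 primary + competitors + surrogates, and none for a leaf.
n_rows <- ifelse(is_leaf, 0, 1 + ff$ncompete + ff$nsurrogate)

# Row of fit$splits at which each node's block starts.
first_row <- cumsum(c(1, n_rows))[seq_len(nrow(ff))]

# Keep only the primary split of every internal node.
primary <- fit$splits[first_row[!is_leaf], , drop = FALSE]
vars <- rownames(primary)
rownames(primary) <- NULL   # variable names repeat, so drop them as row names

primary_splits <- data.frame(
  node = as.integer(rownames(ff)[!is_leaf]),  # node numbers as printed
  var  = vars,
  primary
)
primary_splits
```

For a continuous variable the index column is the cut point; for a factor it is instead a row number into fit$csplit, so the two cases need to be handled separately if the tree contains categorical splits.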
Re: [R] rpart.object help
On Sun, 12 Dec 2010, jagdeesh_mn wrote:
> Hi,
>
> Suppose I have generated an object using the following:
>
> fit <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis)
>
> And when I print fit, I get the following:
>
> n= 81
>
> node), split, n, loss, yval, (yprob)
>       * denotes terminal node
>
>  1) root 81 17 absent (0.7901235 0.2098765)
>    2) Start>=8.5 62 6 absent (0.9032258 0.0967742)
>      4) Start>=14.5 29 0 absent (1.000 0.000) *
>      5) Start< 14.5 33 6 absent (0.8181818 0.1818182)
>       10) Age< 55 12 0 absent (1.000 0.000) *
>       11) Age>=55 21 6 absent (0.7142857 0.2857143)
>         22) Age>=111 14 2 absent (0.8571429 0.1428571) *
>         23) Age< 111 7 3 present (0.4285714 0.5714286) *
>    3) Start< 8.5 19 8 present (0.4210526 0.5789474) *
>
> Is it possible to extract the splits alone as a matrix using
> rpart.object? If so, how?

What do you think 'rpart.object' is? There is no such function in R.
If you read help(rpart.object) it describes the returned object.

You are probably looking for fit$frame, but if you want something else,
study rpart:::print.rpart to see how that output is computed.

> Regards,
> Jagdeesh

-- 
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
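
As a rough illustration of the fit$frame suggestion above, the printed tree can be rebuilt as a data frame from the frame component plus the labels() method for rpart objects. The column names chosen for the result (node, split, n, loss, terminal) are arbitrary; this is one possible arrangement, not the way print.rpart itself assembles its output.

```r
library(rpart)

fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)

# One row per node; row names of fit$frame are the node numbers.
nodes <- data.frame(
  node     = as.integer(rownames(fit$frame)),
  split    = labels(fit),                 # text of the split leading to each node
  n        = fit$frame$n,                 # observations in the node
  loss     = fit$frame$dev,               # misclassified count for a class tree
  terminal = fit$frame$var == "<leaf>"    # TRUE for terminal nodes
)
nodes

# The unexported print method can be inspected with:
# rpart:::print.rpart
```

The first row corresponds to the root, so its n and loss should match the "root 81 17" line in the printout quoted above.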
Re: [R] rpart.object help
Prof Brian Ripley wrote:
> On Sun, 12 Dec 2010, jagdeesh_mn wrote:
> > <snip />
>
> What do you think 'rpart.object' is? There is no such function in R.
> If you read help(rpart.object) it describes the returned object.
>
> You are probably looking for fit$frame, but if you want something else,
> study rpart:::print.rpart to see how that output is computed.

Thanks Mr. Brian. That kind of answers my query. On the same note I
would like to ask a few other questions. Sorry if you find them naive;
I am a novice in this subject and am trying to get a grip on things.

1. I am using the rpart package in my code and the fitted object looks
like this:

The Model representation :
n= 60

node), split, n, deviance, yval
      * denotes terminal node

1) root 60 983551500 12615.670
  2) dataFrame[, 6]='Small' 13 21804710 7682.385 *
  3) dataFrame[, 6]='Compact','Large','Medium','Sporty','Van' 47 557851600 13980.190
    6) dataFrame[, 3]='Japan/USA','Korea','USA' 29 13105 12673.030
      12) dataFrame[, 6]='Compact','Sporty' 14 11426050 11055.570 *
      13) dataFrame[, 6]='Large','Medium','Van' 15 48812470 14182.670 *
    7) dataFrame[, 3]='France','Germany','Japan','Sweden' 18 297418200 16086.170 *

What does the term deviance here stand for?

2. Could you also suggest some readings on the topic of CnR trees
specific to R, with case studies?

Regards,
Jagdeesh

-- 
View this message in context: http://r.789695.n4.nabble.com/rpart-object-help-tp3085054p3085183.html
Sent from the R help mailing list archive at Nabble.com.