On Mon, 2 May 2016, Preetam Pal wrote:

Hi guys,

If I apply ctree() to a dataset (specifying some control parameters like
maxdepth), is there a way to programmatically access the (smaller)
datasets corresponding to the terminal nodes of the tree? Say there are
7 terminal nodes: I need those 7 datasets. (Of course, I could look at the
respective node-splitting attributes and write a filtering function, but
that is clearly too much work with a large number of terminal
nodes.) The intention is to fit a regression on each of these terminal
datasets.

If you use the "partykit" implementation you can do:

library("partykit")
ct <- ctree(Species ~ ., data = iris)
data_party(ct, id = 6)

to obtain the data associated with node 6 for example. You can also use ct[6] to obtain the subtree and ct[6]$data for its associated data.
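To collect the data for all terminal nodes at once rather than one ID at a time, something along the following lines should work. This is only a sketch: it assumes the ct object from above, and the name node_data is made up for illustration. nodeids() with terminal = TRUE returns the IDs of the terminal nodes.

```r
library("partykit")

ct <- ctree(Species ~ ., data = iris)

## IDs of the terminal nodes only
ids <- nodeids(ct, terminal = TRUE)

## one data frame per terminal node, named by node ID
## (node_data is a hypothetical name, not part of partykit)
node_data <- lapply(ids, function(i) data_party(ct, id = i))
names(node_data) <- ids

## number of observations in each terminal node
sapply(node_data, nrow)
```

Since the terminal nodes partition the learning sample, the row counts should add up to nrow(iris).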

To set up a factor with the terminal node IDs, you can also wrap predict(ct, type = "node") in factor() and then use that factor in lm() etc.
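A minimal sketch of that approach, fitting a separate regression within each terminal node. The regression Sepal.Length ~ Sepal.Width is an arbitrary choice for illustration only, and iris2, node and fits are hypothetical names:

```r
library("partykit")

ct <- ctree(Species ~ ., data = iris)

## attach the fitted terminal node ID to each observation as a factor
iris2 <- iris
iris2$node <- factor(predict(ct, type = "node"))

## fit one lm() per terminal node
## (the model formula is just an example, not taken from the question)
fits <- lapply(split(iris2, iris2$node), function(d) {
  lm(Sepal.Length ~ Sepal.Width, data = d)
})

## coefficients, one column per terminal node
sapply(fits, coef)
```

Alternatively, the node factor can enter a single lm() call as an interaction term, which gives the same node-wise fits in one model object.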

Finally, note that there is also lmtree() and glmtree() for trees with (generalized) linear models in their nodes.
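For example, a sketch with lmtree(), where the formula has the form y ~ x | z: the regressors go before the | and the partitioning variables after it. The particular choice of variables here is an assumption made purely for illustration:

```r
library("partykit")

## linear regression of Sepal.Length on Sepal.Width,
## partitioned with respect to the remaining variables
lt <- lmtree(Sepal.Length ~ Sepal.Width | Petal.Length + Petal.Width + Species,
  data = iris)

## tree structure with the fitted coefficients in the terminal nodes
print(lt)

## node-wise regression coefficients
coef(lt)
```

The difference to fitting lm() after ctree() is that here the splits are chosen with respect to the regression model itself rather than the response alone.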

Regards,
Preetam

--
Preetam Pal
(+91)-9432212774
M-Stat 2nd Year, Room No. N-114
Statistics Division, C.V.Raman Hall
Indian Statistical Institute, B.H.O.S.
Kolkata.


______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


