[R] Random forest regression: feedback on general approach and possible issues

Johannes Klene Fri, 04 Dec 2015 07:41:57 -0800

Hi all,
I'd like to use random forest regression to assess the importance of a
set of genes (binary predictors) for schizophrenia-related behavior (a
continuous measure). I am still reading up on this technique, but would
already really appreciate any feedback on whether my approach is valid.
Specifically, using the randomForest package, is it a good approach to
enter a few dozen binary predictors and assess their importance (as a set
and individually) for a continuous outcome with a sample size of ~1000
people?
More specific questions:
- I have an additional interest in interactions (though this is perhaps
not the best word in this context). Does it make sense to say something
about the influence one predictor has over the others by looking at the
change in their estimated importance when that predictor is removed from
the model?
- I have a few siblings in the data, i.e. non-independent observations. Is
this a problem and, if so, is there anything I can do about it?
- The few papers I have seen so far that use this technique in a similar
situation do not include any 'standard' covariates such as age and gender.
Should I?
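
To make the approach concrete, a minimal sketch of the fit and of the
drop-one-predictor comparison might look like the following. The data are
simulated, and all names and sizes (genes, behavior, g1 to g30, the effect
sizes) are placeholder assumptions rather than my actual variables.

```r
# Minimal sketch on simulated data; names, sizes and effects are assumptions.
library(randomForest)

set.seed(1)
n <- 1000   # approximate sample size
p <- 30     # "a few dozen" binary predictors
genes <- matrix(rbinom(n * p, 1, 0.3), nrow = n, ncol = p,
                dimnames = list(NULL, paste0("g", 1:p)))

# Hypothetical outcome: two main effects plus one pairwise interaction
behavior <- 0.5 * genes[, "g1"] + 0.5 * genes[, "g2"] +
  1.0 * genes[, "g3"] * genes[, "g4"] + rnorm(n)

# Full model; importance = TRUE requests permutation importance
fit_full <- randomForest(x = genes, y = behavior,
                         ntree = 500, importance = TRUE)
imp_full <- importance(fit_full, type = 1)[, 1]   # %IncMSE per predictor

# Refit without g3 and compare the importance of the remaining predictors
keep <- setdiff(colnames(genes), "g3")
fit_drop <- randomForest(x = genes[, keep], y = behavior,
                         ntree = 500, importance = TRUE)
imp_drop <- importance(fit_drop, type = 1)[, 1]

# Change in importance of each remaining predictor after dropping g3
change <- imp_drop - imp_full[keep]
head(sort(change), 5)
```

Importance estimates vary from run to run, so I assume any such comparison
would need to be repeated over several seeds before reading anything into
the differences.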
Any and all feedback is greatly appreciated!! Kind regards, Johannes


P.S. I hope I've come to the right place even though this is a more
general question; if not, please point me to a forum where it would be
better suited.


______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
