[R] Partial dependence plot in randomForest package (all flat responses)
Hi, I'm trying to make a partial dependence plot with the randomForest package in R. After fitting my random forest object, I call

partialPlot(data.rforest, pred.data=act2, x.var=centroid, C)

where data.rforest is my randomForest object, act2 is the original dataset, centroid is one of the predictors and C is one of the classes in my response variable. Whatever predictor or response class I try, I always get a plot with a straight line (a completely flat response). Similarly, if I set a categorical variable as the predictor, I get a barplot with all bars at the same height. I suppose I'm doing something wrong here, because all other analyses on the same rforest object seem correct (e.g. varImp or MDSplot). Could it be related to some option set in the random forest object? Can somebody see the problem here? Thanks for your time
-- View this message in context: http://r.789695.n4.nabble.com/Partial-dependence-plot-in-randomForest-package-all-flat-responses-tp4650470.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
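The post does not include enough to say what causes the flat line, so the following is only a sanity check, not a diagnosis. It fits a forest on the built-in iris data (a stand-in, since act2 is not available), names which.class explicitly, and uses plot = FALSE to inspect the numbers that partialPlot would draw. If the same call pattern gives a non-flat curve here but a flat one on your own data, the difference is more likely in the data or the fitted object than in the call itself.

```r
# Sketch only: iris stands in for the poster's data (act2 is not available).
if (requireNamespace("randomForest", quietly = TRUE)) {
  library(randomForest)
  set.seed(1)
  rf <- randomForest(Species ~ ., data = iris)

  # plot = FALSE returns the x/y values instead of drawing them
  pp <- partialPlot(rf, pred.data = iris, x.var = "Petal.Length",
                    which.class = "versicolor", plot = FALSE)

  print(range(pp$y))            # a flat response would give min == max
  print(levels(iris$Species))   # which.class should match one of these
}
```

On your own object, comparing range(pp$y) across several predictors shows whether the response is truly constant or only looks flat at the plotted scale.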
Re: [R] How to aggregate combinations
Thanks a lot, this is what I was looking for. All the best
-- View this message in context: http://r.789695.n4.nabble.com/How-to-aggregate-combinations-tp4631867p4631980.html
[R] How to aggregate combinations
Hi all, given a table like the one below, I want to get one vector for each group of connected IDs (IDs are considered connected if they are in the same row). Each vector should contain all the connected IDs. E.g., in this case:

vect1 (1,2,3)
vect2 (5,6)
vect3 (7,8,9)

ID ID2
1 2
1 3
2 3
6 5
7 8
8 9

Does someone know how to do it automatically for tables with thousands of rows? Thanks a lot
-- View this message in context: http://r.789695.n4.nabble.com/How-to-aggregate-combinations-tp4631867.html
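One possible way to do this in base R, offered as a sketch and not necessarily the answer that was given on the list: treat each row as an edge between two IDs and merge groups with a small union-find. The function name connected_groups and the data frame edges are made up for this illustration.

```r
# The example table from the post, one row per connected pair.
edges <- data.frame(ID  = c(1, 1, 2, 6, 7, 8),
                    ID2 = c(2, 3, 3, 5, 8, 9))

# Hypothetical helper: returns a list with one vector of IDs per group.
connected_groups <- function(from, to) {
  nodes  <- sort(unique(c(from, to)))
  parent <- seq_along(nodes)          # every node starts as its own root

  # Follow parent links until a root (a node that is its own parent) is found.
  find <- function(i) {
    while (parent[i] != i) i <- parent[i]
    i
  }

  a <- match(from, nodes)
  b <- match(to, nodes)
  for (k in seq_along(a)) {
    ra <- find(a[k])
    rb <- find(b[k])
    if (ra != rb) parent[max(ra, rb)] <- min(ra, rb)  # merge the two groups
  }

  roots <- vapply(seq_along(nodes), find, integer(1))
  unname(split(nodes, roots))
}

groups <- connected_groups(edges$ID, edges$ID2)
print(groups)
```

For the table above this gives the three groups (1,2,3), (5,6) and (7,8,9). The loop visits each row once, so a few thousand rows should not be a problem.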
Re: [R] Sampling problems
Hi, thank you. It works for vectors and matrices but not for data frames; it gives me this error message:

MeanA <- read.csv("MeanAmf.csv", header=T)
mysample <- MeanA[sample(1:nrow(MeanA), 20, replace=FALSE),]
remainder <- MeanA[-mysample]
Error in `[.default`(MeanA, -mysample) : invalid subscript type 'list'
In Ops.factor(left) : - not meaningful for factors

Is there any other way?
-- View this message in context: http://r.789695.n4.nabble.com/Sampling-problems-tp4453752p4455912.html
Re: [R] Sampling problems
Hi Sarah, it is not clear to me how to do that. Can you show me, please? Imagine I have a situation like this:

MeanA <- read.csv("MeanAmf.csv", header=T)
mysample <- MeanA[sample(1:nrow(MeanA), 20, replace=FALSE),]

Then?
-- View this message in context: http://r.789695.n4.nabble.com/Sampling-problems-tp4453752p4455921.html
Re: [R] Sampling problems
Thanks, but it doesn't work either; it gives me the same error message. It works only if my first sample is taken in this way:

mysample <- sample(1:nrow(MeanA), 20, replace=FALSE)

However, in this way it samples just the row numbers:

[1] 71 24 12 36 2 39 69 62 43 38 9 44 13 54 50 63 67 66 37 28

but not the data inside. I need to sample in this way:

mysample <- MeanA[sample(1:nrow(MeanA), 20, replace=FALSE),]

to get a sample like this:

HRkmMean.mf Mean.mfm Loc Diet Terr Soc Type Soc.Ter W.cat.0.25 W.cat.0.5
-2.49-0.432.57 A OT S D TS b 23 -2.050.67 T CN SD NS A

This is an example of my data frame.
-- View this message in context: http://r.789695.n4.nabble.com/Sampling-problems-tp4453752p4456048.html
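A sketch of how the two attempts can be combined: keep the sampled row numbers in their own variable, then use them twice, once to pull the sampled rows and once (negated) to pull everything else. Negative indexing needs row numbers, not a data frame. The small MeanA built here is a hypothetical stand-in, since MeanAmf.csv is not available; the column names are only loosely modelled on the post.

```r
# Hypothetical stand-in for MeanA (the real MeanAmf.csv is not available).
set.seed(42)
MeanA <- data.frame(HRkm = rnorm(75),
                    Loc  = sample(c("A", "T"), 75, replace = TRUE))

idx <- sample(nrow(MeanA), 20, replace = FALSE)  # row numbers only

mysample  <- MeanA[idx, ]    # the 20 sampled rows, all columns
remainder <- MeanA[-idx, ]   # every row that was not sampled

print(dim(mysample))
print(dim(remainder))
```

Note the comma in MeanA[-idx, ]: without it, a data frame is indexed by column instead of by row.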
[R] Simple solution
Hi everybody, thank you all for your suggestions, you have been very helpful. However, in the end I solved it this way:

mysample <- MaxDH[sample(1:nrow(MaxDH), 150, replace=FALSE),]
A <- mysample[1:120,]
B <- mysample[121:150,]

So simple in the end... Best, Luca
-- View this message in context: http://r.789695.n4.nabble.com/Sampling-problems-tp4453752p4456469.html
[R] Sampling problems
Hi, I need to randomly sample my dataset 1000 times. Each sample needs to be 80% of the data. I know how to do that; my problem is that I need not only the 80% but also the corresponding 20% each time. Is there any way to do that? Alternatively, I was thinking of something like the setdiff() function to compare my 80% sample to the original dataset and obtain the corresponding 20%. Unfortunately setdiff() works only for vectors; do you know a similar function for data frames? Thanks
-- View this message in context: http://r.789695.n4.nabble.com/Sampling-problems-tp4453752p4453752.html
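A minimal sketch of the repeated 80/20 split, assuming a made-up data frame dat in place of the real data. Each repetition draws the row numbers for the 80% part and takes the remaining rows as the 20% part, so no setdiff() on data frames is needed.

```r
# Hypothetical data; replace dat with the real data frame.
set.seed(1)
dat <- data.frame(id = 1:50, y = rnorm(50))

n_train <- round(0.8 * nrow(dat))   # size of the 80% part

# One list element per repetition, each holding both parts of the split.
splits <- replicate(1000, {
  idx <- sample(nrow(dat), n_train)
  list(train = dat[idx, ], test = dat[-idx, ])
}, simplify = FALSE)

print(nrow(splits[[1]]$train))
print(nrow(splits[[1]]$test))
```

If the data set is large, storing only idx for each repetition and subsetting when needed uses far less memory than keeping 1000 copies of the rows.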