[R] Partial dependence plot in randomForest package (all flat responses)

2012-11-22 Thread Oritteropus
Hi,
I'm trying to make a partial dependence plot with the randomForest package
in R. After I fit my random forest object, I type

partialPlot(data.rforest, pred.data=act2, x.var=centroid, "C")

where data.rforest is my randomForest object, act2 is the original dataset,
centroid is one of the predictors and C is one of the classes in my response
variable.
Whatever predictor or response class I try, I always get a plot with a
straight line (a completely flat response). Similarly, if I set a
categorical variable as the predictor, I get a barplot with all the bars at
the same height. I suppose I'm doing something wrong here, because all other
analyses on the same rforest object seem correct (e.g. varImp or MDSplot).
Could it be related to some option set in the random forest object? Can
somebody see the problem here?
Thanks for your time
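
One way to check a flat partial plot is to compute the partial dependence by hand and compare. The sketch below only illustrates the idea: hold one predictor fixed at each grid value, predict for every row, and average. It is not the internals of partialPlot. The helper name partial_dependence and the lm stand-in model on the built-in mtcars data are assumptions for illustration, not the poster's random forest or data.

```r
# Hypothetical helper: average the model's predictions over the data while
# one predictor is held fixed at each grid value in turn.
partial_dependence <- function(model, data, var, grid, predict_fun = predict) {
  vapply(grid, function(v) {
    tmp <- data
    tmp[[var]] <- v                      # overwrite the predictor for all rows
    mean(predict_fun(model, newdata = tmp))
  }, numeric(1))
}

# Stand-in model (lm on a built-in dataset), NOT the poster's random forest.
fit  <- lm(mpg ~ wt + hp, data = mtcars)
grid <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 5)
pd   <- partial_dependence(fit, mtcars, "wt", grid)

# A flat response would show up here as identical values along the grid.
print(data.frame(wt = grid, partial = pd))
```

If the hand-computed values vary along the grid while the plot stays flat, the problem is more likely in how the plotting call is specified than in the fitted model.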



--
View this message in context: 
http://r.789695.n4.nabble.com/Partial-dependence-plot-in-randomForest-package-all-flat-responses-tp4650470.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to aggregate combinations

2012-05-31 Thread Oritteropus
Thanks a lot, this is what I was looking for.
All the best

--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-aggregate-combinations-tp4631867p4631980.html


[R] How to aggregate combinations

2012-05-30 Thread Oritteropus
Hi all, 
Given a table like the one below, I want to get one vector for each group of
connected IDs (IDs are considered connected if they are in the same row).
Each vector should contain all the connected IDs.

e.g. In this case: vect1 (1,2,3), vect2 (5,6), vect3 (7,8,9)

ID ID2
1  2
1  3
2  3
6  5
7  8
8  9

Does someone know how to do it automatically for tables with thousands of
rows?
Thanks a lot
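
Grouping connected IDs is a connected-components problem on the graph whose edges are the table rows. The following is a minimal base-R sketch, not necessarily the answer given in the thread: connected_groups is a hypothetical helper that gives every ID a label and keeps passing the smallest label across each row until nothing changes.

```r
# Edge list from the post: each row links two IDs.
edges <- data.frame(ID  = c(1, 1, 2, 6, 7, 8),
                    ID2 = c(2, 3, 3, 5, 8, 9))

# Hypothetical helper: label connected components by propagating the
# smallest label across every edge until the labels stop changing.
connected_groups <- function(from, to) {
  ids   <- sort(unique(c(from, to)))
  label <- seq_along(ids)
  a <- match(from, ids)
  b <- match(to, ids)
  repeat {
    new_label <- label
    m <- pmin(label[a], label[b])        # smallest label on each edge
    for (k in seq_along(a)) {
      new_label[a[k]] <- min(new_label[a[k]], m[k])
      new_label[b[k]] <- min(new_label[b[k]], m[k])
    }
    if (identical(new_label, label)) break
    label <- new_label
  }
  unname(split(ids, label))              # one vector of IDs per group
}

groups <- connected_groups(edges$ID, edges$ID2)
print(groups)
```

For tables with many thousands of rows, the igraph package (graph_from_data_frame() followed by components()) is a common alternative to a hand-written loop.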

--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-aggregate-combinations-tp4631867.html


Re: [R] Sampling problems

2012-03-08 Thread Oritteropus
Hi, thank you, but it works for vectors and matrices, not for data frames. It
gives me this error message:

MeanA <- read.csv("MeanAmf.csv", header=TRUE)
mysample <- MeanA[sample(1:nrow(MeanA), 20, replace=FALSE),]
remainder <- MeanA[-mysample]
Error in `[.default`(MeanA, -mysample) : invalid subscript type 'list'
In Ops.factor(left) : - not meaningful for factors

Any other way?
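
The error appears to come from negating mysample, which is a data frame: the minus is applied to every column (hence the factor warning), and the result is a list rather than a set of row numbers. A sketch of one way around it is to keep the sampled row numbers in their own variable and use them both to take the sample and to drop it. The data below are a made-up stand-in for MeanA, and the name idx is an assumption.

```r
set.seed(42)  # assumption: fixed seed only so the sketch is reproducible

# Hypothetical stand-in for the poster's MeanA data frame (includes a factor).
MeanA <- data.frame(value = rnorm(30),
                    Loc   = factor(sample(c("A", "T"), 30, replace = TRUE)))

idx       <- sample(nrow(MeanA), 20)   # row numbers only
mysample  <- MeanA[idx, ]              # the 20 sampled rows, data included
remainder <- MeanA[-idx, ]             # the 10 rows that were not sampled

print(dim(mysample))
print(dim(remainder))
```

Negative subscripts work on the row numbers in idx, so factor columns never enter the arithmetic.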

--
View this message in context: 
http://r.789695.n4.nabble.com/Sampling-problems-tp4453752p4455912.html


Re: [R] Sampling problems

2012-03-08 Thread Oritteropus
Hi Sarah, it is not clear to me how to do that. Can you show me, please?

Imagine I have a situation like this:

MeanA <- read.csv("MeanAmf.csv", header=TRUE)
mysample <- MeanA[sample(1:nrow(MeanA), 20, replace=FALSE),]

Then?


--
View this message in context: 
http://r.789695.n4.nabble.com/Sampling-problems-tp4453752p4455921.html


Re: [R] Sampling problems

2012-03-08 Thread Oritteropus
Thanks, but it doesn't work either; it gives me the same error message.
It works only if my first sample is taken in this way:

mysample <- sample(1:nrow(MeanA), 20, replace=FALSE)

However, this way it samples only the row numbers:
 [1] 71 24 12 36  2 39 69 62 43 38  9 44 13 54 50 63 67 66 37 28

but not the data inside.  I need to sample in this way:

mysample <- MeanA[sample(1:nrow(MeanA), 20, replace=FALSE),]

to get a sample like this

HRkmMean.mf Mean.mfm Loc Diet Terr Soc Type Soc.Ter W.cat.0.25 W.cat.0.5
-2.49 -0.43 2.57 A OT S D TS b
23 -2.05 0.67 T CN SD NS A

This is an example of my dataframe

--
View this message in context: 
http://r.789695.n4.nabble.com/Sampling-problems-tp4453752p4456048.html


[R] Simple solution

2012-03-08 Thread Oritteropus
Hi everybody,
Thank you all for your suggestions, you have been very helpful. 
However, in the end I solved it this way:

mysample <- MaxDH[sample(1:nrow(MaxDH), 150, replace=FALSE),]
A <- mysample[1:120,]
B <- mysample[121:150,]

So simple in the end...

Best,

Luca

--
View this message in context: 
http://r.789695.n4.nabble.com/Sampling-problems-tp4453752p4456469.html


[R] Sampling problems

2012-03-07 Thread Oritteropus
Hi,
I need to randomly sample my dataset 1000 times. Each sample needs to be 80%
of the data. I know how to do that; my problem is that I need not only the
80% but also the corresponding 20% each time. Is there any way to do that?
Alternatively, I was thinking of something like the setdiff() function to
compare my 80% sample to the original dataset and obtain the corresponding
20%. Unfortunately, setdiff works only for vectors. Do you know a similar
function for data frames?
Thanks
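
A sketch of one way to get both parts each time: sample row numbers rather than rows, so the 80% is the data indexed by those numbers and the 20% is the data with them dropped. The data frame dat, the names n_rep and splits, and the fixed seed are assumptions for illustration, not the poster's data or code.

```r
set.seed(1)  # assumption: fixed seed only so the sketch is reproducible

# Hypothetical stand-in for the poster's dataset.
dat <- data.frame(id = 1:50, x = rnorm(50))

n_rep  <- 1000
n_keep <- round(0.8 * nrow(dat))       # size of the 80% part

# Each element holds one 80% sample and its complementary 20%.
splits <- lapply(seq_len(n_rep), function(i) {
  idx <- sample(nrow(dat), n_keep)     # row numbers for the 80% part
  list(part80 = dat[idx, , drop = FALSE],
       part20 = dat[-idx, , drop = FALSE])
})

print(dim(splits[[1]]$part80))
print(dim(splits[[1]]$part20))
```

Because the two parts come from the same set of row numbers, they never overlap and together always cover the whole dataset, so no setdiff-style comparison of data frames is needed.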

--
View this message in context: 
http://r.789695.n4.nabble.com/Sampling-problems-tp4453752p4453752.html