[R] Model Selection and Data

2005-03-28 Thread Melanie Vida
Hi All, Are there times when random forests should not be used as a classification or regression model for determining variable importance? If so, then is it the properties of the training data that cause the issue? Are there other classification and regression models better suited for parti

[R] Gini's Importance Value Variable = Inf

2005-03-23 Thread Melanie Vida
Hi All, In the script below, the importance measure for column 4 (i.e., MeanDecreaseGini) indicated "Inf" for V7. Running the getTree command showed that "V7" had been selected at least twice in one of the trees for Random Forest. So the "Inf" value was not generated as a result of dividing the s

[R] Question on class 1, 2 output for RandomForest

2005-03-23 Thread Melanie Vida
Hi All, I read the R-newsletter Volume 2/3, December 2002 on page 18. I tried the example there, too. Then, I used a different data set with randomForest from the UCI repository. The results for the "credit" data generated 2 additional columns, column "1" and a column "2" that the example given

[R] Error: Can not handle categorical predictors with more than 32 categories.

2005-03-22 Thread Melanie Vida
Hi All, My question concerns an error generated when using randomForest in R. Is there a special way to format the data in order to avoid this error, or am I completely confused about what the error implies? "Error in randomForest.default(m, y, ...) : Can not handle categorical predi

Re: [R] Writing to a file

2005-03-07 Thread Melanie Vida
.dat") Andy From: Melanie Vida Here is a simple question. Is there a quicker way to write to a file several rows of data at a time rather than one line at a time? How can the code below be optimized to write several rows at a time to a file rather than one line at a time? See my slow method

[R] Writing to a file

2005-03-07 Thread Melanie Vida
Here is a simple question. Is there a quicker way to write to a file several rows of data at a time rather than one line at a time? How can the code below be optimized to write several rows at a time to a file rather than one line at a time? See my slow method of write.table below:

RE: [R] Temporal Analysis of variable x; How to select the outlier threshold in R?

2005-03-01 Thread Melanie Vida
h gave a p-value << 0.05. In order to select the outlier threshold, I ended up using the following: outlier_threshold <- quantile(x, 3/4) + 1.5 * IQR(x) -Melanie
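The threshold quoted in this reply is the upper fence of the usual boxplot (Tukey) rule: the third quartile plus 1.5 times the interquartile range. A minimal sketch of how it might be applied is shown here; the vector `x` is made-up illustrative data, not the poster's financial series, and the variable names `q3` and `outliers` are assumptions added for the example.

```r
# Illustrative data only (assumed); the original financial series is not shown in the thread.
x <- c(10, 12, 11, 13, 12, 14, 11, 95)

# Third quartile, using R's default quantile algorithm (type 7).
q3 <- quantile(x, 3/4, names = FALSE)

# Upper fence: Q3 + 1.5 * IQR, as in the reply.
outlier_threshold <- q3 + 1.5 * IQR(x)

# Observations above the fence are candidate outliers.
outliers <- x[x > outlier_threshold]
```

Note that this flags only unusually large values; a symmetric lower fence would be quantile(x, 1/4) - 1.5 * IQR(x). Whether the 1.5 multiplier suits heavy-tailed financial data is a judgment call the thread leaves open.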

[R] Temporal Analysis of variable x; How to select the outlier threshold in R?

2005-02-25 Thread Melanie Vida
For a financial data set with large variance, I'm trying to find the outlier threshold of one variable "x" over a two-year period. I ran qqplot(x2001, x2002) and found a normal distribution. The latter part of the normal distribution did not look linear though. Is there a suitable method in R to fi

[R] outlier threshold

2005-02-25 Thread Melanie Vida
For the analysis of financial data with a large variance, what is the best way to select an outlier threshold? Listed below, is there a best method to select an outlier threshold and how does R calculate it? In R, how do you find the outlier threshold through an interquartile range? In R, how d