Re: [R] Fw: Logistic regression - Interpreting (SENS) and (SPEC)

2008-10-13 Thread Pedro.Rodriguez
Hi Maithili, There are two good papers that illustrate how to compare classifiers using Sensitivity and Specificity and their extensions (e.g., likelihood ratios, Youden index, KL distance, etc.). See: 1) Biggerstaff, Brad, 2000, Comparing diagnostic tests: a simple graphic using likelihood
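
For readers unfamiliar with the quantities named in this post, here is a minimal sketch of how sensitivity, specificity, the likelihood ratios and the Youden index follow from a 2x2 table. The counts are made up for illustration and do not come from the thread or the cited papers.

```r
# Hypothetical 2x2 counts for a diagnostic test (made-up numbers).
tp <- 90; fn <- 10   # diseased subjects: test positive / test negative
fp <- 20; tn <- 80   # healthy subjects:  test positive / test negative

sens   <- tp / (tp + fn)        # sensitivity: P(test + | diseased)
spec   <- tn / (tn + fp)        # specificity: P(test - | healthy)
lr_pos <- sens / (1 - spec)     # positive likelihood ratio
lr_neg <- (1 - sens) / spec     # negative likelihood ratio
youden <- sens + spec - 1       # Youden index J

print(c(sens = sens, spec = spec, lr_pos = lr_pos,
        lr_neg = lr_neg, youden = youden))
```

Two tests can then be compared on the (lr_pos, lr_neg) pair rather than on sensitivity or specificity alone.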

Re: [R] Dump decision trees of randomForest object

2008-10-09 Thread Pedro.Rodriguez
Hi Chris, Maybe it is easier if you try the following C++ library http://mtv.ece.ucsb.edu/benlee/librf.html Regards, Pedro -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Christian Sturz Sent: Thursday, October 09, 2008 4:30 PM To: Liaw, Andy;

Re: [R] How to validate model?

2008-10-07 Thread Pedro.Rodriguez
Hi Frank, Thanks for your feedback! But I think we are talking about two different things. 1) Validation: The generalization performance of the classifier. See, for example, Studies on the Validation of Internal Rating Systems by BIS. 2) Calibration: Correct calibration of a PD rating system

Re: [R] How to validate model?

2008-10-07 Thread Pedro.Rodriguez
Hi, Yes, in my humble opinion, it doesn't make any sense to use the (2-class) ROC curve for a rating system. For example, if the classifier predicts 100% for all the defaulted exposures and 0% for the good clients, then even though we have a perfect classifier we have a bad rating system.
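
The example in this post can be sketched in a few lines of R. The data and the helper `auc()` are hypothetical; the helper computes the area under the ROC curve as the share of (defaulted, good) pairs that are ranked correctly, counting ties as one half.

```r
# AUC as the proportion of (positive, negative) pairs ranked correctly.
auc <- function(score, label) {
  pos <- score[label == 1]
  neg <- score[label == 0]
  mean(outer(pos, neg, ">") + 0.5 * outer(pos, neg, "=="))
}

# Hypothetical portfolio: 5 defaults, 95 good clients.
label <- c(rep(1, 5), rep(0, 95))
# Classifier that predicts 100% for defaults and 0% for good clients.
score <- ifelse(label == 1, 1, 0)

perfect_auc <- auc(score, label)
n_grades    <- length(unique(score))
print(c(auc = perfect_auc, distinct_scores = n_grades))
```

The AUC is at its maximum, yet the scores take only two distinct values, so they cannot be mapped onto a graded scale of default probabilities, which seems to be the point being made.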

Re: [R] random normally distributed values within range

2008-10-06 Thread Pedro.Rodriguez
Hi Achaz, Maybe you are interested in the generalized beta distribution? To the best of my knowledge, there is no way to restrict the values of normal deviates, since one may end up with a different distribution. Regards, Pedro
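
One way to read the suggestion is to rescale standard beta deviates onto the desired interval. The sketch below assumes that reading; the helper name `rbeta_range()` and the shape parameters are illustrative and not taken from the thread.

```r
set.seed(1)

# Draw beta deviates on [0, 1] and rescale them to [lower, upper].
# Equal shape parameters give a symmetric, bell-like density.
rbeta_range <- function(n, shape1, shape2, lower, upper) {
  lower + (upper - lower) * rbeta(n, shape1, shape2)
}

x <- rbeta_range(1000, shape1 = 4, shape2 = 4, lower = 10, upper = 20)
print(range(x))   # all values lie inside [10, 20]
print(mean(x))    # close to the midpoint of the interval
```

Larger equal shape values concentrate the draws more tightly around the midpoint.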

Re: [R] Bias in sample - Logistic Regression

2008-10-02 Thread Pedro.Rodriguez
Hi Shiva, Maybe you are interested in the following paper: Learning when Training Data are Costly: The Effect of Class Distribution on Tree Induction. G. Weiss and F. Provost. Journal of Artificial Intelligence Research 19 (2003) 315-354. For validating the models in those environments,

Re: [R] Logistic regression problem

2008-10-01 Thread Pedro.Rodriguez
Hi Bernardo, Do you have to use logistic regression? If not, try Random Forests... It has worked for me in past situations when I have to analyze huge datasets. Some want to understand the DGP with a simple linear equation; others want high generalization power. It is your call... See, e.g.,

Re: [R] ROC curve from logistic regression

2008-09-08 Thread Pedro.Rodriguez
Hi, Try the following reference: "Comparison of Three Methods for Estimating the Standard Error of the Area under the Curve in ROC Analysis of Quantitative Data" by Hajian-Tilaki and Hanley, Academic Radiology, Vol. 9, No. 11, November 2002. Below is a simple implementation that will return both the
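
The implementation from the original post is cut off in this listing. As a stand-in, here is a minimal sketch that returns the AUC together with a standard error from the Hanley and McNeil (1982) closed-form approximation. It is not necessarily the method the author posted; the function name `auc_se()` and the simulated data are assumptions.

```r
# AUC via pairwise comparison, with the Hanley-McNeil (1982) standard error.
auc_se <- function(score, label) {
  pos <- score[label == 1]
  neg <- score[label == 0]
  n_pos <- length(pos)
  n_neg <- length(neg)
  # Share of (positive, negative) pairs ranked correctly; ties count one half.
  a  <- mean(outer(pos, neg, ">") + 0.5 * outer(pos, neg, "=="))
  q1 <- a / (2 - a)
  q2 <- 2 * a^2 / (1 + a)
  se <- sqrt((a * (1 - a) +
              (n_pos - 1) * (q1 - a^2) +
              (n_neg - 1) * (q2 - a^2)) / (n_pos * n_neg))
  c(auc = a, se = se)
}

# Simulated example: scores are shifted upward for the positive class.
set.seed(42)
label <- rbinom(200, 1, 0.3)
score <- rnorm(200, mean = label)
res <- auc_se(score, label)
print(res)
```

In a real analysis `score` would be the fitted probabilities from the logistic regression, e.g. `fitted(fit)` for a `glm` object.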

Re: [R] Interpolation Problems

2008-09-02 Thread Pedro.Rodriguez
Hi Steve, It could be the case that you are trying to find values that are not in the range of values you are providing. For example:

  x <- c(1,2,3,4,5)
  y <- c(10,11,12,13,14)
  xout <- c(0.01,0.02)
  approx(x,y,xout,method="linear")

R's output:

  $x
  [1] 0.01 0.02

  $y
  [1] NA NA

If you want to see the
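
The rest of this post is cut off here. As a hedged follow-up, the sketch below contrasts interpolation outside and inside the range of `x`, and shows the `rule = 2` argument of `approx()`, which returns the value at the nearest data extreme instead of `NA`. Whether the original post went on to suggest this is not known.

```r
x <- c(1, 2, 3, 4, 5)
y <- c(10, 11, 12, 13, 14)

# Outside the range of x: the default rule = 1 yields NA.
out_of_range <- approx(x, y, xout = c(0.01, 0.02), method = "linear")

# Inside the range of x: ordinary linear interpolation.
in_range <- approx(x, y, xout = c(1.5, 4.25), method = "linear")

# rule = 2: points outside the range get the value at the closest extreme.
clamped <- approx(x, y, xout = c(0.01, 0.02), method = "linear", rule = 2)

print(out_of_range$y)
print(in_range$y)
print(clamped$y)
```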

[R] Import GAUSS .FMT files

2007-12-18 Thread Pedro.Rodriguez
Dear All, Is it possible to import GAUSS .FMT files into R? Thanks for your time. Kind Regards, Pedro N. Rodriguez

[R] Simulate an AR(1) process via distributions? (without specifying a model specification)

2007-11-28 Thread Pedro.Rodriguez
Dear All, Is it possible to simulate an AR(1) process via a distribution? I have simulated an AR(1) process the usual way (that is, using a model specification and using the random deviates in the error), and used the generated time series to estimate 3- and 4-parameter distributions (for
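
For context, the "usual way" mentioned in this post can be sketched as follows: an explicit model recursion driven by normal errors. The coefficient `phi = 0.7` and the series length are assumed values, not the author's. The sketch does not answer the question of simulating directly from a fitted distribution.

```r
set.seed(123)
n   <- 500
phi <- 0.7          # assumed AR(1) coefficient
eps <- rnorm(n)     # random deviates used as the error term

# Explicit recursion: y[t] = phi * y[t - 1] + eps[t], started at y[1] = eps[1].
y <- numeric(n)
y[1] <- eps[1]
for (t in 2:n) {
  y[t] <- phi * y[t - 1] + eps[t]
}

# The same series from a recursive filter, without the loop.
y2 <- as.numeric(stats::filter(eps, filter = phi, method = "recursive"))

# Lag-1 sample autocorrelation, which should be near phi.
print(acf(y, plot = FALSE)$acf[2])
```

`arima.sim()` offers a further alternative that also handles burn-in.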

[R] Factorial, L-moments, and overflows

2007-09-16 Thread Pedro.Rodriguez
Hi everyone, In the package POT, there is a function that computes the L-moments of a given sample (samlmu). However, to compute those L-moments, one needs to obtain the total number of combinations between two numbers, which, by the way, requires the use of a factorial. See, for example,
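
The overflow described in this post can be reproduced in base R without the POT package. The sketch below uses made-up values of `n` and `k`; it shows the factorial-based formula breaking down and the base functions `choose()` and `lchoose()` giving a finite result for the same combination count.

```r
n <- 200
k <- 100

# Factorial-based formula: factorial(200) exceeds the double range,
# so the ratio is not a finite number.
naive <- suppressWarnings(factorial(n) / (factorial(k) * factorial(n - k)))

# choose() computes the binomial coefficient without forming the factorials.
direct <- choose(n, k)

# lchoose() works on the log scale, which helps when even choose() would overflow.
log_scale <- lchoose(n, k)

print(is.finite(naive))
print(direct)
print(log_scale)
```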