Hi Maithili,
There are two good papers that illustrate how to compare classifiers
using sensitivity and specificity and their extensions (e.g., likelihood
ratios, the Youden index, KL distance, etc.).
See:
1) Biggerstaff, Brad, 2000, "Comparing diagnostic tests: a simple
graphic using likelihood ratio
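
For readers who want the quantities named above in concrete form, here is a minimal R sketch. The confusion-matrix counts are made up for illustration and are not taken from either paper; the formulas are the standard textbook definitions.

```r
# Hypothetical 2x2 confusion counts (illustrative numbers only)
tp <- 80; fn <- 20   # positive cases: correctly / incorrectly classified
tn <- 90; fp <- 10   # negative cases: correctly / incorrectly classified

sens   <- tp / (tp + fn)       # sensitivity (true positive rate)
spec   <- tn / (tn + fp)       # specificity (true negative rate)
lr_pos <- sens / (1 - spec)    # positive likelihood ratio
lr_neg <- (1 - sens) / spec    # negative likelihood ratio
youden <- sens + spec - 1      # Youden index

print(c(sens = sens, spec = spec, LRpos = lr_pos, LRneg = lr_neg, J = youden))
```

Two classifiers can then be compared on the pair (lr_pos, lr_neg) rather than on accuracy alone.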
Hi Chris,
Maybe it is easier if you try the following C++ library
http://mtv.ece.ucsb.edu/benlee/librf.html
Regards,
Pedro
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Christian Sturz
Sent: Thursday, October 09, 2008 4:30 PM
To: Liaw, Andy; r-he
Hi,
Yes, in my humble opinion, it doesn't make any sense to use the (2-class) ROC
curve for a rating system. For example, if the classifier predicts 100% for all
the defaulted exposures and 0% for the good clients, then even though we have a
perfect classifier we have a bad rating system.
Ho
Hi Frank,
Thanks for your feedback! But I think we are talking about two different
things.
1) Validation: The generalization performance of the classifier. See,
for example, "Studies on the Validation of Internal Rating Systems" by
BIS.
2) Calibration: Correct calibration of a PD rating system
Usually one validates scorecards with the ROC curve, Pietra Index, KS
test, etc. You may be interested in the WP 14 from BIS (www.bis.org).
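
A rough sketch of two of these validation measures in base R follows. The scores are simulated, the group sizes and means are arbitrary assumptions, and the AUC is obtained from the Mann-Whitney statistic rather than from any method prescribed in the BIS paper.

```r
set.seed(1)
# Hypothetical scorecard output: higher score = riskier client
score_good <- rnorm(500, mean = 0.3, sd = 0.1)   # non-defaulted clients
score_bad  <- rnorm(100, mean = 0.5, sd = 0.1)   # defaulted clients

# Two-sample Kolmogorov-Smirnov statistic: maximum distance between
# the empirical score distributions of the two groups
ks_stat <- unname(ks.test(score_bad, score_good)$statistic)

# Area under the ROC curve via the Mann-Whitney statistic:
# the share of (bad, good) pairs in which the bad client scores higher
w   <- unname(wilcox.test(score_bad, score_good)$statistic)
auc <- w / (length(score_bad) * length(score_good))

print(c(KS = ks_stat, AUC = auc))
```

Both numbers measure discrimination only; they say nothing about whether the predicted default probabilities are correctly calibrated.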
Regards,
Pedro
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Maithili Shiva
Sent: Tuesday, October 07, 2008 8:22
Hi Achaz,
Maybe you are interested in the generalized beta distribution?
To the best of my knowledge, there is no way to restrict the values of
normal deviates and still have normal deviates: once the range is
restricted, one ends up with a different (truncated) distribution.
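
To make the point concrete, here is a small sketch that restricts standard normal deviates to an interval by inverse-CDF sampling. The bounds are arbitrary assumptions; the result is a truncated normal, whose mean and standard deviation differ from those of the original N(0, 1).

```r
set.seed(42)
lower <- -1; upper <- 2    # hypothetical bounds, chosen for illustration
n <- 10000

# Draw uniforms on [pnorm(lower), pnorm(upper)] and map them back
# through the normal quantile function
u <- runif(n, min = pnorm(lower), max = pnorm(upper))
x <- qnorm(u)

print(range(x))            # all values lie inside [lower, upper]
print(c(mean = mean(x), sd = sd(x)))   # no longer mean 0, sd 1
```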
Regards,
Pedro
-Original Message-
From: [EMAIL PROTECTED] [mailto:[
Hi Shiva,
Maybe you are interested in the following paper:
Learning when Training Data are Costly: The Effect of Class Distribution
on Tree Induction. G. Weiss and F. Provost. Journal of Artificial
Intelligence Research 19 (2003) 315-354.
For validating the models in those environments,
Willia
Hi Bernardo,
Do you have to use logistic regression? If not, try Random Forests... It has
worked for me in past situations when I have to analyze huge datasets.
Some want to understand the data-generating process (DGP) with a simple linear
equation; others want high generalization power. It is your call... See, e.g.,
Hi
Try the following reference:
Comparison of Three Methods for Estimating the
Standard Error of the Area under the Curve in ROC
Analysis of Quantitative Data by Hajian-Tilaki and Hanley, Academic
Radiology, Vol 9, No 11, November 2002.
Below is a simple implementation that will return both the
Hi Steve,
It could be the case that you are trying to find values that are not in
the range of values you are providing.
For example,
x <- c(1,2,3,4,5)
y <- c(10,11,12,13,14)
xout <- c(0.01,0.02)
approx(x,y,xout,method="linear")
R's output:
$x
[1] 0.01 0.02
$y
[1] NA NA
If you want to see th
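
As a possible follow-up to the example above (a sketch, not necessarily what the original reply went on to suggest), one can first check which requested points fall outside the data range, and then ask approx() to return the value at the nearest data extreme through its rule argument:

```r
x <- c(1, 2, 3, 4, 5)
y <- c(10, 11, 12, 13, 14)
xout <- c(0.01, 0.02)

# Flag the requested points that lie outside the range of x
outside <- xout < min(x) | xout > max(x)
print(outside)

# rule = 2 returns the value at the closest data extreme instead of NA
res <- approx(x, y, xout, method = "linear", rule = 2)
print(res$y)
```

Note that rule = 2 does not extrapolate the line; it simply repeats the boundary value.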
Hi Yasir,
Try the following reference:
A heuristic approach for the generation of multivariate random samples
with specific marginal distributions and correlation matrix, Dimos C.
Charmpis and Panayiota L. Panteli, Computational Statistics 19, 283-300,
2004.
I have the R code, please let me know
Hi Ben,
Try the following reference:
Implementing Statistical Criteria To Select Return Forecasting Models:
What do We Learn? By Peter Bossaerts and Pierre Hillion, Review of
Financial Studies, Vol. 12, No. 2.
I have created an R function which implements Bossaerts and Hillion's
methodologies. I
Dear All,
Is it possible to import GAUSS .FMT files into R?
Thanks for your time.
Kind Regards,
Pedro N. Rodriguez
Thanks Prof. Ripley.
My apologies for not including the code.
Below I illustrate my point using the GLD package.
Thank you very much for your time.
Kind Regards,
Pedro N. Rodriguez
# Code begins
# Simulate an ar(1) process
# x = 0.05 + 0.64*x(t-1) + e
# Create the vector x
x
Dear All,
Is it possible to simulate an AR(1) process via a distribution?
I have simulated an AR(1) process the usual way (that is, using a model
specification and using the random deviates in the error), and used the
generated time series to estimate 3- and 4-parameter distributions (for
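
For reference, the "usual way" mentioned above might look like the sketch below. It assumes Gaussian errors, a sample size of 500, and the intercept 0.05 and coefficient 0.64 that appear in the code comments of the related message; none of these choices is confirmed as the poster's exact setup.

```r
set.seed(123)
n    <- 500     # assumed sample size
phi0 <- 0.05    # intercept
phi1 <- 0.64    # autoregressive coefficient

e <- rnorm(n)                 # assumed N(0, 1) innovations
x <- numeric(n)
x[1] <- phi0 / (1 - phi1) + e[1]   # start at the unconditional mean plus noise
for (t in 2:n) {
  x[t] <- phi0 + phi1 * x[t - 1] + e[t]
}

# Recover the coefficient; note that arima() reports the process mean
# under the name "intercept", not phi0 itself
fit <- arima(x, order = c(1, 0, 0))
print(coef(fit))
```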
Hi everyone,
In the package POT, there is a function that computes the L-moments of a given
sample (samlmu). However, to compute those L-moments, one needs binomial
coefficients (the number of combinations of n items taken k at a time), which,
by the way, requires the use of a factorial. See, for example, Hoski
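
A brief sketch of why the factorial route can be troublesome, and of two base R alternatives. The values of n and k are arbitrary, and nothing here is taken from the POT source code.

```r
n <- 200; k <- 100   # illustrative values

# Naive formula: factorial(200) overflows to Inf in double precision,
# so the ratio is not a usable number
naive <- suppressWarnings(factorial(n) / (factorial(k) * factorial(n - k)))

# choose() computes the binomial coefficient without forming the factorials
stable <- choose(n, k)

# Equivalent computation on the log scale with lgamma()
via_logs <- exp(lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1))

print(c(naive = naive, stable = stable, via_logs = via_logs))
```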