Re: [R] Fw: Logistic regression - Interpreting (SENS) and (SPEC)
Hi Maithili,

There are two good papers that illustrate how to compare classifiers using sensitivity and specificity and their extensions (e.g., likelihood ratios, the Youden index, the Kullback-Leibler distance, etc.). See:

1) Biggerstaff, Brad, 2000, Comparing diagnostic tests: a simple graphic using likelihood ratios, Statistics in Medicine, 19:649-663.
2) Lee, Wen-Chung, 1999, Selecting diagnostic tests for ruling out or ruling in disease: the use of the Kullback-Leibler distance, International Journal of Epidemiology, 28:521-525.

Please let me know if you have problems finding the aforementioned papers.

Kind Regards,
Pedro

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Maithili Shiva
Sent: Monday, October 13, 2008 3:28 AM
To: r-help@r-project.org
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: [R] Fw: Logistic regression - Interpreting (SENS) and (SPEC)

Dear Mr Peter Dalgaard and Mr Dieter Menne,

I sincerely thank you for helping me out with my problem. The thing is that I have already calculated SENS = Gg / (Gg + Bg) = 89.97% and SPEC = Bb / (Bb + Gb) = 74.38%. Now I have values of SENS and SPEC, which are absolute in nature. My question was how do I interpret these absolute values. How do these values help me find out whether my model is good?

With regards,
Ms Maithili Shiva

Subject: [R] Logistic regression - Interpreting (SENS) and (SPEC)
To: r-help@r-project.org
Date: Friday, October 10, 2008, 5:54 AM

Hi,

I am working on a credit scoring model using logistic regression. I have a main sample of 42500 clients and, based on their status as defaulted / non-defaulted, I have generated the probability of default. I have a hold-out sample of 5000 clients. I have calculated (1) the number of correctly classified goods (Gg), (2) the number of correctly classified bads (Bb), (3) the number of wrongly classified bads (Gb), and (4) the number of wrongly classified goods (Bg). My problem is how to interpret these results? What I have arrived at are the absolute figures.
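To make the interpretation concrete, here is a minimal R sketch that turns SENS and SPEC into the Youden index and the likelihood ratios discussed in the two papers above. The confusion-matrix counts are made up to roughly match the poster's percentages; the true hold-out counts are not shown in the thread.

```r
# Illustrative (hypothetical) counts in the poster's notation:
# Gg/Bg = goods classified good/bad, Bb/Gb = bads classified bad/good
Gg <- 4049; Bg <- 451
Bb <- 372;  Gb <- 128

sens <- Gg / (Gg + Bg)        # sensitivity, approx. 0.8998
spec <- Bb / (Bb + Gb)        # specificity, approx. 0.7440
youden <- sens + spec - 1     # Youden index: 0 = coin toss, 1 = perfect
lr_pos <- sens / (1 - spec)   # LR+: how much a "good" call raises the odds
lr_neg <- (1 - sens) / spec   # LR-: how much a "bad" call lowers them
c(youden = youden, lr_pos = lr_pos, lr_neg = lr_neg)
```

A Youden index well above 0 and an LR+ well above 1 indicate that the model discriminates much better than chance; the absolute SENS/SPEC values become interpretable once converted into such summaries.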
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Dump decision trees of randomForest object
Hi Chris,

Maybe it is easier if you try the following C++ library: http://mtv.ece.ucsb.edu/benlee/librf.html

Regards,
Pedro

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Christian Sturz
Sent: Thursday, October 09, 2008 4:30 PM
To: Liaw, Andy; r-help@r-project.org
Subject: Re: [R] Dump decision trees of randomForest object

I've tried the getTree() function and printed a decision tree with print(). However, it seems to me that it's hard to parse this representation and translate it into equivalent if-then-else C constructs. Are there no other ways to dump the trees in a more hierarchical form? What exactly do you mean by the prediction in the source package?

Maybe what I wanted to ask goes in the same direction: let's say I've learned a random forest model from a learning set. Now I would like to use it in the future as a classifier to predict new examples. How can this be done? Can I save a learned model and then invoke R with new examples and apply them to the saved model without training the random forest from scratch again? If so, please give me some hints on how to do that.

Regards,
Chris

-----Original Message-----
Date: Thu, 9 Oct 2008 14:38:44 -0400
From: Liaw, Andy [EMAIL PROTECTED]
To: Christian Sturz [EMAIL PROTECTED], r-help@r-project.org
Subject: RE: [R] Dump decision trees of randomForest object

See the getTree() function in the package. Also, the source package contains C code that does the prediction, which you may be able to work from.

Andy

From: Christian Sturz

Hi,

I'm using the package randomForest to generate a classifier for the exemplary iris data set:

data(iris)
iris.rf <- randomForest(Species ~ ., iris)

Is it possible to print all decision trees in the generated forest? If so, can the trees also be written to disk? What I actually need is to translate the decision trees in a random forest into equivalent C++ if-then-else constructs to integrate them into a C++ project.
Has this been done before, and are there already any implemented approaches/parsers for that?

Cheers,
Chris
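On the save-and-reuse question: a trained randomForest object can be persisted and applied to new data in a later session without retraining. A minimal sketch (the file name is illustrative):

```r
library(randomForest)
data(iris)
iris.rf <- randomForest(Species ~ ., iris)

# Persist the trained model to disk ...
saveRDS(iris.rf, "iris_rf.rds")

# ... and in a later R session, load it and predict without retraining:
rf2 <- readRDS("iris_rf.rds")
pred <- predict(rf2, newdata = iris[1:5, ])
```

save()/load() work equally well; saveRDS/readRDS just avoid restoring into a fixed variable name.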
Re: [R] How to validate model?
Hi Frank,

Thanks for your feedback! But I think we are talking about two different things.

1) Validation: the generalization performance of the classifier. See, for example, Studies on the Validation of Internal Rating Systems by the BIS.
2) Calibration: correct calibration of a PD rating system means that the calibrated PD estimates are accurate and conform to the observed default rates. See, for instance, An Overview and Framework for PD Backtesting and Benchmarking, by Castermans et al.

Frank, you are referring to #1 and I am referring to #2. Nonetheless, I would never create a rating system if my model doesn't discriminate better than a coin toss.

Regards,
Pedro

-----Original Message-----
From: Frank E Harrell Jr [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, October 07, 2008 11:02 AM
To: Rodriguez, Pedro
Cc: [EMAIL PROTECTED]; r-help@r-project.org
Subject: Re: [R] How to validate model?

[EMAIL PROTECTED] wrote:

Usually one validates scorecards with the ROC curve, Pietra index, KS test, etc. You may be interested in WP 14 from the BIS (www.bis.org).
Regards, Pedro

No, the validation should be done using an absolute reliability (calibration) curve. You need to verify that at all levels of predicted risk there is agreement with the true probability of failure. An ROC curve does not do that, and I doubt the others do. A resampling-corrected loess calibration curve is a good approach, as implemented in the Design package's calibrate function.

Frank

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Maithili Shiva
Sent: Tuesday, October 07, 2008 8:22 AM
To: r-help@r-project.org
Subject: [R] How to validate model?

Hi! I am working on a scorecard model and I have arrived at the regression equation. I have used logistic regression in R. My question is how do I validate this model? I do have a hold-out sample of 5000 customers. Please guide me. The problem is that I have never used logistic regression before, nor am I used to credit scoring models.
Thanks in advance,
Maithili

--
Frank E Harrell Jr
Professor and Chair
Department of Biostatistics
School of Medicine, Vanderbilt University
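A minimal sketch of the resampling-corrected calibration curve Frank describes, using lrm() and calibrate() (the Design package is distributed as rms in current R; the simulated data and sample size are illustrative):

```r
library(rms)                          # successor to the Design package
set.seed(1)
n <- 1000
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-1 + x))     # simulated binary outcome

fit <- lrm(y ~ x, x = TRUE, y = TRUE) # keep design matrix for resampling
cal <- calibrate(fit, B = 100)        # bootstrap-corrected calibration curve
plot(cal)                             # predicted vs. actual probability
```

The plot shows predicted probability against observed probability; a well-calibrated model tracks the 45-degree line after the optimism correction.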
Re: [R] How to validate model?
Hi,

Yes, in my humble opinion, it doesn't make any sense to use the (2-class) ROC curve for a rating system. For example, if the classifier predicts 100% for all the defaulted exposures and 0% for the good clients, then even though we have a perfect classifier we have a bad rating system. However, if we use the multi-class version of Hand and Till (2001), we may test how well the model discriminates between classes or ratings.

Hand, David J. and Robert J. Till, A Simple Generalisation of the Area Under the ROC Curve for Multiple Class Classification Problems, Machine Learning, Vol. 45, No. 2, (November 2001), pp. 171-186.

Regards,
Pedro

-----Original Message-----
From: Ajay ohri [mailto:[EMAIL PROTECTED]]
Sent: Tue 10/7/2008 6:46 PM
To: Frank E Harrell Jr
Cc: Rodriguez, Pedro; r-help@r-project.org
Subject: Re: [R] How to validate model?

the purpose of validating indirect measures such as ROC curves. Biggest purpose: it is useful in more marketing/sales meeting contexts ;) Also, decile-specific performance is easy to explain and to monitor for faster execution/re-modeling.

Regards,
Ajay

On Wed, Oct 8, 2008 at 4:01 AM, Frank E Harrell Jr [EMAIL PROTECTED] wrote:

Ajay ohri wrote:

This is an approach: run the model variables on a hold-out sample. Check and compare ROC curves between build and validation datasets. Check for changes in parameter estimates (coefficients of variables), p-values and signs. Check for binning (response versus deciles of individual variables). Check concordance and the KS statistic. A decile-wise performance of the model in terms of predicted versus actual, and the rank ordering of deciles, helps in explaining the model to a business audience, who generally have some business-specific input that may require the scoring model to be tweaked. This assumes multicollinearity, outliers and missing-value treatment have already been handled, and the holdout sample checks for overfitting. You can always rebuild the model using a different random holdout sample.
A stable model would not change too much. In actual implementation, try to build real-time triggers for deviations (%) between predicted and actual.

Regards,
Ajay

I wouldn't recommend that approach, but legitimate differences of opinion exist on the subject. In particular I fail to see the purpose of validating indirect measures such as ROC curves.

Frank

www.decisionstats.com

On Wed, Oct 8, 2008 at 1:33 AM, Frank E Harrell Jr [EMAIL PROTECTED] wrote:

[EMAIL PROTECTED] wrote:

Hi Frank, Thanks for your feedback! But I think we are talking about two different things. 1) Validation: the generalization performance of the classifier. See, for example, Studies on the Validation of Internal Rating Systems by the BIS.

I didn't think the desire was for a classifier but instead was for a risk predictor. If prediction is the goal, classification methods or accuracy indexes based on classifications do not work very well.

2) Calibration: correct calibration of a PD rating system means that the calibrated PD estimates are accurate and conform to the observed default rates. See, for instance, An Overview and Framework for PD Backtesting and Benchmarking, by Castermans et al.

I'm unclear on what you mean here. Correct calibration of a predictive system means that the UNcalibrated estimates are accurate (i.e., they don't need any calibration). (What is PD?)

Frank, you are referring to #1 and I am referring to #2. Nonetheless, I would never create a rating system if my model doesn't discriminate better than a coin toss.

For sure
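The Hand and Till (2001) generalization mentioned above averages the two-class AUC over all unordered pairs of classes. A small self-contained sketch (the function names are mine, not from any package):

```r
# Mann-Whitney AUC for one ordered class pair, using ranks.
# score: membership scores for class ci; lab: labels; ci/cj: the two classes.
pairwise_auc <- function(score, lab, ci, cj) {
  si <- score[lab == ci]
  sj <- score[lab == cj]
  r <- rank(c(si, sj))
  ni <- length(si); nj <- length(sj)
  (sum(r[1:ni]) - ni * (ni + 1) / 2) / (ni * nj)
}

# Hand & Till M: average A(i,j) over all unordered class pairs,
# where A(i,j) averages the AUCs computed from each class's score column.
hand_till_M <- function(probs, labels) {
  cls <- colnames(probs)
  pairs <- combn(cls, 2)
  A <- apply(pairs, 2, function(p) {
    keep <- labels %in% p
    a1 <- pairwise_auc(probs[keep, p[1]], labels[keep], p[1], p[2])
    a2 <- pairwise_auc(probs[keep, p[2]], labels[keep], p[2], p[1])
    (a1 + a2) / 2
  })
  mean(A)
}
```

For a classifier that separates every pair of classes perfectly, M equals 1; a rating system that discriminates no better than chance gives 0.5.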
Re: [R] random normally distributed values within range
Hi Achaz,

Maybe you are interested in the generalized beta distribution? To the best of my knowledge, there is no way to restrict the values of normal deviates, since one may end up with a different distribution.

Regards,
Pedro

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Achaz von Hardenberg
Sent: Monday, October 06, 2008 6:55 PM
To: r-help@r-project.org
Subject: [R] random normally distributed values within range

Hi all,

I need to create 100 normally distributed random values (X) which cannot exceed a specific range (i.e., 0 < X < Y). With rnorm I cannot specify the max and min values between which the values have to stay, as I can in runif. Does some other simple way exist to do this with normally distributed random values?

Thanks a lot in advance,

Dr. Achaz von Hardenberg
Centro Studi Fauna Alpina - Alpine Wildlife Research Centre
Servizio Sanitario e della Ricerca Scientifica
Parco Nazionale Gran Paradiso, Degioz, 11, 11010-Valsavarenche (Ao), Italy
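If a draw from the normal restricted to (0, Y) is what is wanted, which, as Pedro notes, is really a *truncated* normal, i.e., a different distribution, a simple rejection sampler is enough for a sketch:

```r
# Rejection sampler for a normal truncated to the open interval (lo, hi).
# Draws follow the truncated normal, not the original normal.
rtnorm <- function(n, mean = 0, sd = 1, lo = -Inf, hi = Inf) {
  out <- numeric(0)
  while (length(out) < n) {
    draw <- rnorm(n, mean, sd)
    out <- c(out, draw[draw > lo & draw < hi])   # keep only in-range draws
  }
  out[1:n]
}

x <- rtnorm(100, mean = 5, sd = 2, lo = 0, hi = 10)  # 100 values in (0, 10)
```

The inverse-CDF route, qnorm(runif(n, pnorm(lo, mean, sd), pnorm(hi, mean, sd)), mean, sd), avoids rejection entirely and is preferable when the interval carries little probability mass.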
Re: [R] Bias in sample - Logistic Regression
Hi Shiva,

Maybe you are interested in the following paper: Learning when Training Data are Costly: The Effect of Class Distribution on Tree Induction. G. Weiss and F. Provost. Journal of Artificial Intelligence Research 19 (2003) 315-354.

For validating models in those environments: William Elazmeh, Nathalie Japkowicz, Stan Matwin. (2006). A Framework for Comparative Evaluation of Classifiers in the Presence of Class Imbalance. Proceedings of the Third Workshop on ROC Analysis in Machine Learning, Pittsburgh, USA.

Regards,
Pedro

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Wensui Liu
Sent: Wednesday, October 01, 2008 7:20 PM
To: [EMAIL PROTECTED]
Cc: r-help@r-project.org
Subject: Re: [R] Bias in sample - Logistic Regression

Hi Shiva,

The idea of reject inference is very simple. Let's assume a credit card environment. There are 100 applicants, out of which 50 will be approved and booked in. Therefore, we can only observe the adverse behavior, such as default and delinquency, of the 50 booked accounts. Again, let's assume that out of the 50 booked cards, 5 are bad (default / delinquency). A natural thought is to build a model to cherry-pick the bad guys and then apply the same model to all applicants. However, we can only observe the behavior of the applicants booked, which is 50, not of all applicants, which is 100. Therefore, the model result looks better than it is supposed to be. This is the so-called 'sample bias'. The same thing can happen in healthcare or direct marketing as well.

Luckily, many people have done excellent work on this problem. Please do some reading by Heckman. Greene at NYU has a paper in this area as well. And I believe there is also an implementation in R. If you use SAS (common in industry), take a look at proc qlim.

HTH.
--
WenSui Liu
Acquisition Risk, Chase
Email: [EMAIL PROTECTED]
Blog: statcompute.spaces.live.com
Re: [R] Logistic regression problem
Hi Bernardo,

Do you have to use logistic regression? If not, try Random Forests... It has worked for me in past situations when I have had to analyze huge datasets. Some want to understand the DGP with a simple linear equation; others want high generalization power. It is your call... See, e.g., www.cis.upenn.edu/group/datamining/ReadingGroup/papers/breiman2001.pdf. Maybe you are also interested in AD-HOC, an algorithm for feature selection: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.99.9130

Regards,
Pedro

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Liaw, Andy
Sent: Wednesday, October 01, 2008 12:01 PM
To: Frank E Harrell Jr; [EMAIL PROTECTED]
Cc: r-help@r-project.org
Subject: Re: [R] Logistic regression problem

From: Frank E Harrell Jr

Bernardo Rangel Tura wrote:

On Tue, 2008-09-30 at 18:56 -0500, Frank E Harrell Jr wrote:

Bernardo Rangel Tura wrote:

On Sat, 2008-09-27 at 10:51 -0700, milicic.marko wrote:

I have a huge data set with thousands of variables and one binary variable. I know that most of the variables are correlated and are not good predictors... but... It is very hard to start modeling with such a huge dataset. What would be your suggestion? How to make a first cut... how to eliminate most of the variables without ignoring potential interactions... for example, maybe variable A is not a good predictor and variable B is not a good predictor either, but maybe A and B together are a good predictor... Any suggestion is welcomed.

milicic.marko

I think you start with rpart(binary variable ~ .). This shows you a set of variables to start a model with, and a starting set of cutoffs for continuous variables.

I cannot imagine a worse way to formulate a regression model. Reasons include:

1. Results of recursive partitioning are not trustworthy unless the sample size exceeds 50,000 or the signal-to-noise ratio is extremely high.
2. The type I error of tests from the final regression model will be extraordinarily inflated.
3.
False interactions will appear in the model.
4. The cutoffs so chosen will not replicate and in effect assume that covariate effects are discontinuous and piecewise flat. The use of cutoffs results in a huge loss of information and power and makes the analysis arbitrary and impossible to interpret (e.g., a high-covariate-value:low-covariate-value odds ratio or mean difference is a complex function of all the covariate values in the sample).
5. The model will not validate in new data.

Professor Frank,

Thank you for your explanation. Well, if my first idea is wrong, what is your opinion on the following approach?

1. Do PCA on the data, excluding the binary variable.
2. Put the principal components in a logistic model.
3. Afterwards, revert the principal components to the original variables (only if that is interesting for milicic.marko).

If this approach is wrong too, what is your approach?

Hi Bernardo,

If there is a large number of potential predictors and no previous knowledge to guide the modeling, principal components (PC) analysis is often an excellent way to proceed. The first few PCs can be put into the model. The result is not always very interpretable, but you can decode the PCs by using stepwise regression or recursive partitioning (which are safer in this context because the stepwise methods are not exposed to the Y variable). You can also add PCs in a stepwise fashion in the pre-specified order of variance explained. There are many variations on this theme, including nonlinear principal components (e.g., the transcan function in the Hmisc package), which may explain more variance of the predictors.

While I agree with much of what Frank has said, I'd like to add some points. Variable selection is a treacherous business whether one is interested in prediction or inference. If the goal is inference, Frank's book is a must-read, IMHO. (It's great for predictive model building, too.) If interaction is of high interest, principal components are not going to give you that.
Regarding cutpoint selection: the machine learners have found that the 'optimal' split point for a continuous predictor in tree algorithms is extremely variable, so that interpreting such split points would be risky at best. Breiman essentially gave up on interpretation of a single tree when he went to random forests.

Best,
Andy
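The PCA-then-logistic route Frank outlines can be sketched in a few lines. The data here are simulated and the choice of k components is purely illustrative:

```r
# PCA on the predictors only (the outcome never touches the PCA step),
# then logistic regression on the first few components.
set.seed(42)
n <- 500; p <- 20
X <- matrix(rnorm(n * p), n, p)
y <- rbinom(n, 1, plogis(X[, 1] - X[, 2]))   # simulated binary outcome

pc <- prcomp(X, scale. = TRUE)               # principal components of X
k <- 5                                       # keep the first few PCs
dat <- data.frame(y = y, pc$x[, 1:k])
fit <- glm(y ~ ., data = dat, family = binomial)
summary(fit)
```

Decoding the retained PCs back to the original variables can then be done via the rotation matrix (pc$rotation), which is the "revert" step Bernardo asks about.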
Re: [R] ROC curve from logistic regression
Hi,

Try the following reference: Comparison of Three Methods for Estimating the Standard Error of the Area under the Curve in ROC Analysis of Quantitative Data, by Hajian-Tilaki and Hanley, Academic Radiology, Vol 9, No 11, November 2002.

Below is a simple implementation that will return both the AUC and its standard error (DeLong et al. method). Hope this helps...

Pedro

# Input: yreal is a vector of class labels in {-1, 1};
# forecasts is a vector of scores of the same length.
auc <- function(yreal, forecasts) {
  sizeT <- length(yreal)
  pos <- 0
  for (i in 1:sizeT) {
    if (yreal[i] > 0) pos <- pos + 1
  }
  neg <- sizeT - pos
  forepos <- vector(length = pos)
  foreneg <- vector(length = neg)
  controlpos <- 1
  controlneg <- 1
  for (i in 1:sizeT) {
    if (yreal[i] > 0) {
      forepos[controlpos] <- forecasts[i]
      controlpos <- controlpos + 1
    } else {
      foreneg[controlneg] <- forecasts[i]
      controlneg <- controlneg + 1
    }
  }
  # Mann-Whitney estimate of the AUC
  oper <- 0
  for (i in 1:pos) {
    for (j in 1:neg) {
      if (forepos[i] > foreneg[j]) oper <- oper + 1
      if (forepos[i] == foreneg[j]) oper <- oper + 0.50
    }
  }
  area <- oper / (pos * neg)
  # DeLong et al. structural components
  vpj <- vector(length = pos)
  vqk <- vector(length = neg)
  for (i in 1:pos) {
    oper <- 0
    for (j in 1:neg) {
      if (forepos[i] > foreneg[j]) {
        oper <- oper + 1
      } else if (forepos[i] == foreneg[j]) {
        oper <- oper + 0.50
      }
    }
    vpj[i] <- (oper / neg - area)^2
  }
  for (j in 1:neg) {
    oper <- 0
    for (i in 1:pos) {
      if (forepos[i] > foreneg[j]) {
        oper <- oper + 1
      } else if (forepos[i] == foreneg[j]) {
        oper <- oper + 0.50
      }
    }
    vqk[j] <- (oper / pos - area)^2
  }
  vpj <- vpj / (pos * (pos - 1))
  vqk <- vqk / (neg * (neg - 1))
  s <- sqrt(sum(vpj) + sum(vqk))
  return(list(AUC = area, std = s))
}

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Frank E Harrell Jr
Sent: Monday, September 08, 2008 8:22 AM
To: gallon li
Cc: r-help
Subject: Re: [R] ROC curve from logistic regression

gallon li wrote:

I know how to compute the ROC curve and the empirical AUC from the logistic
regression after fitting the model. But here is my question: how can I compute the standard error for the AUC estimator resulting from logistic regression? The variance should be more complicated than an AUC based on known test results. Does anybody know a reference on this problem?

The rcorr.cens function in the Hmisc package will compute the std. error of Somers' Dxy rank correlation. Dxy = 2*(C-.5) where C is the ROC area. This standard error does not include a variance component for the uncertainty in the model (e.g., it does not penalize for the estimation of the regression coefficients if you are estimating the coefficients and assessing the ROC area on the same sample). The lrm function in the Design package fits binary and ordinal logistic regression models and reports C, Dxy, and other measures. I haven't seen an example where drawing the ROC curve provides useful information that leads to correct actions.

Frank
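A short sketch of Frank's suggestion, converting the Dxy standard error from Hmisc::rcorr.cens into a standard error for the ROC area. The element names "C Index" and "S.D." assume the usual rcorr.cens return vector; the data are simulated:

```r
library(Hmisc)
set.seed(1)
y <- rbinom(200, 1, 0.4)      # simulated binary outcome
score <- y + rnorm(200)       # scores loosely related to y

r <- rcorr.cens(score, y)
C    <- r["C Index"]          # ROC area (concordance probability)
se_C <- r["S.D."] / 2         # Dxy = 2*(C - 0.5)  =>  SE(C) = SE(Dxy) / 2
```

As Frank notes, this standard error ignores the uncertainty from estimating the model's coefficients on the same sample.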
Re: [R] Interpolation Problems
Hi Steve,

It could be that you are trying to find values that are not in the range of values you are providing. For example:

x <- c(1, 2, 3, 4, 5)
y <- c(10, 11, 12, 13, 14)
xout <- c(0.01, 0.02)
approx(x, y, xout, method = "linear")

R's output:

$x
[1] 0.01 0.02
$y
[1] NA NA

If you want to see the value 10 when your Xs are below 1 and 14 when the Xs are above 5, then the code below may help.

Regards,
Pedro

interpolation_test <- function(data, cum_prob, xout) {
  y <- vector(length = length(xout))
  for (i in 1:length(xout)) {
    ValueToCheck <- xout[i]
    j <- 1
    while (cum_prob[j] < ValueToCheck && j <= length(cum_prob) - 2) {
      j <- j + 1
    }
    y0 <- data[j]
    x0 <- cum_prob[j]
    y1 <- data[j + 1]
    x1 <- cum_prob[j + 1]
    if (x0 == ValueToCheck) {
      y[i] <- y0
    } else {
      y[i] <- y0 + (ValueToCheck - x0) * (y1 - y0) / (x1 - x0)
    }
  }
  return(y)
}

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Steve Murray
Sent: Monday, September 01, 2008 6:17 PM
To: r-help@r-project.org
Subject: [R] Interpolation Problems

Dear all,

I'm trying to interpolate a dataset to give it twice as many values (I'm giving the dataset a finer resolution by interpolating from 1 degree to 0.5 degrees) to match that of a corresponding dataset. I have the data in both a data frame format (longitude column header values along the top with latitude row header values down the side) and a column format (in the format latitude, longitude, value).

I have used Google to determine that 'approxfun' is the most appropriate command to use for this purpose - I may well be wrong here though! Nevertheless, I've tried using it with the default arguments for the data frame (i.e. interp <- approxfun(dataset)), but encounter the following errors:

interp <- approxfun(JanAv)
Error in approxfun(JanAv) : need at least two non-NA values to interpolate
In addition: Warning message:
In approxfun(JanAv) : collapsing to unique 'x' values

However, there are no NA values!
And to double-check this, I did the following:

JanAv[is.na(JanAv)] <- 0

...to ensure that there really are no NAs, but I receive the same error message each time. With regard to the latter message, 'collapsing to unique 'x' values', I'm not sure what this means exactly, or how to deal with it.

Any words of wisdom on how I should go about this, or whether I should use an alternative command (I want to perform a simple (e.g., linear) interpolation), would be much appreciated.

Many thanks for any advice offered,

Steve
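For what it's worth, approx() itself can handle out-of-range xout values: rule = 2 clamps them to the y value at the nearest data extreme, which may remove the need for a hand-rolled loop. Continuing Pedro's small example above:

```r
x <- c(1, 2, 3, 4, 5)
y <- c(10, 11, 12, 13, 14)

# rule = 2: xout values outside [min(x), max(x)] take the boundary y value
approx(x, y, xout = c(0.01, 0.02, 2.5, 6), method = "linear", rule = 2)$y
# returns 10.0 10.0 11.5 14.0
```

With the default rule = 1 the same call returns NA for the out-of-range points, which is exactly the behavior seen in Pedro's first example.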
[R] Import GAUSS .FMT files
Dear All,

Is it possible to import GAUSS .FMT files into R?

Thanks for your time.

Kind Regards,
Pedro N. Rodriguez
[R] Simulate an AR(1) process via distributions? (without specifying a model specification)
Dear All,

Is it possible to simulate an AR(1) process via a distribution?

I have simulated an AR(1) process the usual way (that is, using a model specification and random deviates in the error term), and used the generated time series to estimate 3- and 4-parameter distributions (for instance, the GLD). However, random deviates generated from these fitted distributions do not follow the specified AR process.

Any comments and feedback will be more than welcome. Thanks for your time.

Pedro N. Rodriguez
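For reference, the "usual way" can be done with the built-in arima.sim. A fitted marginal distribution (GLD etc.) captures only the unconditional distribution of the series, not its serial dependence, which is why i.i.d. deviates drawn from it lose the AR structure; the sketch below illustrates this:

```r
set.seed(123)
phi <- 0.7
x <- arima.sim(model = list(ar = phi), n = 5000)  # AR(1), coefficient 0.7

# The AR structure lives in the autocorrelation, not the marginal:
acf(x, lag.max = 1, plot = FALSE)$acf[2]          # close to phi = 0.7

# i.i.d. resampling from the marginal distribution destroys it:
z <- sample(as.numeric(x), replace = TRUE)
acf(z, lag.max = 1, plot = FALSE)$acf[2]          # close to 0
```

So a univariate distribution alone cannot reproduce an AR(1); the dependence has to be injected separately (e.g., simulate the AR recursion and, if non-normal margins are wanted, transform the innovations or use a copula-style construction).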
[R] Factorial, L-moments, and overflows
Hi everyone,

In the package POT, there is a function that computes the L-moments of a given sample (samlmu). However, to compute those L-moments, one needs the total number of combinations of two numbers, which requires the use of a factorial. See, for example, Hosking (1990, p. 113).

How does the function samlmu in the package POT avoid overflows? I was trying to build an R function similar to samlmu from scratch and ran into overflows (just for my own educational purposes :o) ). Is there a trick that I am missing to avoid overflows in the factorial function?

Thank you very much for your time.

Pedro N. Rodriguez
SSRN Homepage: http://ssrn.com/author=412141/
Homepage: http://www.pnrodriguez.com/
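One standard trick (which, I believe, is what careful L-moment implementations rely on, either via log-gamma functions or via incrementally updated weight ratios) is to keep factorials and binomial coefficients on the log scale and exponentiate only the final, moderately sized ratio:

```r
# factorial() overflows double precision quickly:
factorial(200)            # Inf: 200! exceeds the largest double

# lgamma/lchoose stay finite because they work on the log scale:
lgamma(201)               # log(200!), a perfectly ordinary number
lchoose(1000, 500)        # log of choose(1000, 500), also finite

# Exponentiate only when the final quantity is small enough to represent:
choose_log <- function(n, k) exp(lchoose(n, k))
choose_log(10, 3)         # 120
```

In an L-moment computation the binomial coefficients appear only in ratios that are themselves well-behaved, so forming each ratio on the log scale (or updating the combination weights recursively from one order to the next) sidesteps the overflow entirely.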