Re: [R] SF-8 (not 36) questionnaire scoring for R?

2010-09-13 Thread Frank Harrell
I know someone who has R code for SF-36 and perhaps SF-12. Aren't there copyright issues relating to SF-* even if it is reprogrammed? Frank - Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/SF-36

Re: [R] SF-8 (not 36) questionnaire scoring for R?

2010-09-13 Thread Frank Harrell
Yes the company behind that probably received federal funds for some of the research and has been very careful to minimize their contribution to the community. I didn't understand your parenthetical remark. Frank - Frank Harrell Department of Biostatistics, Vanderbilt University -- View

[R] [R-pkgs] New version of rms package on CRAN

2010-09-13 Thread Frank Harrell
CRAN has a significant update to rms. Windows and unix/linux versions are available and I expect the Mac version to be available soon. The most significant improvement is addition of latex=TRUE arguments to model fitting print methods, made especially for use with Sweave. Here is a summary

[R] Where to find R-help options web page

2010-09-10 Thread Frank Harrell
. Nabble sent a link to turn off e-mail but the r-help mail service rejected nabble's command. Thanks Frank - Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/Where-to-find-R-help-options-web-page-tp2535123p2535123.html

Re: [R] Where to find R-help options web page

2010-09-10 Thread Frank Harrell
(and to be able to post on it). Frank - Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/Where-to-find-R-help-options-web-page-tp2535123p2535148.html Sent from the R help mailing list archive at Nabble.com

Re: [R] Where to find R-help options web page

2010-09-10 Thread Frank Harrell
Aha. Thanks David. I somehow thought that that whole section was for administrators only. Frank - Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/Where-to-find-R-help-options-web-page-tp2535123p2535172.html

Re: [R] some questions about longitudinal study with baseline

2010-09-07 Thread Frank Harrell
Baseline should appear only as a baseline and should be removed from the set of longitudinal responses. This is often done with a merge() operation. Frank Frank E Harrell Jr Professor and Chairman, School of Medicine Department of Biostatistics Vanderbilt
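
A minimal sketch of the merge() approach mentioned in the snippet above, using base R only. The data frame `long` and the column names `id`, `visit`, `y`, and `y0` are hypothetical and are not taken from the original thread.

```r
# Hypothetical long-format data: one row per subject per visit; visit 0 is baseline
long <- data.frame(
  id    = rep(1:3, each = 3),
  visit = rep(0:2, times = 3),
  y     = c(10, 12, 13, 8, 9, 11, 15, 14, 16)
)

# Pull out the baseline measurement as a one-row-per-subject covariate
base <- long[long$visit == 0, c("id", "y")]
names(base)[2] <- "y0"

# Keep only post-baseline rows as responses, then merge the baseline back in
followup <- merge(long[long$visit > 0, ], base, by = "id")
followup
```

After this step the baseline value appears only as the covariate `y0`, and the response column `y` holds follow-up measurements only.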

Re: [R] Linear Logistic Regression - Understanding the output (and possibly the test to use!)

2010-09-05 Thread Frank Harrell
the time to consult a statistician is before you have done any statistical analysis. Frank Harrell I have a couple of Kleinbaum's (et al) other texts and find them to be well written and reasoned, so I suspect the citation above would be as accessible as any. Thank you, that is useful

Re: [R] Is there any package or function perform stepwise variable selection under GEE method?

2010-09-02 Thread Frank Harrell
The absence of stepwise methods works to your advantage, as these yield invalid statistical inference and inflated regression coefficients. Frank E Harrell Jr Professor and Chairman, School of Medicine Department of Biostatistics Vanderbilt University On Thu,

Re: [R] Comparing COXPH models, one with age as a continuous variable, one with age as a three-level factor

2010-09-02 Thread Frank Harrell
On Thu, 2 Sep 2010, stephenb wrote: sorry to bump in late, but I am doing similar things now and was browsing. IMHO anova is not appropriate here. it applies when the richer model has p more variables than the simpler model. this is not the case here. the competing models use different

Re: [R] [Q] Goodness-of-fit test of a logistic regression model using rms package

2010-09-01 Thread Frank Harrell
don't feel bad. You just know that something somewhere is probably wrong with the model. I focus on directed tests such as allowing all continuous variables to have nonlinear effects or allowing selected interactions, and finding out how important the complex model terms are. Frank Harrell

Re: [R] Speeding up prediction of survival estimates when using `survifit'

2010-08-31 Thread Frank Harrell
Frank E Harrell Jr Professor and Chairman, School of Medicine Department of Biostatistics Vanderbilt University On Mon, 30 Aug 2010, Ravi Varadhan wrote: Hi, I fit a Cox PH model to estimate the cause-specific hazards (in a competing risks setting). Then, I

Re: [R] modify a nomogram

2010-08-25 Thread Frank Harrell
Update to the rms package which is the version now being actively supported. New features will not be added to Design. The nomogram function in rms separates the plotting into a plot method for easier understanding. You can control all axes - some experimentation can tell you if you can

Re: [R] AUC

2010-08-23 Thread Frank Harrell
Samuel, Since the difference in AUCs has insufficient power and doesn't really take into account the pairing of predictions, I recommend the Hmisc package's rcorrp.cens function. Its method has good power and asks the question is one predictor more concordant than the other in the same

Re: [R] R reports

2010-08-21 Thread Frank Harrell
Frank E Harrell Jr Professor and Chairman, School of Medicine Department of Biostatistics Vanderbilt University On Sat, 21 Aug 2010, Donald Paul Winston wrote: Sweave and LaTex is way to much overhead to deal with. There should be a built in standard report()

Re: [R] R reports

2010-08-21 Thread Frank Harrell
Your notes are not well thought out. You'll find that r-help is a friendly place for new users that do not come in with an attitude. I once used SAS (for 23 years) and know it very well. I wrote the first SAS procedures for a graphics device, percentiles, logistic regression, and Cox

Re: [R] R reports

2010-08-21 Thread Frank Harrell
, Frank Harrell wrote: Your notes are not well thought out. You'll find that r-help is a friendly place for new users that do not come in with an attitude. I once used SAS (for 23 years) and know it very well. I wrote the first SAS procedures for a graphics device, percentiles, logistic

Re: [R] R reports

2010-08-21 Thread Frank Harrell
On Sat, 21 Aug 2010, Donald Paul Winston wrote: Good grief. Adding a report function is not going to make R less flexible. Don't you want to use a tool that's relevant to the rest of the world? That world is much bigger then your world. This is ridiculous. Looks like some people are

Re: [R] logistic regression tree

2010-08-20 Thread Frank Harrell
It would be good to tell us of the frequency of observations in each category of Y, and the number of continuous X's. Recursive partitioning will require perhaps 50,000 observations in the less frequent Y category for its structure and predicted values to validate, depending on X and the

Re: [R] logistic regression tree

2010-08-20 Thread Frank Harrell
On Fri, 20 Aug 2010, Kay Cichini wrote: hello, my data-collection is not yet finished, but i though have started investigating possible analysis methods. below i give a very close simulation of my future data-set, however there might be more nominal explanatory variables - there will be no

Re: [R] R reports

2010-08-19 Thread Frank Harrell
What do low level proc print and proc report have on Sweave or http://biostat.mc.vanderbilt.edu/wiki/pub/Main/StatReport/summary.pdf? If proc print and proc report are 4G, let's move back a generation. Frank E Harrell Jr Professor and Chairman, School of Medicine

Re: [R] ROCR predictions

2010-08-19 Thread Frank Harrell
At the heart of this you have a problem in incomplete conditioning. You are computing things like Prob(X x) when you know X=x. Working with a statistician who is well versed in probability models will undoubtedly help. Frank Frank E Harrell Jr Professor and Chairman, School of

Re: [R] HMisc/rms package questions

2010-08-17 Thread Frank Harrell
Frank E Harrell Jr Professor and Chairman, School of Medicine Department of Biostatistics Vanderbilt University On Tue, 17 Aug 2010, Rob James wrote: 1) How does one capture the plots from the plsmo procedure? Simply inserting a routing call to a graphical

Re: [R] Stepwise Regression + entry/exit significance level

2010-08-14 Thread Frank Harrell
The values of slentry and slstay that will avoid ruining the statistical properties of the result are slentry=1.0 and slstay=1.0. Frank Frank E Harrell Jr Professor and Chairman, School of Medicine Department of Biostatistics Vanderbilt University On Sat, 14 Aug

Re: [R] How to add lines to lattice plot produced by rms::bplot

2010-08-14 Thread Frank Harrell
Once you guys figure all this out, I'm glad to modify bplot to pass more arguments to lattice if needed. Frank E Harrell Jr Professor and Chairman, School of Medicine Department of Biostatistics Vanderbilt University On Fri, 13 Aug 2010, David Winsemius wrote:

Re: [R] confidence Intervals for predictions in GLS

2010-08-14 Thread Frank Harrell
install.packages('rms') require(rms) ?Gls ?plot.Predict Frank E Harrell Jr Professor and Chairman, School of Medicine Department of Biostatistics Vanderbilt University On Sat, 14 Aug 2010, Camilo Mora wrote: Hi everyone: Is there a function in R to calculate

Re: [R] How to add lines to lattice plot produced by rms::bplot

2010-08-14 Thread Frank Harrell
Frank E Harrell Jr Professor and Chairman, School of Medicine Department of Biostatistics Vanderbilt University On Sat, 14 Aug 2010, David Winsemius wrote: On Aug 14, 2010, at 9:59 AM, Frank Harrell wrote: Once you guys figure all this out, I'm glad

Re: [R] How to compare the effect of a variable across regression models?

2010-08-13 Thread Frank Harrell
David, In the Cox and many other regression models, the effect of a variable is context-dependent. There is an identifiability problem in what you are doing, as discussed by @ARTICLE{for95mod, author = {Ford, Ian and Norrie, John and Ahmadi, Susan}, year = 1995, title = {Model

Re: [R] val.prob in the Design package - Calibrated Brier Score

2010-08-13 Thread Frank Harrell
Please check the code. I hope that Brier is on the uncalibrated probabilities. Calibrated probabilities are from 1/(1+exp(-[a+b logit(uncalibrated probs)])) where a and b are maximum likelihood estimators (they will be 0 and 1 in training data). Frank Frank E Harrell Jr Professor and
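
The recalibration formula in the snippet above can be illustrated with a small base-R sketch. This is not the val.prob code from the Design package; the simulated data and the names `p_uncal`, `p_cal`, `brier_uncal`, and `brier_cal` are assumptions made for illustration.

```r
set.seed(1)
p_uncal <- runif(200, 0.05, 0.95)   # hypothetical uncalibrated predicted probabilities
y <- rbinom(200, 1, p_uncal)        # simulated binary outcomes

# Maximum likelihood estimates of a and b: logistic regression of y on logit(p_uncal)
fit <- glm(y ~ qlogis(p_uncal), family = binomial)
a <- unname(coef(fit)[1])
b <- unname(coef(fit)[2])

# Calibrated probabilities: 1 / (1 + exp(-[a + b * logit(p_uncal)]))
p_cal <- plogis(a + b * qlogis(p_uncal))

# Brier score (mean squared error of the probabilities) on each set of predictions
brier_uncal <- mean((p_uncal - y)^2)
brier_cal   <- mean((p_cal - y)^2)
```

Here `qlogis()` is the logit and `plogis()` its inverse, so `p_cal` matches the formula term by term.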

Re: [R] Re : Re : How to compare the effect of a variable across regression models?

2010-08-13 Thread Frank Harrell
. From: Bert Gunter gunter.ber...@gene.com To: Biau David djmb...@yahoo.fr Cc: Frank Harrell f.harr...@vanderbilt.edu; r help list r-help@r-project.org Sent: Fri 13 Aug 2010, 18:22:58 Subject: Re: [R] Re : How to compare the effect of a variable across regression models

Re: [R] extracting the standard error in lrm

2010-08-11 Thread Frank Harrell
On Wed, 11 Aug 2010, david dav wrote: Hi, I would like to extract the coefficients of a logistic regression (estimates and standard error as well) in lrm as in glm with summary(fit.glm)$coef Thanks David coef(fit) sqrt(diag(vcov(fit))) But these will not be very helpful except in the
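
A short sketch of the two accessor calls quoted above, `coef(fit)` and `sqrt(diag(vcov(fit)))`. To keep it runnable without extra packages, the fit below is a base-R `glm` stand-in on simulated data; with rms the model would instead come from `lrm`, which the reply above indicates supports the same two calls.

```r
set.seed(2)
x <- rnorm(100)
y <- rbinom(100, 1, plogis(0.5 * x))   # hypothetical simulated data

# Stand-in fit; with the rms package this would be lrm(y ~ x)
fit <- glm(y ~ x, family = binomial)

est <- coef(fit)                 # coefficient estimates
se  <- sqrt(diag(vcov(fit)))     # standard errors from the covariance matrix

cbind(estimate = est, std.error = se)
```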

Re: [R] Multiple imputation, especially in rms/Hmisc packages

2010-08-11 Thread Frank Harrell
tests to be exactly the same? Mark - you can see the code for this at the bottom of anova.rms. Compute W, divide by numerator d.f., then compute P-value using F with numerator and error d.f. Frank Thanks in advance; I really appreciate any help you can give. Mark Frank Harrell wrote
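
The recipe in the reply above (compute W, divide by the numerator d.f., refer the result to an F distribution) can be written out as below. The numeric values are hypothetical, and how W and the error d.f. are obtained from the imputed fits is not shown in this snippet, so that part is left out.

```r
# Hypothetical inputs: Wald statistic, numerator d.f., and error d.f.
W      <- 12.4
df_num <- 3
df_err <- 40

F_stat  <- W / df_num                                      # Wald statistic per numerator d.f.
p_value <- pf(F_stat, df_num, df_err, lower.tail = FALSE)  # upper-tail F probability
```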

Re: [R] Plotting confidence bands around regression line

2010-08-11 Thread Frank Harrell
Frank E Harrell Jr Professor and Chairman, School of Medicine Department of Biostatistics Vanderbilt University On Wed, 11 Aug 2010, Michal Figurski wrote: Peter, Frank, David and others, Thank you all for your ideas. I understand your lack of trust in PB

Re: [R] Plotting confidence bands around regression line

2010-08-11 Thread Frank Harrell
University On Wed, 11 Aug 2010, S Ellison wrote: Frank Harrell f.harr...@vanderbilt.edu 11/08/2010 17:02:03 This problem seems to cry out for one of the many available robust regression methods in R. Not sure that would be much more appropriate, although it would _appear_ to work. The PB

Re: [R] Growth Curves with lmer

2010-08-11 Thread Frank Harrell
Classification accuracy is an improper scoring rule, and one of the problems with it is that the proportion classified correctly can be quite good even if the model uses no predictors. [Hence omitting the intercept is also potentially problematic.] Frank E Harrell Jr Professor and

Re: [R] Plotting confidence bands around regression line

2010-08-10 Thread Frank Harrell
Please give the prescription. The article is not available on our extensive online library. I wonder if the method can compete with the bootstrap. Frank Frank E Harrell Jr Professor and Chairman, School of Medicine Department of Biostatistics Vanderbilt

Re: [R] Plotting confidence bands around regression line

2010-08-10 Thread Frank Harrell
Research Laboratory 3400 Spruce St. 7 Maloney Philadelphia, PA 19104 tel. (215) 662-3413 On 2010-08-10 12:29, Frank Harrell wrote: Please give the prescription. The article is not available on our extensive online library. I wonder if the method can compete with the bootstrap. Frank Frank E

Re: [R] Logistic Regression in R (SAS -like output)

2010-08-09 Thread Frank Harrell
Frank E Harrell Jr Professor and Chairman, School of Medicine Department of Biostatistics Vanderbilt University On Mon, 9 Aug 2010, Harsh wrote: Hello useRs, I have a problem at hand which I'd think is fairly common amongst groups were R is being adopted for

Re: [R] Logistic Regression in R (SAS -like output)

2010-08-09 Thread Frank Harrell
Note that stepwise variable selection based on AIC has all the problems of stepwise variable selection based on P-values. AIC is just a restatement of the P-value. Frank Frank E Harrell Jr Professor and Chairman, School of Medicine Department of Biostatistics

Re: [R] Logistic Regression in R (SAS -like output)

2010-08-09 Thread Frank Harrell
Professor and Chairman, School of Medicine Department of Biostatistics Vanderbilt University On Mon, 9 Aug 2010, Kingsford Jones wrote: On Mon, Aug 9, 2010 at 10:27 AM, Frank Harrell f.harr...@vanderbilt.edu wrote: Note that stepwise variable selection based on AIC has all

Re: [R] Identification of Outliners and Extraction of Samples

2010-08-09 Thread Frank Harrell
On Mon, 9 Aug 2010, Alexander Eggel wrote: Hello everybody, I need to know which samples (S1-S6) contain a value that is bigger than the median + five standard deviations of the column he is in. This is just an Why not the 70th percentile plus 6 times the difference in the 85th and 75th

Re: [R] Multiple imputation, especially in rms/Hmisc packages

2010-08-09 Thread Frank Harrell
On Mon, 9 Aug 2010, Mark Seeto wrote: Hello, I have a general question about combining imputations as well as a question specific to the rms and Hmisc packages. The situation is multiple regression on a data set where multiple imputation has been used to give M imputed data sets. I know how

Re: [R] plot the dependent variable against one of the predictors with other predictors as constant

2010-08-07 Thread Frank Harrell
There are many ways to do this. Here is one. install.packages('rms') require(rms) dd <- datadist(x, y); options(datadist='dd') f <- ols(z ~ x + y) plot(Predict(f)) # plot all partial effects plot(Predict(f, x)) # plot only the effect of x plot(Predict(f, y)) # plot only the effect of y f <-

Re: [R] How to extract se(coef) from cph?

2010-08-06 Thread Frank Harrell
In an upcoming release of the rms package, all fit objects can be printed using LaTeX if putting LaTeX code directly to the console (this is optimized for Sweave). You will be able to say print(fit, latex=TRUE). Frank E Harrell Jr Professor and Chairman, School of Medicine

Re: [R] Problems with normality req. for ANOVA

2010-08-02 Thread Frank Harrell
To add to David's note, the Kruskal-Wallis test is the nonparametric counterpart to one-way ANOVA. You can get a series of K-W tests for several grouping or continuous independent variables (but note these are SEPARATE analyses) using the Hmisc package's spearman2 function. The generalization
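
As a minimal illustration of the Kruskal-Wallis test named above, the sketch below uses base R's `kruskal.test` rather than the Hmisc `spearman2` function mentioned in the reply. The groups and responses are simulated and purely hypothetical.

```r
set.seed(4)
g <- factor(rep(c("a", "b", "c"), each = 10))          # hypothetical three-level grouping
y <- c(rexp(10, 1), rexp(10, 0.5), rexp(10, 0.25))     # skewed, non-normal responses

# Nonparametric counterpart to one-way ANOVA
kw <- kruskal.test(y ~ g)
kw$statistic   # Kruskal-Wallis chi-squared statistic
kw$parameter   # degrees of freedom (number of groups minus 1)
kw$p.value
```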

Re: [R] Problems with normality req. for ANOVA

2010-08-02 Thread Frank Harrell
In addition the poster did not tell us what is wrong with a nonparametric test. Frank E Harrell Jr Professor and Chairman, School of Medicine Department of Biostatistics Vanderbilt University On Mon, 2 Aug 2010, Bert Gunter wrote: My sympathies, but I don't

Re: [R] Specifying interactions in rms package... error

2010-08-02 Thread Frank Harrell
Hi Rob, rms wants symmetry in the sense that the interactions need to use the same number and location of spline knots as the main effects. So if using the * notation omit the main effects (which are generated automatically) and live with the equal knots. Or use the restricted interaction

Re: [R] xYplot error

2010-07-28 Thread Frank Harrell
, method=bands) That looks OK but I can't test it right now. Please continue to have a look, and if you still don't see the problem provide a tiny reproducible example with self-contained data I can access. Frank Thanks for your help. KM On Jul 27, 9:58 pm, Frank Harrell f.harr

Re: [R] how to generate a random data from a empirical distribition

2010-07-28 Thread Frank Harrell
This is true by definition. Read about the bootstrap which may give you some good background information. Frank E Harrell Jr Professor and Chairman, School of Medicine Department of Biostatistics Vanderbilt University On Wed, 28 Jul 2010, xin wei wrote: hi,

Re: [R] xYplot error

2010-07-27 Thread Frank Harrell
If the x-axis variable is really a factor, xYplot will not handle it. You probably need a dot chart instead (see Hmisc's Dotplot). Note that it is unlikely that the confidence intervals are really symmetric. Frank On Tue, 27 Jul 2010, Kang Min wrote: Hi, I'm trying to plot a graph with

Re: [R] how to generate a random data from a empirical distribition

2010-07-27 Thread Frank Harrell
Easiest thing is to sample with replacement from the original data. This is the idea behind the bootstrap, which is sampling from the empirical CDF. Frank E Harrell Jr Professor and Chairman, School of Medicine Department of Biostatistics Vanderbilt University
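
Sampling with replacement from the observed values, as suggested above, takes one call in base R. The vector `x` below is a hypothetical stand-in for the original data.

```r
set.seed(3)
x <- c(2.1, 3.4, 3.9, 5.0, 7.2, 8.8)   # hypothetical observed data

# Draw 1000 values from the empirical distribution of x
draws <- sample(x, size = 1000, replace = TRUE)
```

Every value in `draws` is one of the original observations, each chosen with equal probability, which is what sampling from the empirical CDF means.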

Re: [R] Rank ANCOVA

2010-07-27 Thread Frank Harrell
This has been shown to yield unreliable analyses. Use the more formal proportional odds ordinal logistic model. This is a generalization of the Wilcoxon-Mann-Whitney-Kruskal-Wallis statistic. This is implemented in the rms package and elsewhere. Frank E Harrell Jr Professor and Chairman
