[R] how to test difference in my case?
hello all,

I wonder if anyone could give me a hint on which statistical technique I should use, and how to carry it out in R, in my case. Thanks in advance.

My data consist of two columns: the same continuous numerical variable, once from actual measurement and once from model prediction. My objective is to assess the agreement between the two (whether there is a significant difference) and draw conclusions about the model's efficiency. Since the measured and predicted values are in the same units, the first test that came to mind was the paired t-test. However, the paired differences are not normal (p-value = 0.0048 from SAS proc univariate). In this case I can either do a Wilcoxon signed-rank test or transform the data. I was told that the Wilcoxon signed-rank test is not as widely recognized in the literature as the paired t-test, so I would prefer a transformation.

My questions: do I need to transform both columns of the original data, or just the paired differences? What transformation is appropriate? I thought about a log transformation, but if I find a significant (or non-significant) difference between the logged data (measured and predicted), can I say there is a significant (or non-significant) difference between the original data?

After this step of the analysis, I will convert the continuous numerical data into ordered categorical ranks (values 1, 2, 3 and 4). Which statistical test and R command should I use to compare the ranking agreement between the actual measurement and the prediction?

Thank you very much for helping me out. I haven't slept in a long time and this is kind of an emergency. If there is any confusion about my description, please let me know.

Regards,
XY

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
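[Not part of the original post: a minimal R sketch of the tests under discussion. The data here are simulated, and `measured`/`predicted` are hypothetical stand-ins for the poster's two columns.]

```r
set.seed(1)
# Hypothetical stand-ins for the poster's two columns
measured  <- rlnorm(30, meanlog = 2, sdlog = 0.5)
predicted <- measured * rlnorm(30, meanlog = 0, sdlog = 0.2)

d <- measured - predicted
shapiro.test(d)                          # normality check on the paired differences

t.test(measured, predicted, paired = TRUE)       # paired t-test (if d is normal)
wilcox.test(measured, predicted, paired = TRUE)  # nonparametric alternative

# Transforming BOTH columns and then differencing is the same as testing
# the log-RATIO: a conclusion on the log scale is about ratios of the
# original values, not about their differences.
t.test(log(measured), log(predicted), paired = TRUE)
```

For the later 1-4 rankings, one option would be to cross-tabulate with table() and compute an agreement statistic such as Cohen's kappa (implementations exist in packages such as 'irr' or 'psych'), though that goes beyond this sketch.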
Re: [R] How to test for significance of random effects?
I don't know whether one should include an apparently insignificant random effect in later analyses or not. In the past, I haven't. As far as I know, the best thing to use would be Bayesian model averaging (BMA): begin with some prior over the class of all plausible models (with and without the random effect), and then average predictions, etc., over the posterior. For more information, you could Google "Bayesian model averaging" and try RSiteSearch("Bayesian model averaging"). I'm not aware of any BMA software in R for mixed models, but I suspect it is only a matter of time before BMA replaces step and stepAIC for stepwise-regression-type applications. Incorporating mixed models into this framework will be harder, but I know of no theoretical obstacles. With luck, others will enlighten us both further on this.

Best Wishes,
Spencer Graves

Dan Bebber wrote:
> [quoted text trimmed; see Dan Bebber's message and the earlier exchange below]
Re: [R] How to test for significance of random effects?
Hi Spencer, Dan,

I think that it depends on the role that the random effects are playing. In models that I have fit, random effects can play one or more of three roles:

1) To reflect the experimental design. These random effects are sacrosanct, as far as I am concerned, and should be included in the model whether significant or not. Therefore, such random effects are not tested; they are estimated and reported.

2) To improve the match between the model and its assumptions. For example, random intercepts can be augmented by random slopes to ensure that the diagnostics reflecting the assumptions of normality and homoskedasticity are satisfied. (I once had to use random intercepts, slopes, and quadratic terms; I'm sure there are others who have had to do worse!) Testing is rarely of interest in this case, because the role of the random effects is to extend the model so that its assumptions are satisfied. However, if interest is in a simple model (for some reason), then it might be reasonable to test whether such an innovation significantly improves the fit of the model. For example, I might use a whole-model test to assess whether I need a within-subject correlation model if the ACF plot is borderline. It's important to recall, though, that the test outcomes are predicated on the model assumptions, so interpreting the test results when the assumptions are in doubt is a risky business.

3) To act as containers for estimating variance components of interest. As in 1), there is really no need to test such random effects, because our interest is in estimating the values that they represent.

I would be interested to hear of other uses to which random effects have been put :)

I suggest that the original poster might in general consider computing the intra-class correlation to show that within-group statistical dependence is negligible. However, in the case of GLMMs I am not at all sure that it retains any meaning. Computer, beware!
Cheers,

Andrew

On Mon, May 15, 2006 at 08:46:39PM -0700, Spencer Graves wrote:
> [quoted text trimmed; see Spencer Graves's messages elsewhere in this thread]
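[A rough sketch, mine and not from the thread, of the intra-class correlation suggestion above, using the `Rail` data that appears later in this thread; the variance-component extraction assumes nlme's `VarCorr()` layout for a single random intercept.]

```r
library(nlme)

# Random-intercept fit to the Rail data (used elsewhere in this thread)
fm <- lme(travel ~ 1, random = ~ 1 | Rail, data = Rail)

vc <- VarCorr(fm)                       # character matrix of variance components
s2_between <- as.numeric(vc["(Intercept)", "Variance"])
s2_within  <- as.numeric(vc["Residual",   "Variance"])

# Intra-class correlation: share of total variance that is between groups
icc <- s2_between / (s2_between + s2_within)
icc   # near 0 would suggest within-group dependence is negligible
```

An ICC near zero supports dropping the random effect; for the Rail data it is in fact large, which is why the likelihood-ratio test later in the thread is so decisive.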
Re: [R] How to test for significance of random effects?
I may be out of my statistical depth here, but isn't it the case that if one has an experimental design with random effects, one has to include the random effects, even if they appear to be non-significant? AFAIK there are two reasons. One is the possibility of 'restriction errors' that arise from unintentional differences in treatments among groups, making analysis of among-group variance problematic. The other is that the allocation of fixed effects to samples is no longer random, and therefore the assumption of random errors is broken. Real statisticians may disagree with this, however.

Dan Bebber
Department of Plant Sciences
University of Oxford

Spencer Graves wrote (Sun, 07 May 2006):
> [quoted text trimmed; see Spencer Graves's and Jon Olav Vik's messages below]
Re: [R] How to test for significance of random effects?
1. Ignoring the complication of logistic regression, anova(lme1, lm1) provides the answer you seek. See sect. 2.4 in Pinheiro and Bates for more detail on the approximations involved and how that answer can be refined using Monte Carlo.

2. With logistic regression, you want to do essentially the same thing using glm and lmer (in package 'lme4'), except that many of the required functions are not yet part of 'lme4'. Consider the following example:

library(lme4)
library(mlmRev)
(mlmR <- vignette("MlmSoftRev"))
#edit(mlmR)          # with Rgui
#Stangle(mlmR$file)  # with ESS
# - then open the file MlmSoftRev.R
fitBin  <- lmer(use ~ urban + age + livch + (1|district),
                data = Contraception, family = binomial)
fitBin0 <- glm(use ~ urban + age + livch,
               data = Contraception, family = binomial)
2*pchisq(2*as.numeric(logLik(fitBin) - logLik(fitBin0)),
         2, lower.tail = FALSE)

Note however that this p-value computation is known to be only an approximation; see RSiteSearch("lmer p-values") for other perspectives. More accurate p-values can be obtained using Markov Chain Monte Carlo, via mcmcsamp.

hope this helps,
Spencer Graves

Jon Olav Vik wrote:
> [quoted text trimmed; see Jon Olav Vik's message below]
[R] How to test for significance of random effects?
Dear list members,

I'm interested in showing that within-group statistical dependence is negligible, so I can use ordinary linear models without including random effects. However, I can find no mention of testing a model with vs. without random effects in either Venables & Ripley (2002) or Pinheiro & Bates (2000). Our in-house statisticians are not familiar with this either, so I would greatly appreciate the help of this list.

Pinheiro & Bates (2000:83) state that random-effect terms can be tested based on their likelihood ratio, if both models have the same fixed-effects structure and both are estimated with REML (I must admit I do not know exactly what REML is, although I do understand the concept of ML). The examples in Pinheiro & Bates (2000) deal with simple vs. complicated random-effects structures, both fitted with lme and method="REML". However, to fit a model without random effects I must use lm() or glm(). Is there a way to tell these functions to use REML? I see that lme() can use ML, but Pinheiro & Bates (2000) advised against this for some reason. lme() does provide a confidence interval for the between-group variance, but this is constructed so as to never include zero (I guess the interval is as narrow as possible on the log scale, or something). I would be grateful if anyone could tell me how to test for zero variance between groups.

If lm1 and lme1 are fitted with lm() and lme() respectively, then anova(lm1,lme1) gives an error, whereas anova(lme1,lm1) gives an answer which looks reasonable enough. The command logLik() can retrieve either restricted or ordinary log-likelihoods from a fitted model object, but the likelihoods are then evaluated at the fitted parameter estimates. I guess these estimates differ from what they would be if the model were estimated using REML?

My actual application is a logistic regression with two continuous and one binary predictor, in which I would like to avoid the complications of using generalized linear mixed models.
Here is a simpler example, which is rather trivial but illustrates the general question (run in R 2.2.1):

library(nlme)
summary(lm1 <- lm(travel ~ 1, data = Rail))   # no random effect
summary(lme1 <- lme(fixed = travel ~ 1, random = ~1|Rail, data = Rail))  # random effect
intervals(lme1)   # confidence interval for the random effect
anova(lm1, lme1)
## Outputs warning message:
# models with response NULL removed because
# response differs from model 1 in: anova.lmlist(object, ...)
anova(lme1, lm1)
## Output: Can I trust this?
#      Model df      AIC      BIC    logLik   Test  L.Ratio p-value
# lme1     1  3 128.1770 130.6766 -61.08850
# lm1      2  2 162.6815 164.3479 -79.34075 1 vs 2 36.50451  <.0001
## Various log likelihoods:
logLik(lm1, REML=FALSE)
logLik(lm1, REML=TRUE)
logLik(lme1, REML=FALSE)
logLik(lme1, REML=TRUE)

Any help is highly appreciated.

Best regards,
Jon Olav Vik
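[A sketch, not from the thread, of one way to address the boundary problem raised above: because the null value (between-group variance = 0) lies on the edge of the parameter space, the nominal chi-square p-value is only approximate, and a simulation-based reference distribution can be used instead. This is a hedged illustration; `tryCatch` guards against occasional convergence failures on simulated null data.]

```r
library(nlme)

# Observed likelihood-ratio statistic for H0: between-Rail variance = 0
lm1  <- lm(travel ~ 1, data = Rail)
lme1 <- lme(travel ~ 1, random = ~ 1 | Rail, data = Rail, method = "ML")
obs  <- 2 * (as.numeric(logLik(lme1)) - as.numeric(logLik(lm1)))

# One draw from the null: simulate data with NO group effect from the
# lm() fit, refit both models, return the LRT statistic (NA on failure)
lrt_null <- function() {
  d <- data.frame(travel = simulate(lm1)[[1]], Rail = Rail$Rail)
  tryCatch({
    l0 <- as.numeric(logLik(lm(travel ~ 1, data = d)))
    l1 <- as.numeric(logLik(lme(travel ~ 1, random = ~ 1 | Rail,
                                data = d, method = "ML")))
    2 * (l1 - l0)
  }, error = function(e) NA_real_)
}

set.seed(42)
sim <- replicate(100, lrt_null())
mean(sim >= obs, na.rm = TRUE)   # simulation-based p-value
```

With more replicates (e.g. 1000+) this gives a reference distribution that respects the boundary, which is essentially the Monte Carlo refinement that Pinheiro & Bates (sect. 2.4) describe.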
Re: [R] how to test robustness of correlation
Hi, Berton:

Thanks for getting back to me. I played around with cov.rob(). Yes, I can get a robust correlation coefficient matrix based on the MCD or MVE outlier detection methods. I have two further questions:

1) How do I get a p-value for the robust r?
2) What I mean by resampling is a leave-one-out procedure, to get a confidence interval for r. Do you know if there is any package in R to do it? I suppose I could code it myself, but it would be nice if one already exists.

thanks.
Yang

Berton Gunter wrote (25-Jan-2006):
> [quoted text trimmed; see Berton Gunter's reply and the original question below]
Re: [R] how to test robustness of correlation
Below.

> 1) How do I get a p value of the robust r?

A p-value for what? That r == 0?

> 2) What I mean by resampling is leave one out procedure, to get a
> confidence interval of r. Do you know if there is any package in R
> to do it?

**An** answer to both is the same -- bootstrap it. Leave-one-out is not resampling (/bootstrapping); it is usually referred to as jackknifing, but that uses more specific ways of doing things than the analogy implies. Efron's little SIAM book on the jackknife, the bootstrap, etc. explains them and their relationships in detail. It is trivial to bootstrap cov.rob in base R using sample() (sample from the x,y **pairs** -- or n-tuples generally -- not the marginals separately). If you insist on a package, boot is the obvious one -- why did you not attempt to find it yourself? Either way, expect it to take a while for a decent-size resample (e.g. 1e4).

-- Bert
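[A sketch of the bootstrap suggestion above; the data and the resample count are invented here, and resamples with many tied rows can occasionally make the robust fit unstable, so treat this as illustrative only.]

```r
library(MASS)

set.seed(1)
# Made-up bivariate data standing in for the poster's variables
x  <- rnorm(50)
y  <- x + rnorm(50, sd = 0.5)
xy <- cbind(x, y)

# Robust correlation of a two-column matrix via cov.rob()
rob_r <- function(m) cov.rob(m, cor = TRUE)$cor[1, 2]

# Bootstrap: resample the (x, y) PAIRS, not the marginals separately
B <- 500                                 # use more (e.g. 1e4) in practice
boot_r <- replicate(B, rob_r(xy[sample(nrow(xy), replace = TRUE), ]))

quantile(boot_r, c(0.025, 0.975))        # percentile confidence interval
```

A bootstrap interval that excludes zero plays roughly the role of the p-value asked about; the boot package's boot() and boot.ci() offer more refined interval types (e.g. BCa).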
Re: [R] how to test robustness of correlation
One more thing ...

> I played around with cov.rob(). Yes, I can get a robust correlation
> coefficient matrix based on mcd or mve outlier detection methods.

You might call it semantics, but I prefer "resistant estimation" to "outlier detection methods". I recognize that they are equivalent (any resistant estimator can be used to identify outliers; any outlier detection method leads to a resistant estimator by downweighting the outliers). However, I consider the distinction important. "Outlier detection" suggests:

1) That "outlier" is a statistically well-defined concept; it isn't. The implied dichotomy is a fiction (a dangerous one, IMO -- but many would disagree).

2) That some sort of hypothesis testing procedure is used to reject points. None is. Rather, mve and mcd try to characterize the behavior of the central mass of the distribution, using that characterization to weight the informativeness of points outside that mass. A 1-D equivalent is the MAD for spread. This is a far cry from the bad old days of (sequential) outlier detection. These methods are crucially dependent on modern computing power, of course.

Cheers,
Bert
Re: [R] how to test robustness of correlation
The cor function can do Spearman correlation using method = "spearman".

On 1/25/06, [EMAIL PROTECTED] wrote:
> [quoted text trimmed; see the original question below]
Re: [R] how to test robustness of correlation
Gabor:

Contrary to popular belief, rank-based procedures are **not** resistant. Example:

x <- c(1:10, 100); y <- c(1:10 + rnorm(10, sd = .25), -100)
cor(x, y)
# [1] -0.9816899              ## awful
cor(x, y, method = 'spearman')
# [1] 0.5                     ## better
require(MASS)
cov.rob(cbind(x, y), cor = TRUE)
## ... bunch of output omitted
# $cor
#           x         y
# x 1.0000000 0.9977734       ## best
# y 0.9977734 1.0000000

Look at the plot to see.

-- Bert

Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
"The business of the statistician is to catalyze the scientific learning process." - George E. P. Box

Gabor Grothendieck wrote (Thursday, January 26, 2006):
> [quoted text trimmed; see Gabor Grothendieck's reply above]
[R] how to test robustness of correlation
Hi, there:

As you all know, correlation is not a very robust procedure: sometimes the correlation can be driven by a few outliers. There are a few ways to improve the robustness of (Pearson) correlation, either by an outlier removal procedure or by a resampling technique. I am wondering if there is any R package or R code that incorporates an outlier removal or resampling procedure in calculating the correlation coefficient. Your help is greatly appreciated. Thanks.

Yang

Yang Qiu
Integrated Data Analysis
[EMAIL PROTECTED]
GlaxoSmithKline
Re: [R] how to test robustness of correlation
Check out cov.rob() in MASS (among others, I'm sure). The procedure is far more sophisticated than outlier removal or resampling (??). References are given in the docs.

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
Sent: Wednesday, January 25, 2006 12:37 PM
To: r-help@stat.math.ethz.ch
Subject: [R] how to test robustness of correlation

> Hi, there: As you all know, correlation is not a very robust procedure. Sometimes correlation could be driven by a few outliers. [...]
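A minimal sketch of how cov.rob() might be used to get an outlier-resistant correlation estimate; the data below are made up for illustration (one gross outlier pair), not from the original poster:

```r
library(MASS)  # provides cov.rob()

set.seed(1)
x <- c(1:10, 100)                          # one gross outlier in x
y <- c(1:10 + rnorm(10, sd = 0.25), -100)  # a matching outlier in y

cor(x, y)      # classical Pearson correlation: wrecked by the one bad point

# Robust location/scatter estimate; cor = TRUE also returns the correlation
rob <- cov.rob(cbind(x, y), cor = TRUE)
rob$cor[1, 2]  # close to 1: the outlying pair is effectively ignored
```

The default method is the minimum volume ellipsoid; see ?cov.rob for the alternatives.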
Re: [R] How to test a time series fit the Poisson or other process?
I just did RSiteSearch("poisson time series"). The second and third of 75 hits seemed relevant to your question (e.g., http://finzi.psych.upenn.edu/R/Rhelp02a/archive/58054.html). Some of the other responses did not seem relevant, but I didn't look at all of them. One response mentioned Jim Lindsey, whose R code web site is http://popgen0146uns50.unimaas.nl/~jlindsey/rcode.html. Hope this helps.

If I had a long time series of Poisson counts, I'd be tempted to try a standard time series model on the square root of the counts. If the results were dramatically different from what I got from some more sophisticated modeling strategy, I'd look very carefully at both to make sure I hadn't made a mistake some place. If it were not that important, I might just apply standard time series techniques to the square roots of the counts and go on to the next task.

Hope this helps.
Spencer Graves

p.s. If you'd like more information from this listserv, PLEASE do read the posting guide (www.R-project.org/posting-guide.html). I believe that people who follow that guide generally get quicker, more useful replies. This is especially true for those who supply a simple, toy example in a few lines of R code that someone else can copy from an email into R, test a few ideas, and craft a reply in a very few minutes.

广星 wrote:
> Hi, R-Help, I am a newbie. How can I test whether a time series follows a Poisson or some other process in R? [...]

--
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street, Suite 700
San Jose, CA 95110, USA
[EMAIL PROTECTED]
www.pdf.com
Tel: 408-938-4420
Fax: 408-280-7915
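Spencer's square-root suggestion can be sketched as follows. The data are simulated, and the AR(1) order is an arbitrary first choice for illustration, not a recommendation:

```r
# Square-root transform of Poisson counts, then a standard time series fit.
set.seed(42)
counts <- rpois(200, lambda = 4)      # toy series of Poisson counts

z <- sqrt(counts)                     # variance-stabilising transform
fit <- arima(z, order = c(1, 0, 0))   # ordinary ARMA machinery on the sqrt scale
fit$coef["ar1"]                       # estimated AR coefficient
```

Since the simulated counts are independent, the estimated AR coefficient should be close to zero here; with real serially dependent counts it would pick up the autocorrelation.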
[R] How to test a time series fit the Poisson or other process?
Hi, R-Help, I am a newbie. What concerns me most at the moment is the analysis of time series, but there are a lot of packages to choose from. All I want to do is the following: how can I test whether a time series follows a Poisson or some other process in R? Thank you very much in advance.

[EMAIL PROTECTED]
2005-11-25
Re: [R] how to test poisson distribution
Hello Gunter,

2005/10/19, Berton Gunter [EMAIL PROTECTED]:
> To be pedantic (I'm feeling cranky today): One can never test whether the data follow [data is plural] a Poisson distribution -- only whether there is sufficient evidence to cast that assumption into doubt. Perhaps a better shorthand is whether the data are consistent with "Poisson-ness". This correctly leaves open the possibility that the data are consistent with lots of other distribution-nesses, too. I welcome alternatives, perhaps privately to reduce the list noise level. (And, yes, I'm sure that Thomas knows this perfectly well.)

I agree with this. But my understanding of the question underlying this thread was: where might I find some advice on dealing with distributions in R? So I pointed to Vito Ricci's paper on distribution fitting, which also covers issues of testing/fitting distributions in a newbie-accessible way, giving beginners basic insight into handling issues like this in GNU R. That said, I think pointing at sources like this might help decrease, at least a bit, the noise on this list. Lots of work and sweat have gone into creating such docs, so why not use them? They are aimed at newbies using R and at the same time offer a glance at the topic itself. I am relying here on former discussions of this topic on this list, keyword "spoon-feeding versus self-helping based on docs".

> I do think that we should be a bit less sloppy about such things even here, lest we continue to promulgate already widespread misunderstandings, even at the cost of slightly increased bandwidth. After all, precision is supposed to be a major concern of ours.

Yes, again you are correct here; sloppiness isn't very helpful! I will take this more into account when posting here next time!

sincerely
Thomas
[R] how to test poisson distribution
Dear All, I am wondering how to test whether the data follow a Poisson distribution. Thank you so much!
Re: [R] how to test poisson distribution
Hi,

2005/10/19, Wensui Liu [EMAIL PROTECTED]:
> Dear All, I am wondering how to test whether the data follow a Poisson distribution. Thank you so much!

Did you notice the PDF on distribution tests using R by Vito Ricci? It's found at CRAN in the contributed documentation section and is called "FITTING DISTRIBUTIONS WITH R". Maybe this could be of some help for you; especially look at page 7 (Poisson distribution example).
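For a quick base-R check along the lines discussed in that document, one can compare observed count frequencies with Poisson probabilities at the estimated mean. This is only a rough sketch on simulated data: the degrees of freedom are not adjusted for estimating lambda, and cells with small expected counts should really be pooled (chisq.test() may warn about them):

```r
# Chi-square goodness-of-fit check for Poisson-ness (rough sketch).
set.seed(7)
x <- rpois(300, lambda = 2)           # toy data

lambda.hat <- mean(x)
k <- 0:max(x)
obs <- tabulate(factor(x, levels = k))  # observed frequency of each count

# Fitted cell probabilities, with the upper tail folded into the last cell
p <- dpois(k, lambda.hat)
p[length(p)] <- p[length(p)] + ppois(max(x), lambda.hat, lower.tail = FALSE)

res <- chisq.test(obs, p = p)
res$p.value                           # small values cast doubt on Poisson-ness
```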
Re: [R] how to test poisson distribution
To be pedantic (I'm feeling cranky today): One can never test whether the data follow [data is plural] a Poisson distribution -- only whether there is sufficient evidence to cast that assumption into doubt. Perhaps a better shorthand is whether the data are consistent with "Poisson-ness". This correctly leaves open the possibility that the data are consistent with lots of other distribution-nesses, too. I welcome alternatives, perhaps privately to reduce the list noise level. (And, yes, I'm sure that Thomas knows this perfectly well.)

I do think that we should be a bit less sloppy about such things even here, lest we continue to promulgate already widespread misunderstandings, even at the cost of slightly increased bandwidth. After all, precision is supposed to be a major concern of ours.

As I've been cranky, others are free to return the favor. Sauce for the goose ...

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Thomas Schönhoff
Sent: Wednesday, October 19, 2005 10:00 AM
To: r-help@stat.math.ethz.ch
Subject: Re: [R] how to test poisson distribution

> Did you notice the PDF on distribution tests using R by Vito Ricci? [...]
Re: [R] how to test poisson distribution
On Wed, 19 Oct 2005 10:18:02 -0700 Berton Gunter wrote:

> To be pedantic (I'm feeling cranky today): One can never test whether the data follow [data is plural] a Poisson distribution -- only whether there is sufficient evidence to cast that assumption into doubt. Perhaps a better shorthand is whether the data are consistent with "Poisson-ness". This correctly leaves open the possibility that the data are consistent with lots of other distribution-nesses, too. [...]

...now that Berton mentioned `checking distribution-nesses': the function distplot() in the package vcd implements various plots for distribution-nesses that can be used for graphical checking. Ord_plot() is made for a similar purpose. Finally, there is also a function goodfit() that computes goodness-of-fit tests for such hypotheses. All three functions are written following Chapter 2, `Fitting and Graphing Discrete Distributions', in Michael Friendly's book `Visualizing Categorical Data'.

hth,
Z
[R] How to test homogeneity of covariance matrices?
Dear Group Members,

Forgive me if I am a little bit off subject. I am looking for a good way to test the homogeneity of two variance-covariance matrices using R, prior to a Hotelling T² test. You'll probably tell me that it is better to use a robust version of T², but I have no precise idea of the statistical behaviour of my variables, because they are parameters from the harmonics of Fourier series used to describe the outlines of specimens. I would rather explore these harmonic parameters precisely. It is known that Box's M-test of homogeneity of variance-covariance matrices is oversensitive to heteroscedasticity and to deviation from multivariate normality, and that it is not useful (Everitt, 2005; Seber, 1984; Layard, 1974).

I have tried a "quick and dirty" intuitive comparison between two covariance matrices and I am seeking the opinion of professional statisticians about this stuff. The idea is to compare the two matrices using the absolute value of their difference, then to make a quadratic form using a unity vector and its transpose. One obtains a scalar that must be close to zero if the two covariance matrices are homogeneous:

Let S1 and S2 be two variance-covariance matrices of dimension n.
Let a be a vector of n ones: a <- rep(1, times = n)
Then b = a' * |S1 - S2| * a, i.e. in R: b <- a %*% abs(S1 - S2) %*% a

Is b distributed following a chi-square distribution? Is this idea total crap? Has someone tried this before and published something? My data gave two 77 x 77 covariance matrices and b = 0.003243, a value close to 0, hence I expect my two covariance matrices are homogeneous. Am I right? If this comparison is incorrect, could someone suggest a useful way to make this comparison using R? Thank you in advance for your comments.
Franck
___
Dr Franck BAMEUL
Le Clos d'Ornon
7 rue Frédéric Mistral
F-33140 VILLENAVE D'ORNON
France
[EMAIL PROTECTED]
06 89 88 16 73 (personnel)
05 57 19 57 20 (professionnel)
05 57 19 57 27 (fax)
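Rather than assuming a chi-square law for such an ad-hoc statistic, one fairly standard alternative is to calibrate it by permutation: recompute the statistic many times with the group labels randomly reshuffled. A sketch with made-up bivariate data (both groups drawn from the same distribution, so the covariances really are homogeneous here):

```r
# Permutation calibration of the poster's statistic b = a' |S1 - S2| a.
set.seed(1)
n1 <- 30; n2 <- 30
X1 <- matrix(rnorm(n1 * 2), ncol = 2)   # toy group 1
X2 <- matrix(rnorm(n2 * 2), ncol = 2)   # toy group 2 (same covariance here)

b.stat <- function(A, B) {
  a <- rep(1, ncol(A))                  # the unity vector
  drop(a %*% abs(cov(A) - cov(B)) %*% a)
}
obs <- b.stat(X1, X2)

# Reference distribution: randomly re-assign the rows to the two groups
X <- rbind(X1, X2)
perm <- replicate(999, {
  i <- sample(nrow(X), n1)
  b.stat(X[i, , drop = FALSE], X[-i, , drop = FALSE])
})
p.value <- mean(c(perm, obs) >= obs)    # permutation p-value
```

Note this treats the rows as exchangeable under the null, which is a real assumption; with 77 variables and few specimens the covariance estimates themselves would also be very noisy.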
Re: [R] how to test this
Thank you all for the reply.

Regards,
Jin

-----Original Message-----
From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]
Sent: Wednesday, 3 August 2005 5:20 PM
To: Simon Blomberg
Cc: Li, Jin (CSE, Atherton); r-help@stat.math.ethz.ch
Subject: Re: [R] how to test this

> Neither model given has an intercept, so use summary(lm(y ~ 0 + x + offset(1.05*x), data=dat)) and look if the coefficient of x is significantly different from zero. [...]
Re: [R] how to test this
This is two tests: whether the slope != 1 and whether the intercept != 0. To do this, include an offset in your model:

fit <- lm(y ~ x + offset(x), data=dat)

HTH,
Simon.

At 03:44 PM 3/08/2005, [EMAIL PROTECTED] wrote:
> Dear there, I am wondering how to test whether a simple linear regression model (e.g. y=1.05x) is significantly different from a 1 to 1 line (i.e. y=x). Thanks. Regards, Jin

Simon Blomberg, B.Sc.(Hons.), Ph.D, M.App.Stat.
Centre for Resource and Environmental Studies
The Australian National University
Canberra ACT 0200
Australia
T: +61 2 6125 7800
email: Simon.Blomberg_at_anu.edu.au
F: +61 2 6125 0757
CRICOS Provider # 00120C
Re: [R] how to test this
On Wed, 3 Aug 2005, Simon Blomberg wrote:

> This is two tests: whether the slope != 1 and whether the intercept != 0.

Neither model given has an intercept

> To do this, include an offset in your model:
> fit <- lm(y ~ x + offset(x), data=dat)

but no intercept, so use

summary(lm(y ~ 0 + x + offset(1.05*x), data=dat))

and look if the coefficient of x is significantly different from zero. E.g.

x <- 1:10
set.seed(1)
y <- 1.05*x + rnorm(10)
summary(lm(y ~ 0 + x + offset(1.05*x)))

Coefficients:
  Estimate Std. Error t value Pr(>|t|)
x  0.03061    0.03910   0.783    0.454

is not.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax: +44 1865 272595
[R] how to test this
Dear there, I am wondering how to test whether a simple linear regression model (e.g. y=1.05x) is significantly different from a 1 to 1 line (i.e. y=x). Thanks.

Regards,
Jin
[R] how to test the equalness of several coefficients in a gamma frailty model using R
Hi, I want to test the equality of several coefficients of a gamma frailty model using R. In SAS, a TEST statement can be used for a Cox model. How can I do this in R? Thanks a lot!

Guanghui
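In the absence of a TEST statement, the usual R route is a Wald test built by hand from coef() and vcov(). The sketch below uses simulated data and a plain coxph() fit to stay self-contained; the same contrast arithmetic would be applied to a coxph fit that includes a frailty() term:

```r
# Hand-built Wald test of H0: beta1 = beta2 for a Cox model.
library(survival)

set.seed(6)
n <- 100
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n),
                time = rexp(n), status = rbinom(n, 1, 0.8))
fit <- coxph(Surv(time, status) ~ x1 + x2, data = d)

cc <- c(1, -1)                                 # contrast: beta1 - beta2
w  <- drop(cc %*% coef(fit))^2 / drop(cc %*% vcov(fit) %*% cc)
p  <- pchisq(w, df = 1, lower.tail = FALSE)    # Wald chi-square on 1 df
```

For several simultaneous equality constraints, cc becomes a contrast matrix C and the statistic is (C beta)' (C V C')^{-1} (C beta) on nrow(C) degrees of freedom.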
[R] how to test for equality of covariance Matrices in lda
When using two-group discriminant analysis, we need to test for equality of the covariance matrices in lda. When we form our estimate of the within-group covariance matrix by pooling across groups, we implicitly assume that the covariance structure is the same across groups, so it seems important to test this equality. But I cannot find a function in R to do this.
[R] How to test the significance of a value estimated with lme?
Hi,

After running an experiment in economics involving 3 treatment variables (Complete, High and Ksup), 32 groups of subjects (4 groups for each of the 8 treatment combinations) and 12 periods, I've estimated the following model, in which all coefficients are significant:

y <- lme(Diff ~ factor(Complete) + factor(High) + factor(Ksup) + Period + factor(High):Period + factor(Ksup):Period, random = ~1 | Group, method = 'ML')

The result is:

(Intercept)            -10.99
factor(Complete)1       -9.05
factor(High)1          -12.28
factor(Ksup)1           14.69
Period                   1.12
factor(High)1:Period     0.98
factor(Ksup)1:Period    -0.85

This allows me, if I'm not wrong, to estimate for example the value of the dependent variable Diff in period 5 under the High and Complete (but not the Ksup) conditions as: -10.99 - 9.05 - 12.28 + 1.12*5 + 0.98*5. Now, I'd like to test whether this estimated value is significantly different from 0: how should I proceed? Thanks a lot!

Francois Cochard
University of Toulouse 1, France.
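One standard way to test such a linear combination c'beta is to build the contrast vector by hand and use the coefficient covariance matrix from vcov(). The sketch below uses lm() on made-up data so it is self-contained, but the same c'beta / sqrt(c' V c) arithmetic applies to an lme fit (with the appropriate degrees of freedom):

```r
# Testing a linear combination of regression coefficients by hand.
set.seed(2)
d <- data.frame(x1 = rnorm(50), x2 = rnorm(50))
d$y <- 1 + 2 * d$x1 - d$x2 + rnorm(50)
fit <- lm(y ~ x1 + x2, data = d)

cc <- c(1, 1, 5)                        # weights for (Intercept, x1, x2)
est <- drop(cc %*% coef(fit))           # the combination c'beta
se  <- drop(sqrt(cc %*% vcov(fit) %*% cc))
t.stat <- est / se                      # compare to a t distribution
p.val  <- 2 * pt(-abs(t.stat), df = df.residual(fit))
```

For the poster's case, cc would have a 1 for each of (Intercept), factor(Complete)1 and factor(High)1, a 5 for Period and factor(High)1:Period, and 0 elsewhere, in the order of fixef().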
[R] how to test the existence of a name in a dataframe
I wanted to test whether a name (which is incidentally a substring of another name) already exists in a dataframe. I did e.g.:

> data(swiss)
> names(swiss)
[1] "Fertility"        "Agriculture"      "Examination"      "Education"
[5] "Catholic"         "Infant.Mortality"
> !is.null(swiss$EduX)
[1] FALSE
> !is.null(swiss$Edu)
[1] TRUE

I did not expect to get TRUE here, because ``Edu'' does not exist as a name of ``swiss''. I finally did:

> 'Edu' %in% names(swiss)

for which I got the expected FALSE. My question: what is the recommended way to do such a test?

Thanks - Wolfram Fischer
Re: [R] how to test the existence of a name in a dataframe
Hi Wolfram,

this behaviour is due to partial matching. Observe that

> swiss$Ed
> swiss$Edu
> swiss$Educ

all return the Education column. I think the best way to do it is with `%in%' or `match()', i.e.,

> c("Ed", "Edu", "Educ", "Education") %in% names(swiss)

I hope it helps.

Best,
Dimitris

Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven
Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/16/336899
Fax: +32/16/337015
Web: http://www.med.kuleuven.ac.be/biostat
http://www.student.kuleuven.ac.be/~m0390867/dimitris.htm

- Original Message -
From: Wolfram Fischer [EMAIL PROTECTED]
Sent: Tuesday, December 07, 2004 9:47 AM
Subject: [R] how to test the existence of a name in a dataframe

> I wanted to test if there exists already a name (which is incidentally a substring of another name) in a dataframe. [...]
[R] How to test a model with two unkown constants
Hi all,

suppose I've got a vector y with some data (from a repeated measures design) observed under the conditions in f1 and f2. I've got a model with two unknown fixed constants a and b which tries to predict y with respect to the values in f1 and f2. Here is an example:

# data
y <- c(runif(10, -1, 0), runif(10, 0, 1))
# f1
f1 <- rep(c(-1.4, 1.4), rep(10, 2))
# f2
f2 <- rep(c(-.5, .5), rep(10, 2))

Suppose my simple model looks like y = a/f1 + b*f2. Is there a function in R which can compute the estimates for a and b? And is it possible to test the model, e.g. how good the fits of the model are?

Thanks, Sven
Re: [R] How to test a model with two unkown constants
That's the linear model lm(y ~ I(1/f1) + f2), so yes, yes, and fuller answers can be found in most of the books and guides mentioned in R's FAQ. Note that how `good' the fit is will have to be relative, unless you really can assume a uniform error with range 1, when you could do a maximum-likelihood fit (and watch out for the non-standard distribution theory).

On 27 Aug 2003, Sven Garbade wrote:
> Suppose my simple model looks like y = a/f1 + b*f2. Is there a function in R which can compute the estimates for a and b? And is it possible to test the model, e.g. how good the fits of the model are? [...]

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax: +44 1865 272595
Re: [R] How to test a model with two unkown constants
Sven Garbade [EMAIL PROTECTED] writes:

> # f1
> f1 <- rep(c(-1.4, 1.4), rep(10, 2))
> # f2
> f2 <- rep(c(-.5, .5), rep(10, 2))
>
> Suppose my simple model looks like y = a/f1 + b*f2. Is there a function in R which can compute the estimates for a and b? [...]

f2 and 1/f1 are exactly collinear, so no, not in R, nor any other way. Apart from that, the model is linear in a and b, so lm() can fit it (with different f1 and f2) if you're not too squeamish about the error distribution.

--
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])        FAX: (+45) 35327907
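With f1 and f2 chosen so that 1/f1 and f2 are not collinear, the fit both replies describe is a one-liner. A sketch on simulated data; the true values a = 2 and b = 3 are assumptions of this toy example:

```r
# Fitting y = a/f1 + b*f2 as a no-intercept linear model.
set.seed(3)
f1 <- runif(40, 0.5, 2)
f2 <- runif(40, -1, 1)
y  <- 2 / f1 + 3 * f2 + rnorm(40, sd = 0.1)

fit <- lm(y ~ 0 + I(1/f1) + f2)   # 0 drops the intercept, I() protects 1/f1
coef(fit)                         # estimates of a and b
summary(fit)                      # standard errors, t tests, R-squared
```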
RE: [R] how to test whether two slopes are sign. different?
Or better yet, skip the whole "significantly different" business altogether, and figure out whether a model with 2 slopes explains the data better than a model with 1 -- AICs!

Mark

-----Original Message-----
From: Brett Magill [mailto:[EMAIL PROTECTED]
Sent: Sun 7/20/2003 7:12 PM
To: Gijsbert Stoet; [EMAIL PROTECTED]
Subject: Re: [R] how to test whether two slopes are sign. different?

> Not really R-specific: Z = (b1 - b2) / sqrt( SEb1^2 + SEb2^2 ) [...]
RE: [R] how to test whether two slopes are sign. different?
On Sun, 20 Jul 2003, Herzog, Mark wrote:

> Or better yet, skip the whole "significantly different" business altogether, and figure out whether a model with 2 slopes explains the data better than a model with 1 -- AICs!

That's not what AIC is designed to do: it is about `prediction', not `explanation', as you will discover from the primary sources (if not from some of the secondary ones). In this specific case there is also the question of whether the error variances are the same to take into account, which makes it tricky to fit a single model (especially with lsfit).

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax: +44 1865 272595
RE: [R] how to test whether two slopes are sign. different?
Dear Stoet,

This can be handled well using a mixed-effects model: library(nlme). You can use the lmList function to check whether the slopes differ across populations.

--
Harold C. Doran
Director of Research and Evaluation
New American Schools
675 N. Washington Street, Suite 220
Alexandria, Virginia 22314
703.647.1628
http://www.edperform.net

-----Original Message-----
From: Gijsbert Stoet [mailto:[EMAIL PROTECTED]
Sent: Sunday, July 20, 2003 10:51 PM
To: [EMAIL PROTECTED]
Subject: [R] how to test whether two slopes are sign. different?

> Hi, suppose I want to test whether the slopes (e.g. determined with lsfit) of two different populations are significantly different; how do I test this in R? [...]
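A sketch of the lmList route in package nlme (which ships with R); the data are made up, with genuinely different slopes in the two groups:

```r
# Per-group regressions with lmList; compare the slope estimates and intervals.
library(nlme)

set.seed(5)
d <- data.frame(age = runif(50, 20, 60),
                sex = rep(c("m", "w"), each = 25))
d$books <- with(d, ifelse(sex == "m", 0.10, 0.25) * age + rnorm(50))

fits <- lmList(books ~ age | sex, data = d)  # one lm() per level of sex
coef(fits)          # one intercept and slope per group
intervals(fits)     # confidence intervals; compare the two slope intervals
```

Non-overlapping slope intervals are informal evidence of a difference; a formal test would fit the interaction model books ~ age * sex instead.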
[R] how to test whether two slopes are sign. different?
Hi, suppose I want to test whether the slopes (e.g. determined with lsfit) of two different populations are significantly different; how do I test this in R? Say, for example, I have found the slope between age and number of books read per year for two different populations of subjects (e.g. 25 men and 25 women), say using lsfit. How can I tell whether the slopes are different in R? (And how would I do it for regression coefficients?) Thanks a lot for your help.
Re: [R] how to test whether two slopes are sign. different?
Not really R-specific:

Z = (b1 - b2) / sqrt( SEb1^2 + SEb2^2 )

---Original Message---
From: Gijsbert Stoet [EMAIL PROTECTED]
Sent: 07/20/03 09:51 PM
To: [EMAIL PROTECTED]
Subject: [R] how to test whether two slopes are sign. different?

> Hi, suppose I want to test whether the slopes (e.g. determined with lsfit) of two different populations are significantly different; how do I test this in R? [...]
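Both routes can be sketched in a few lines: fit the two groups jointly and read off the interaction term, or plug two separate fits into the Z formula above. The data are simulated, with truly different slopes and group sizes of 25 to match the poster's example:

```r
# Two ways to compare slopes across groups.
set.seed(4)
n <- 25
age <- runif(2 * n, 20, 60)
grp <- factor(rep(c("m", "w"), each = n))
books <- ifelse(grp == "m", 2 + 0.10 * age, 2 + 0.25 * age) + rnorm(2 * n)

# Route 1: one model with an interaction; its t test compares the slopes
fit <- lm(books ~ age * grp)
summary(fit)$coefficients["age:grpw", ]   # slope difference and its test

# Route 2: the Z statistic from two separate fits
f1 <- summary(lm(books ~ age, subset = grp == "m"))$coefficients
f2 <- summary(lm(books ~ age, subset = grp == "w"))$coefficients
Z  <- (f1["age", "Estimate"] - f2["age", "Estimate"]) /
      sqrt(f1["age", "Std. Error"]^2 + f2["age", "Std. Error"]^2)
```

Route 1 assumes a common error variance for both groups (Prof. Ripley's caveat elsewhere in this thread); route 2 does not.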