I am trying to calculate a partial correlation and p-values. Unfortunately, the results in R are different than what SPSS gives.
Here is an example in R (calculating the partial correlation of x and y, controlling for z1 and z2): x <- c(1,20,14,30,9,4,8) y <- c(5,6,7,9,NA,10,6) z1 <- c(13,8,16,14,26,13,20) z2 <- c(12,NA,2,5,8,16,13) fmx <- lm(x ~ z1 + z2, na.action = na.exclude) fmy <- lm(y ~ z1 + z2, na.action = na.exclude) yres <- resid(fmy) xres <- resid(fmx) cor(xres, yres, use = "p") ct <- cor.test(xres, yres) ct$estimate ct$p.value R give me: r = .65, p = .23 However, SPSS calculates: r = .46, p = .70 I think something may be different with R's handling of missing data, as when I replace the NA's with values, R and SPSS give the same r-values, albeit different p-values still. I am doing pairwise case exclusion in both R and SPSS. Any ideas why I'm getting different values? Is something wrong with my formula in R? Any help would be greatly appreciated. Thanks! Peter Ehlers wrote: > > > dadrivr wrote: >> The variables have the same length, but with different numbers of missing >> values (NA). As a result, the residuals calculations (xres & yres) have >> different lengths, and I cannot compute the correlation between the two >> (error of incompatible dimensions - see example below). Is there a way, >> when calculating residuals, to leave the NAs in the residual calculation >> and >> output? Thanks! >> >> x <- c(1,20,14,NA,9) >> y <- c(5,6,7,9,10) >> z <- c(13,NA,16,14,NA) >> xres <- residuals(lm(x ~ z)) >> yres <- residuals(lm(y ~ z)) >> cor(xres, yres) >> ct <- cor.test(xres, yres) >> ct$estimate >> ct$p.value >> > > Well, your example above just uses two points for the x on z > regression and that gives zero-residuals. Let's use > something a bit more realistic: > > set.seed(123) > x <- rnorm(5) > y <- rnorm(5) > z <- rnorm(5) > x[sample(5,1)] <- NA > z[sample(5,1)] <- NA > fmx <- lm(x ~ z, na.action = na.exclude) > fmy <- lm(y ~ z, na.action = na.exclude) > yres <- resid(fmy) > xres <- resid(fmx) > cor(xres, yres, use = "p") # -0.95978 > ct <- cor.test(xres, yres) > ct$estimate # -0.95978 > ct$p.value # 0.04021994 > > Check out the 'na.exclude' action in lm(): it preserves > the length of the residual vector. Then the 'use=' > argument to cor() uses pairwise complete observations. > cor.test() will do that automatically. > > -Peter Ehlers > >> >> Ista Zahn wrote: >>> 1) Think about what you did wrong. It doesn't make sense to do >>> correlation/regression with variables of different lengths. You can >>> have missing values in one or more variables, if that's what you mean. >>> Just code them NA. >>> >>> 2) Just add in the predictors, e.g. >>> residuals(lm(y ~ z1 + z2)) >>> >>> -Ista >>> >>> On Wed, Nov 11, 2009 at 10:34 PM, dadrivr <dadr...@gmail.com> wrote: >>>> Awesome, that's what I was looking for.  I have two additional >>>> questions: >>>> (1) >>>> What can I do if the variables are of different lengths? (2) How do I >>>> update >>>> the formula if I want to control for more than one variable. >>>> >>>> Let's take the following example: >>>> x <- c(1,20,14,7,9) >>>> y <- c(5,6,7,9,10,11) >>>> z <- c(13,27,16,5,4,17,20) >>>> a <- c(4,6,7,1) >>>> >>>> xres <- residuals(lm(x ~ z)) >>>> yres <- residuals(lm(y ~ z)) >>>> cor(xres, yres) >>>> ct <- cor.test(xres, yres) >>>> ct$estimate >>>> ct$p.value >>>> >>>> How do I update the above formula to: >>>> (1) take into account that the variables are of different lengths?  I >>>> get >>>> an >>>> error when calculating the residuals. >>>> (2) control for z and a (i.e., more than one variable)? >>>> >>>> Thanks so much for your help. >>>> >>>> >>>> Peter Ehlers wrote: >>>>> >>>>> dadrivr wrote: >>>>>> I'm trying to write code to calculate partial correlations (along >>>>>> with >>>>>> p-values).  I'm new to R, and I don't know how to do this.  I have >>>>>> searched >>>>>> and come across different functions, but I haven't been able to get >>>>>> any >>>>>> of >>>>>> them to work (for example, pcor and pcor.test from the ggm package). >>>>>> >>>>>> In the following example, I am trying to compute the correlation >>>>>> between >>>>>> x >>>>>> and y, while controlling for z (partial correlation): >>>>>> >>>>>> x <- c(1,20,14,7,9) >>>>>> y <- c(5,6,7,9,10) >>>>>> z <- c(13,27,16,5,4) >>>>>> >>>>>> What function can I append to this to find this partial correlation? >>>>>> Many >>>>>> thanks! >>>>> I'm not sure what you need, but does this give you what >>>>> you want: >>>>> >>>>> xres <- residuals(lm(x ~ z)) >>>>> yres <- residuals(lm(y ~ z)) >>>>> cor(xres, yres) >>>>> # [1] 0.9778857 >>>>> >>>>> or >>>>> >>>>> ct <- cor.test(xres, yres) >>>>> ct$estimate  # 0.9978857 >>>>> ct$p.value  # 0.003934582 >>>>> >>>>>  -Peter Ehlers >>>>> >>>>>> >>>>> ______________________________________________ >>>>> R-help@r-project.org mailing list >>>>> https://stat.ethz.ch/mailman/listinfo/r-help >>>>> PLEASE do read the posting guide >>>>> http://www.R-project.org/posting-guide.html >>>>> and provide commented, minimal, self-contained, reproducible code. >>>>> >>>>> >>>> -- >>>> View this message in context: >>>> http://old.nabble.com/Partial-correlations-and-p-values-tp26308463p26312873.html >>>> Sent from the R help mailing list archive at Nabble.com. >>>> >>>> ______________________________________________ >>>> R-help@r-project.org mailing list >>>> https://stat.ethz.ch/mailman/listinfo/r-help >>>> PLEASE do read the posting guide >>>> http://www.R-project.org/posting-guide.html >>>> and provide commented, minimal, self-contained, reproducible code. >>>> >>> >>> >>> -- >>> Ista Zahn >>> Graduate student >>> University of Rochester >>> Department of Clinical and Social Psychology >>> http://yourpsyche.org >>> >>> ______________________________________________ >>> R-help@r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide >>> http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >>> >>> >> > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > -- View this message in context: http://n4.nabble.com/Partial-correlations-and-p-values-tp908641p932659.html Sent from the R help mailing list archive at Nabble.com. ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.