[R] creating bins for a plot
Hi. I'm trying to plot the ratio of used versus unused bird houses (coded 1 or 0) versus a continuous environmental gradient (proportion of urban cover [purban2]) that I would like to convert into bins (0 - 0.25, 0.26 - 0.5, 0.51 - 0.75, 0.76 - 1.0) and I'm not having much luck figuring this out. I ran a logistic regression and purban2 ends up driving the probability of a box being occupied so it would be nice to show this relationship. I'm also plotting the fitted values vs. purban2 but that's done. Any suggestions would be appreciated. Many thanks, Jeff Data sample: box use purbank purban2 1 1 0.003813435 0.02684564 2 1 0.04429451 0.1610738 3 1 0.04458785 0.06040268 4 1 0.06072162 0.2080537 5 0 0.6080962 0.6979866 6 1 0.6060428 0.6107383 7 1 0.3807568 0.4362416 8 0 0.3649164 0.3154362 9 0 0.3505427 0.2483221 10 0 0.3476093 0.1409396 11 0 0.3719566 0.3020134 12 1 0.09238011 0.1342282 13 0 0.08616111 0.1073826 14 0 0.07388724 0.04026845 15 1 0.07046477 0.03355705 . . . Jeffrey A. Stratford, Ph.D. Postdoctoral Associate 331 Funchess Hall Department of Biological Sciences Auburn University Auburn, AL 36849 334-329-9198 FAX 334-844-9234 http://www.auburn.edu/~stratja __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] nontabular logistic regression
Hi. I'm attempting to fit a logistic/binomial model so I can determine the influence of landscape on the probability that a box gets used by a bird. I've looked at a few sources (MASS text, Dalgaard, Fox and google) and the examples are almost always based on tabular predictor variables. My data, however are not. I'm not sure if that is the source of the problems or not because the one example that includes a continuous predictor looks to be coded exactly the same way. Looking at the output, I get estimates for each case when I should get a single estimate for purbank. Any suggestions? Many thanks, Jeff THE DATA: (200 boxes total, used [0 if unoccupied, 1 occupied], the rest are landscape variables). box use purbank purban2 purban1 pgrassk pgrass2 pgrass1 grassdist grasspatchk 1 1 0.003813435 0.02684564 0.06896552 0.3282487 0.6845638 0.7586207 0 3.73 2 1 0.04429451 0.1610738 0.1724138 0.1534174 0.3825503 0.6551724 0 1.023261 3 1 0.04458785 0.06040268 0 0.1628043 0.5570470.7586207 0 0.9605769 4 1 0.06072162 0.2080537 0.06896552 0.01936052 0 0 323.10990.2284615 5 0 0.6080962 0.6979866 0.6896552 0.03168084 0.1275168 0.2413793 30 0.2627027 6 1 0.6060428 0.6107383 0.3448276 0.04077442 0.2885906 0.4482759 30 0.2978571 7 1 0.3807568 0.4362416 0.6896552 0.06864183 0.03355705 0 94.868330.468 8 0 0.3649164 0.3154362 0.4137931 0.06277501 0.1275168 0 120 0.4585714 THE CODE: box.use- read.csv(c:\\eabl\\2004\\use_logistic2.csv, header=TRUE) attach(box.use) box.use - na.omit(box.use) use - factor(use, levels=0:1) levels(use) - c(unused, used) glm1 - glm(use ~ purbank, binomial) THE OUTPUT: Coefficients: Estimate Std. Error z value Pr(|z|) (Intercept)-4.544e-16 1.414e+00 -3.21e-161.000 purbank02.157e+01 2.923e+04 0.0010.999 purbank0.001173365 2.157e+01 2.067e+04 0.0010.999 purbank0.001466706 2.157e+01 2.923e+04 0.0010.999 purbank0.001760047 6.429e-16 2.000e+00 3.21e-161.000 purbank0.002346729 2.157e+01 2.923e+04 0.0010.999 purbank0.003813435 2.157e+01 2.923e+04 0.0010.999 purbank0.004106776 2.157e+01 2.067e+04 0.0010.999 purbank0.004693458 2.157e+01 2.067e+04 0.0010.999 Jeffrey A. Stratford, Ph.D. Postdoctoral Associate 331 Funchess Hall Department of Biological Sciences Auburn University Auburn, AL 36849 334-329-9198 FAX 334-844-9234 http://www.auburn.edu/~stratja __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] nontabular logistic regression
Gavin, That worked! I went through and I found a few missing cases where I had . instead of NA - I'm still in SAS mode. Many thanks! Jeffrey A. Stratford, Ph.D. Postdoctoral Associate 331 Funchess Hall Department of Biological Sciences Auburn University Auburn, AL 36849 334-329-9198 FAX 334-844-9234 http://www.auburn.edu/~stratja Gavin Simpson [EMAIL PROTECTED] 10/13/06 11:23 AM On Fri, 2006-10-13 at 09:28 -0500, Jeffrey Stratford wrote: Hi. I'm attempting to fit a logistic/binomial model so I can determine the influence of landscape on the probability that a box gets used by a bird. I've looked at a few sources (MASS text, Dalgaard, Fox and google) and the examples are almost always based on tabular predictor variables. My data, however are not. I'm not sure if that is the source of the problems or not because the one example that includes a continuous predictor looks to be coded exactly the same way. Looking at the output, I get estimates for each case when I should get a single estimate for purbank. Any suggestions? Many thanks, Jeff Hi Jeff, using the snippet of data you provided (copy/paste into a text file and read in with read.table) worked fine: box.use - read.table(~/tmp/tmp.txt, header = TRUE) box.use str(box.use) 'data.frame': 8 obs. of 10 variables: $ box: int 1 2 3 4 5 6 7 8 $ use: int 1 1 1 1 0 1 1 0 $ purbank: num 0.00381 0.04429 0.04459 0.06072 0.60810 ... $ purban2: num 0.0268 0.1611 0.0604 0.2081 0.6980 ... $ purban1: num 0.069 0.172 0.000 0.069 0.690 ... $ pgrassk: num 0.3282 0.1534 0.1628 0.0194 0.0317 ... $ pgrass2: num 0.685 0.383 0.557 0.000 0.128 ... $ pgrass1: num 0.759 0.655 0.759 0.000 0.241 ... $ grassdist : num0 0 0 323 30 ... $ grasspatchk: num 3.730 1.023 0.961 0.228 0.263 ... Now I don't like attach, and you just don't need it so I deviate a little now. Replace box.use$use directly and make use of the data argument in glm. Also, your data didn't have any missing data so I'm not sure whether the response or predictor is missing and whether your na.omit is needed or not - I don't use it below. box.use$use - factor(box.use$use, levels=0:1) levels(box.use$use) - c(unused, used) box.use str(box.use) glm1 - glm(use ~ purbank, data = box.use, family = binomial()) summary(glm1) Call: glm(formula = use ~ purbank, family = binomial(), data = box.use) Deviance Residuals: Min1QMedian3Q Max -1.61450 -0.03098 0.31935 0.45888 1.39194 Coefficients: Estimate Std. Error z value Pr(|z|) (Intercept)3.223 2.225 1.4480.147 purbank -6.129 4.773 -1.2840.199 (Dispersion parameter for binomial family taken to be 1) Null deviance: 8.9974 on 7 degrees of freedom Residual deviance: 6.5741 on 6 degrees of freedom AIC: 10.574 Number of Fisher Scoring iterations: 5 I suspect something got messed up in your reading of the data and R thought purbank was a factor or character. Always check your data after reading in, and str() is a your friend here as printed representations are not always what they seem. HTH G THE DATA: (200 boxes total, used [0 if unoccupied, 1 occupied], the rest are landscape variables). box use purbank purban2 purban1 pgrassk pgrass2 pgrass1 grassdist grasspatchk 1 1 0.003813435 0.02684564 0.06896552 0.3282487 0.6845638 0.7586207 0 3.73 2 1 0.04429451 0.1610738 0.1724138 0.1534174 0.3825503 0.6551724 0 1.023261 3 1 0.04458785 0.06040268 0 0.1628043 0.5570470.7586207 0 0.9605769 4 1 0.06072162 0.2080537 0.06896552 0.01936052 0 0 323.10990.2284615 5 0 0.6080962 0.6979866 0.6896552 0.03168084 0.1275168 0.2413793 30 0.2627027 6 1 0.6060428 0.6107383 0.3448276 0.04077442 0.2885906 0.4482759 30 0.2978571 7 1 0.3807568 0.4362416 0.6896552 0.06864183 0.03355705 0 94.868330.468 8 0 0.3649164 0.3154362 0.4137931 0.06277501 0.1275168 0 120 0.4585714 THE CODE: box.use- read.csv(c:\\eabl\\2004\\use_logistic2.csv, header=TRUE) attach(box.use) box.use - na.omit(box.use) use - factor(use, levels=0:1) levels(use) - c(unused, used) glm1 - glm(use ~ purbank, binomial) THE OUTPUT: Coefficients: Estimate Std. Error z value Pr(|z|) (Intercept)-4.544e-16 1.414e+00 -3.21e-161.000 purbank02.157e+01 2.923e+04 0.0010.999 purbank0.001173365 2.157e+01 2.067e+04 0.0010.999 purbank0.001466706
[R] glm with nesting
I just had a manuscript returned with the biggest problem being the analysis. Instead of using principal components in a regression I've been asked to analyze a few variables separately. So that's what I'm doing. I pulled a feather from young birds and we quantified certain aspects of the color of those feathers. Since I often have more than one sample from a nest, I thought I should use a nested design. Here's the code I've been using and I'd appreciate if someone could look it over and see if it was correct. bb.glm1 - glm(rtot ~ box/(julian +purbank), data=bbmale, family=gaussian, na.action=na.omit) where rtot = total reflectance, box = nest box (i.e., birdhouse), julian = day of the year and purbank = the proportion of urban cover in a 1 km buffer around the nest box. I'm not interested in the box effect and I've seperated males and female chicks. I've asked about nestedness before and I was given code that included | to indicate nestedness but this indicates a grouping does it not? I suspect that there is something wrong. In the summary I get Coefficients: Estimate Std. Error t value Pr(|t|) (Intercept) 2.880e-01 3.224e-03 89.322 2e-16 *** box -3.219e-05 6.792e-05 -0.4740.636 box:julian 7.093e-08 3.971e-07 0.1790.859 box:purbank -1.735e-05 1.502e-04 -0.1150.908 The other question I have is how do I test a null hypothesis - no explanatory variables? [rtot ~ NULL?] Many thanks, Jeff Jeffrey A. Stratford, Ph.D. Postdoctoral Associate 331 Funchess Hall Department of Biological Sciences Auburn University Auburn, AL 36849 334-329-9198 FAX 334-844-9234 http://www.auburn.edu/~stratja __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] glm with nesting
Peter and list, Thanks for the response. A did add box as a factor (box - factor(box)). Julian should be linear - bluebird chicks are bluer as the season progresses from March to August. I did try the following rtot.lme - lmer(rtot ~ sex +(purban|box:chick) + (purban|box), data=bb, na.action=na.omit) # from H. Doran and rtot.lme2 - lme(fixed=rtot ~ sex + purban + sexv:purban, data = bb, random = ~1 |box) # from K. Jones [EMAIL PROTECTED] but these did not work (months ago and I don't remember exactly why) and I have since seperated males and females and added day of the year (julian). But | does indicate grouping not nested, correct? Could someone suggest some coding that might work? Thanks again, Jeff Peter Dalgaard [EMAIL PROTECTED] 10/05/06 7:14 AM Jeffrey Stratford [EMAIL PROTECTED] writes: I just had a manuscript returned with the biggest problem being the analysis. Instead of using principal components in a regression I've been asked to analyze a few variables separately. So that's what I'm doing. I pulled a feather from young birds and we quantified certain aspects of the color of those feathers. Since I often have more than one sample from a nest, I thought I should use a nested design. Notwithstanding comments below, that quote could be aiming for the fortunes package... Here's the code I've been using and I'd appreciate if someone could look it over and see if it was correct. bb.glm1 - glm(rtot ~ box/(julian +purbank), data=bbmale, family=gaussian, na.action=na.omit) where rtot = total reflectance, box = nest box (i.e., birdhouse), julian = day of the year and purbank = the proportion of urban cover in a 1 km buffer around the nest box. I'm not interested in the box effect and I've seperated males and female chicks. I've asked about nestedness before and I was given code that included | to indicate nestedness but this indicates a grouping does it not? I suspect that there is something wrong. In the summary I get Coefficients: Estimate Std. Error t value Pr(|t|) (Intercept) 2.880e-01 3.224e-03 89.322 2e-16 *** box -3.219e-05 6.792e-05 -0.4740.636 box:julian 7.093e-08 3.971e-07 0.1790.859 box:purbank -1.735e-05 1.502e-04 -0.1150.908 Several things look wrong here. Most importantly, you appear to have single-degree of freedom effects (t tests) of things that appear not to be linear effects: Certainly, you have more than two nest boxes, but also day of year as a linear term looks suspicious to me. Unless there is something I have missed completely, box should be a factor variable, and you might also need trigonometric terms for the julian effect (depending on what sort of time spans we are talking about.) Secondly, notation like box/julian suggests that julian only makes sense within a nest box i.e. 1st of March in one box is completely different from 1st of March in another box (the notation is more commonly used to describe bird number within nests and the like). And with purbank presumably constant for measurements from the same box, the box:purbank term looks strange indeed. If you want to take account of a between-box variation in the effect of covariates, you probably need to add them as variance components, but this requires non-glm software, either lme() or lmer(). However, instructing you on those is outside the scope of this mailing list, and you may need to find a local consultant. The other question I have is how do I test a null hypothesis - no explanatory variables? [rtot ~ NULL?] Many thanks, Jeff Jeffrey A. Stratford, Ph.D. Postdoctoral Associate 331 Funchess Hall Department of Biological Sciences Auburn University Auburn, AL 36849 334-329-9198 FAX 334-844-9234 http://www.auburn.edu/~stratja mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- O__ Peter Dalgaard Ă˜ster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] glm with nesting
Harold and list, I've changed a few things since the last time so I'm really starting from scratch. I start with bbmale - read.csv(c:\\eabl\\2004\\feathers\\male_feathers2.csv, header=TRUE) box -factor(box) chick - factor(chick) Here's a sample of the data box,chick,julian,cltchsz,mrtot,cuv,cblue,purbank,purban2,purban1,pgrassk,pgrass2,pgrass1,grassdist,grasspatchk 1,2,141,2,21.72290152,0.305723811,0.327178868,0.003813435,0.02684564,0.06896552,0.3282487,0.6845638,0.7586207,0,3.73 4,1,164,4,18.87699007,0.281863299,0.310935559,0.06072162,0.2080537,0.06896552,0.01936052,0,0,323.1099,0.2284615 4,2,164,4,19.64359348,0.294117388,0.316049817,0.06072162,0.2080537,0.06896552,0.01936052,0,0,323.1099,0.2284615 7,1,118,4,13.48699876,0.303649408,0.31765218,0.3807568,0.4362416,0.6896552,0.06864183,0.03355705,0,94.86833,0.468 12,1,180,4,21.42196378,0.289731361,0.317562294,0.09238011,0.1342282,0,0.2430127,0.8322148,1,0,1.199032 12,2,180,4,18.79487905,0.286052077,0.316367349,0.09238011,0.1342282,0,0.2430127,0.8322148,1,0,1.199032 12,3,180,4,12.83127682,0.260197475,0.292636914,0.09238011,0.1342282,0,0.2430127,0.8322148,1,0,1.199032 15,1,138,4,20.07161467,0.287632782,0.318671887,0.07046477,0.03355705,0.03448276,0.2755622,0.6577181,0.8275862,0,1.503818 15,2,138,4,17.61146256,0.305581768,0.315848051,0.07046477,0.03355705,0.03448276,0.2755622,0.6577181,0.8275862,0,1.503818 15,3,138,4,20.36397134,0.271795667,0.30539683,0.07046477,0.03355705,0.03448276,0.2755622,0.6577181,0.8275862,0,1.503818 15,4,138,4,20.81940158,0.269468041,0.304160648,0.07046477,0.03355705,0.03448276,0.2755622,0.6577181,0.8275862,0,1.503818 As you can see I have multiple boxes ( 70). Sometimes I have multiple chicks per box each having their own response to mrtot, cuv, and cblue but the same landscape variables for that box. Chick number is randomly assigned and is not an effect I'm interested in. I'm really not interested in the box effect either. I would like to know if landscape affects the color of chicks (which may be tied into chick health/physiology). We also know that chicks get bluer as the season progresses and that clutch size (cltchsz) has an effect so I'm including that as covariates. Hopefully, this clears things up a bit. I do have the MASS and MEMS (Pineiro's) texts in hand. Many thanks, Jeff __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] 3-dimensional table
Hi, Last week my class conducted an experiment by putting out clay caterpillars to look at the effects of urbanization, color, and location on caterpillar predation. There were two sites (urban, rural), three colors (green, yellow, red) and two locations at each site (edge, interior). The entire data set is below. I've checked out the MASS book, Dalgaard's book, and the R-help archives and I haven't found anything that suggests how to set up a spreadsheet for the xtab function (say, xtab(predation ~ location + site + color, data=class). It would not be a problem to input the data by hand but I wouldn't know how to set that up either. Any suggestions would be greatly appreciated. The class is mostly college sophmores and juniors and biology and education majors. We are using R 2.2.1 on Windows XP. Again, many thanks, Jeff sitelocationcolor predated Urban EdgeGreen 5 Urban EdgeRed 30 Urban EdgeYellow 11 Urban InteriorGreen 11 Urban InteriorRed 22 Urban InteriorYellow 22 Rural EdgeGreen 94 Rural EdgeRed 40 Rural EdgeYellow 67 Rural InteriorGreen 40 Rural InteriorRed 70 Rural InteriorYellow 33 Jeffrey A. Stratford, Ph.D. Postdoctoral Associate 331 Funchess Hall Department of Biological Sciences Auburn University Auburn, AL 36849 334-329-9198 FAX 334-844-9234 http://www.auburn.edu/~stratja __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] nested ANCOVA: still confused
Harold, Kingsford, and R-users, I settled on using the lmer function. I think the memory issue was more a function of my poor coding than an actual memory problem. I also switched the label from box to clutch to avoid any potential confusion with other functions. This coding seems to have worked: eabl - lmer(rtot ~ sexv + (purban2|clutch), maxIter=1000, data=bb, na.action=na.omit) However, I have two remaining questions: (1)how concerned should I be with the warning message below and (2) is there a way to invoke output to get an estimate of the effect of purban2 (the proportion of urban cover 200 m around a box) on feather color (rtot) and if there is a difference between the sexes? I used the summary function and it doesn't tell me much (see output below). I'll read up mixed models when Pinheiro arrives but any suggestions for diagnostics? I'm going to repeat this study and expand it by doubling or tripling the number of birds. Warning message: nlminb returned message false convergence (8) in: LMEoptimize-(`*tmp*`, value = list(maxIter = 200, tolerance = 1.49011611938477e-08, summary(eabl) Linear mixed-effects model fit by REML Formula: rtot ~ sexv + (purban2 | clutch) Data: bb AIC BIClogLik MLdeviance REMLdeviance 5164.284 6997.864 -2052.142 4128.792 4104.284 Random effects: Groups Name Variance Std.Dev. Corr clutch (Intercept) 502829 709.10 purban20 1341990 1158.44 -0.477 purban20.006711409 5683957 2384.10 -0.226 0.082 purban20.013422821772922 1331.51 -0.386 0.176 0.067 . . . . . # of obs: 235, groups: clutch, 74 Fixed effects: Estimate Std. Error t value (Intercept) 5950.01 241.59 24.628 sexvm1509.07 145.73 10.355 Correlation of Fixed Effects: (Intr) sexvm -0.304 Thanks many time over, Jeff Jeffrey A. Stratford, Ph.D. Postdoctoral Associate 331 Funchess Hall Department of Biological Sciences Auburn University Auburn, AL 36849 334-329-9198 FAX 334-844-9234 http://www.auburn.edu/~stratja Doran, Harold [EMAIL PROTECTED] 01/25/06 6:37 AM OK, we're getting somewhere. First, it looks as though (by the error message) that you have a big dataset. My first recommendation is to use lmer instead of lme, you will see a significant benefit in terms of computional speed. For the model this would be lmer(rtot ~ sexv +(purban|box:chick) + (purban|box), bb, na.action=na.omit) Now, you have run out of memory. I don't know what operating system you are using, so go and see the appropriate FAQ for increasing memory for your OS. Second, I made a mistake in my reply. Your random statement should be random=~purban|box/chick denoting that chicks are nested in boxes, not boxes nested in chicks, sorry about that. Now, why is it that each chick within box has the same value for purban? If this is so, why are you fitting that as a random effect? It seems not to vary across individual chicks, right? It seems there is only an effect of box and not an effect for chicks. Why not just fit a random effect only for box such as: rtot.lme - lme(fixed=rtot~sexv, random=~purban2|box, na.action=na.omit,bb) or in lmer lmer(rtot ~ sexv + (purban|box), bb, na.action=na.omit) Harold -Original Message- From: Jeffrey Stratford [mailto:[EMAIL PROTECTED] Sent: Tue 1/24/2006 8:57 PM To: Doran, Harold; r-help@stat.math.ethz.ch Cc: Subject:RE: [R] nested ANCOVA: still confused R-users and Harold. First, thanks for the advice; I'm almost there. The code I'm using now is library(nlme) bb - read.csv(E:\\eabl_feather04.csv, header=TRUE) bb$sexv - factor(bb$sexv) rtot.lme - lme(fixed=rtot~sexv, random=~purban2|chick/box, na.action=na.omit, data=bb) A sample of the data looks like this box chick rtotpurban2 sexv 1 1 6333.51 0.026846f 1 2 8710.8840.026846m 2 1 5810.0070.161074f 2 2 5524.33 0.161074f 2 3 4824.4740.161074f 2 4 5617.6410.161074f 2 5 6761.7240.161074f 4 1 7569.6730.208054m 4 2 7877.0810.208054m 4 4 7455.55 0.208054f 7 1 5408.2870.436242m 10 1 6991.7270.14094 f 12 1 8590.2070.134228f 12 2 7536.747
[R] nested ANCOVA: still confused
Dear R-users, I did some more research and I'm still not sure how to set up an ANCOVA with nestedness. Specifically I'm not sure how to express chicks nested within boxes. I will be getting Pinheiro Bates (Mixed Effects Models in S and S-Plus) but it will not arrive for another two weeks from our interlibrary loan. The goal is to determine if there are urbanization (purban) effects on chick health (rtot) and if there are differences between sexes (sex) and the effect of being in the same clutch (box). The model is rtot = sex + purban + (chick)box. I've loaded the package lme4. And the code I have so far is bb - read.csv(C:\\eabl\\eabl_feather04.csv, header=TRUE) bb$sex - factor(bb$sex) rtot.lme - lme(bb$rtot~bb$sex, bb$purban|bb$chick/bb$box, na.action=na.omit) but this is not working. Any suggestions would be greatly appreciated. Thanks, Jeff Jeffrey A. Stratford, Ph.D. Postdoctoral Associate 331 Funchess Hall Department of Biological Sciences Auburn University Auburn, AL 36849 334-329-9198 FAX 334-844-9234 http://www.auburn.edu/~stratja __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] nested ANCOVA: still confused
R-users and Harold. First, thanks for the advice; I'm almost there. The code I'm using now is library(nlme) bb - read.csv(E:\\eabl_feather04.csv, header=TRUE) bb$sexv - factor(bb$sexv) rtot.lme - lme(fixed=rtot~sexv, random=~purban2|chick/box, na.action=na.omit, data=bb) A sample of the data looks like this box chick rtotpurban2 sexv 1 1 6333.51 0.026846f 1 2 8710.8840.026846m 2 1 5810.0070.161074f 2 2 5524.33 0.161074f 2 3 4824.4740.161074f 2 4 5617.6410.161074f 2 5 6761.7240.161074f 4 1 7569.6730.208054m 4 2 7877.0810.208054m 4 4 7455.55 0.208054f 7 1 5408.2870.436242m 10 1 6991.7270.14094 f 12 1 8590.2070.134228f 12 2 7536.7470.134228m 12 3 5145.3420.134228m 12 4 6853.6280.134228f 15 1 8048.7170.033557m 15 2 7062.1960.033557m 15 3 8165.9530.033557m 15 4 8348.58 0.033557m 16 2 6534.7750.751678m 16 3 7468.8270.751678m 16 4 5907.3380.751678f 21 1 7761.9830.221477m 21 2 6634.1150.221477m 21 3 6982.9230.221477m 21 4 7464.0750.221477m 22 1 6756.7330.281879f 23 2 8231.4960.134228m The error I'm getting is Error in logLik.lmeStructInt(lmeSt, lmePars) : Calloc could not allocate (590465568 of 8) memory In addition: Warning messages: 1: Fewer observations than random effects in all level 2 groups in: lme.formula(fixed = rtot ~ sexv, random = ~purban2 | chick/box, 2: Reached total allocation of 382Mb: see help(memory.size) There's nothing special about chick 1, 2, etc. These were simply the order of the birds measured in each box so chick 1 in box 1 has nothing to do with chick 1 in box 2. Many thanks, Jeff Jeffrey A. Stratford, Ph.D. Postdoctoral Associate 331 Funchess Hall Department of Biological Sciences Auburn University Auburn, AL 36849 334-329-9198 FAX 334-844-9234 http://www.auburn.edu/~stratja Doran, Harold [EMAIL PROTECTED] 01/24/06 2:04 PM Dear Jeff: I see the issues in your code and have provided what I think will solve your problem. It is often much easier to get help on this list when you provide a small bit of data that can be replicated and you state what the error messages are that you are receiving. OK, with that said, here is what I see. First, you do not need to use the syntax bb$sex in your model, this can be significantly simplified. Second, you do not have a random statement in your model. Here is your original model: lme(bb$rtot~bb$sex, bb$purban|bb$chick/bb$box, na.action=na.omit) Here is what it should be: lme(fixed = rtot~sex, random=~purban|chick/box, na.action=na.omit, data=bb) Notice there is a fixed and random call. You can simplify this as lme(rtot~sex, random=~purban|chick/box, na.action=na.omit, bb) Note, you can eliminate the fixed= portion but not the random statement. Last, if you want to do this in lmer, the newer function for mixed models in the Matrix package, you would do lmer(rtot~sex + (purban|box:chick) + (purban|box), na.action=na.omit, data=bb) Hope this helps. Harold -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jeffrey Stratford Sent: Tuesday, January 24, 2006 11:34 AM To: r-help@stat.math.ethz.ch Subject: [R] nested ANCOVA: still confused Dear R-users, I did some more research and I'm still not sure how to set up an ANCOVA with nestedness. Specifically I'm not sure how to express chicks nested within boxes. I will be getting Pinheiro Bates (Mixed Effects Models in S and S-Plus) but it will not arrive for another two weeks from our interlibrary loan. The goal is to determine if there are urbanization (purban) effects on chick health (rtot) and if there are differences between sexes (sex) and the effect of being in the same clutch (box). The model is rtot = sex + purban + (chick)box. I've loaded the package lme4. And the code I have so far is bb - read.csv(C:\\eabl\\eabl_feather04.csv, header=TRUE) bb$sex - factor(bb$sex) rtot.lme - lme(bb$rtot~bb$sex, bb$purban|bb$chick/bb$box, na.action=na.omit) but this is not working. Any suggestions would be greatly appreciated. Thanks, Jeff Jeffrey A. Stratford, Ph.D. Postdoctoral Associate 331 Funchess Hall Department of Biological Sciences Auburn
[R] regression with nestedness
Dear R-users, I set up an experiment where I put up bluebird boxes across an urbanization gradient. I monitored these boxes and at some point I pulled a feather from a chick and a friend used spectral properties (rtot, a continuous var) to index chick health. There is an effect of sex that I would like to include but how would I set up a regression and look at the effect of urbanization (purban, a continuous var)) on feather properties of chicks within boxes. So the model should look something like rtot = sex + purban + (chick)clutch Also, when I plot purban against rtot using the plot function I get boxplots but I would like to ignore the clutch and just plot each point. I've tried type = p but this has no effect. Thanks, Jeff Jeffrey A. Stratford, Ph.D. Postdoctoral Associate 331 Funchess Hall Department of Biological Sciences Auburn University Auburn, AL 36849 334-329-9198 FAX 334-844-9234 http://www.auburn.edu/~stratja __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] cv.glm help: no responses?
I sent out a question a few days ago asking help with cv.glm and I didn't get any responses. Is it my question? I'd appreciate any response just to see my post makes it out there. What I'm looking to do is to get a column of predicted responses from cv.glm (leave-one-out). This produces a single predicted value for each case, correct? How can I get those values? I don't want predicted values using a full model but just the values that come from the cross validation. Thanks, Jeff Here's the code I used for the model and to get delta: q_uk.nb - glm.nb(nat_est ~ UK + I(UK^2), SRCOUNT, link=log) cv.err1 - cv.glm(SRCOUNT,q_uk.nb) Jeffrey A. Stratford, Ph.D. Postdoctoral Associate 331 Funchess Hall Department of Biological Sciences Auburn University Auburn, AL 36849 334-329-9198 FAX 334-844-9234 http://www.auburn.edu/~stratja __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] predicted values from cv.glm
Hi folks, Is there a way to get the predicted values from leave-one-out cross validation using cv.glm? More generally, is there a way to see what output is available with any function that may not show up using the help() function? Below is the code that I've been using: SRCOUNT - read.table(file.choose(),header=T) library(boot) library(MASS) q_u2 - glm.nb(res_est ~ U2 + I(U2^2), SRCOUNT, link = log) cv.qu2- cv.glm(SRCOUNT,qu2) Many thanks, Jeff Jeffrey A. Stratford, Ph.D. Postdoctoral Associate 331 Funchess Hall Department of Biological Sciences Auburn University Auburn, AL 36849 334-329-9198 FAX 334-844-9234 http://www.auburn.edu/~stratja __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] predicted values from cv.glm
Hi. Is there a way to get the values predicted from (leave-one-out) cv.glm? It seems like a useful diagnostic to plot observed vs. predicted values. Thanks, Jeff Jeffrey A. Stratford, Ph.D. Postdoctoral Associate 331 Funchess Hall Department of Biological Sciences Auburn University Auburn, AL 36849 334-329-9198 FAX 334-844-9234 http://www.auburn.edu/~stratja __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] cv.glm help
I would appreciate some help writing R code to plot predicted species richness vs. observed species richness (nat_est) with 95% CI lines. I'm using glm to get model coefficients and remove-one cross validation to get predicted (cv.glm). Here is what I have for the code so far: SRCOUNT - read.table(file.choose(),header=T) library(boot) library(MASS) quk.native - glm.nb(nat_est ~ UK + I(UK^2), SRCOUNT, link=log) cv.quk - cv.glm(SRCOUNT, quk) Many thanks, Jeff Jeffrey A. Stratford, Ph.D. Postdoctoral Associate 331 Funchess Hall Department of Biological Sciences Auburn University Auburn, AL 36849 334-329-9198 FAX 334-844-9234 http://www.auburn.edu/~stratja __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] glm x^2
R-users, I'm having some trouble getting .glm and glm.nb to run a polynomial. I've used x*x and x^2 and neither works. I've checked out the archives and they refer to an archive that's no longer working. I've seen that they use poly() but I'm following up my analysis with cv.glm so I'd prefer to keep using glm. It's easier to just add a column to my data but I'd rather code it. Thanks for the response... I appreciate the people that work on the list. Jeff Jeffrey A. Stratford, Ph.D. Postdoctoral Associate 331 Funchess Hall Department of Biological Sciences Auburn University Auburn, AL 36849 334-329-9198 FAX 334-844-9234 http://www.auburn.edu/~stratja __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Poisson/negbin followed by jackknife
Folks, Thanks for the help with the hier.part analysis. All the problems stemmed from an import problem which was solved with file.chose(). Now that I have the variables that I'd like to use I need to run some GLM models. I think I have that part under control but I'd like to use a jackknife approach to model validation (I was using a hold out sample but this seems to have fallen out of favor). I'd appreciate it if someone could just point me in the right direction for the jackkife analysis given a particular distribution, coefficients, etc. Thanks, Jeff Jeffrey A. Stratford, Ph.D. Postdoctoral Associate 331 Funchess Hall Department of Biological Sciences Auburn University Auburn, AL 36849 334-329-9198 FAX 334-844-9234 http://www.auburn.edu/~stratja __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] index question
Thanks for those on the list that answered my previous question. I'm just about where I need to be (looking at output). In the hier.part documentation there is a line env - urbanwq[,2:8]. This means use rows 2 through 8 in the data frame urbanwq, right? What does the comma represent? If one wasn't using column headers would this be necessary? Thanks, Jeff Jeffrey A. Stratford, Ph.D. Postdoctoral Associate 331 Funchess Hall Department of Biological Sciences Auburn University Auburn, AL 36849 334-329-9198 FAX 334-844-9234 http://www.auburn.edu/~stratja __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] help with hier.part
R-users, Attached is the file (SR_use2.txt) I'd like to include and includes column headers. nat_est is the response variable and is the number of species at a particular point. The other variables are the explanatory vars (vark, var2, var1, UK, U2, U1, GK, G2, G1, PK, P2, P1). Here is Walsh's sample code for hier.part: data(urbanwq) env - urbanwq[,2,8] hier.part(urbanwq$lec, env, fam=gaussian, gof=Rssqu) The code I wrote is library(hier.part) SRUSE- read.table(F:\\GEORGIA\\species_richness\\SR_use2.txt, sep= , header = TRUE, row.names = 1) TEMP- SRUSE[2:13] hier.part(SRUSE$nat_est,TEMP, family=NegBin, gof=logLik, barplot= TRUE) So far this doesn't work and I'd really appreciate some help. While I have your ears, what books would one make for the clueless? Many thanks, Jeff PS. nat_est is the estimated number of species (species richness). Around each of the sampling points I calculated the % of different types of cover (pine, hardwoods, number of different covers) in three scales around the sampling points (1000, 200, and 100 m). What I'm hoping to do with the analysis is to find the best scales and parameters that best predicts species richness. . Jeffrey A. Stratford, Ph.D. Postdoctoral Associate 331 Funchess Hall Department of Biological Sciences Auburn University Auburn, AL 36849 334-329-9198 FAX 334-844-9234 http://www.auburn.edu/~stratja __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Import help (neophyte)
Hi, I have no experience with R and I'm finding the manuals a bit obtuse and written as if I already understood R. I'm trying to import a csv file from a floppy and it's not working. The code I'm using is read.table(F:\GEORGIA\species_richness\SR_use.csv, sep=,, header = TRUE, row.names = 1) I'm assuming that this command is case sensitive so everything matches. Is there something I'm missing? Thanks, Jeff Jeffrey A. Stratford, Ph.D. Postdoctoral Associate 331 Funchess Hall Department of Biological Sciences Auburn University Auburn, AL 36849 334-329-9198 FAX 334-844-9234 http://www.auburn.edu/~stratja __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html