Re: [R] using ddply with segmented regression
Hello, the code provided by arun did the trick. Thank you very much, arun. However, I'm now unsure of how to further process the results. Looking at the plyr vignette (a.k.a. split-apply-combine), it appears that I could now create a data frame from the list of results, and then run the results through the function plot.segmented to view the piecewise regressions by the grouping variable Lot.Run. However, the list is not in the structure expected by ldply --

SP.seg <- dlply((df, .(Lot.Run), segmentf_df)
SP.out <- ldply(SP.seg)
Error: Results must be all atomic, or all data frames

class(SP.seg)[[1]]
[1] "list"

head(SP.seg)
$`J062431-1`
Call: segmented.lm(obj = out.lm, seg.Z = ~Cycle, psi = list(Cycle = NA),
    control = seg.control(stop.if.error = FALSE, n.boot = 0, gap = FALSE,
    jt = FALSE, nonParam = TRUE))

Meaningful coefficients of the linear terms:
(Intercept)      Cycle   U1.Cycle   U2.Cycle   U3.Cycle   U4.Cycle   U5.Cycle   U6.Cycle
   40.11786   -0.06664   -0.68539    0.49316    0.14955    0.03612    0.22257   -0.41166
   U7.Cycle   U8.Cycle   U9.Cycle  U10.Cycle
   -0.48365    0.37949    0.24945    0.06712

Estimated Break-Point(s):
 psi1.Cycle  psi2.Cycle  psi3.Cycle  psi4.Cycle  psi5.Cycle  psi6.Cycle  psi7.Cycle
      19.67       34.31       51.02       72.10       97.94      117.20      130.10
 psi8.Cycle  psi9.Cycle psi10.Cycle
     147.10      155.70      160.40

$`J062431-2`
Call: segmented.lm(obj = out.lm, seg.Z = ~Cycle, psi = list(Cycle = NA),
    control = seg.control(stop.if.error = FALSE, n.boot = 0, gap = FALSE,
    jt = FALSE, nonParam = TRUE))

Meaningful coefficients of the linear terms:
(Intercept)      Cycle   U1.Cycle   U2.Cycle   U3.Cycle   U4.Cycle   U5.Cycle   U6.Cycle
   40.11786   -0.06664   -0.68539    0.49316    0.14955    0.03612    0.22257   -0.41166
   U7.Cycle   U8.Cycle   U9.Cycle  U10.Cycle
   -0.48365    0.37949    0.24945    0.06712

Estimated Break-Point(s):
 psi1.Cycle  psi2.Cycle  psi3.Cycle  psi4.Cycle  psi5.Cycle  psi6.Cycle  psi7.Cycle
      19.67       34.31       51.02       72.10       97.94      117.20      130.10
 psi8.Cycle  psi9.Cycle psi10.Cycle
     147.10      155.70      160.40

My hope was to eventually increase my understanding enough to create lattice plots using 'segment.plot' via ldply. Will that even work with the output object from this segmented package?

Thanks,
Paul

Paul Prew | Statistician
651-795-5942 | fax 651-204-7504
Ecolab Research Center | Mail Stop ESC-F4412-A
655 Lone Oak Drive | Eagan, MN 55121-1560

-----Original Message-----
From: arun [mailto:smartpink...@yahoo.com]
Sent: Saturday, October 12, 2013 1:42 AM
To: R help
Cc: Prew, Paul
Subject: Re: [R] using ddply with segmented regression

Hi,
Try:

segmentf_df <- function(df) {
  out.lm <- lm(deltaWgt ~ Cycle, data = df)
  segmented(out.lm, seg.Z = ~Cycle, psi = list(Cycle = NA),
            control = seg.control(stop.if.error = FALSE, n.boot = 0))
}
library(plyr)
library(segmented)
dlply(df, .(Lot.Run), segmentf_df)

$`J062431-1`
Call: segmented.lm(obj = out.lm, seg.Z = ~Cycle, psi = list(Cycle = NA),
    control = seg.control(stop.if.error = FALSE, n.boot = 0))

Meaningful coefficients of the linear terms:
(Intercept)    Cycle  U1.Cycle  U2.Cycle
     38.480    1.130    -2.760     1.497

Estimated Break-Point(s):
psi1.Cycle psi2.Cycle
     3.732      5.056

$`J062431-2`
Call: segmented.lm(obj = out.lm, seg.Z = ~Cycle, psi = list(Cycle = NA),
    control = seg.control(stop.if.error = FALSE, n.boot = 0))

Meaningful coefficients of the linear terms:
(Intercept)    Cycle  U1.Cycle  U2.Cycle
    48.4300  -3.2500    3.0905   -0.6555

Estimated Break-Point(s):
psi1.Cycle psi2.Cycle
      2.12      22.15

attr(,"split_type")
[1] "data.frame"
attr(,"split_labels")
    Lot.Run
1 J062431-1
2 J062431-2

# or
dlply(df, .(Lot.Run), function(x) segmentf_df(x))
# or
lapply(split(df, df$Lot.Run, drop = TRUE), function(x) segmentf_df(x))

A.K.

On Friday, October 11, 2013 11:16 PM, Prew, Paul paul.p...@ecolab.com wrote:
Hello, I'm unsuccessfully trying to apply piecewise linear regression over each of 22 groups. The data structure of the reproducible toy dataset is below. I'm using the 'segmented' package; it worked fine with a data set that contained only one group ("Lot.Run").

$ Cycle   : int 1 2 3 4 5 6 7 8 9 10 ...
$ Lot.Run : Factor w/ 22 levels "J062431-1","J062431-2",..: 1 1 1 1 1 1 1 1 1 1 ...
$ deltaWgt: num 38.7 42.6 41 42.3 40.6 ...
I am new to 'segmented', and also new to 'plyr', which is how I'm trying to apply this segmented regression to the 22 Lot.Run groups. Within a Lot.Run, the piecewise linear regressions are deltaWgt vs. Cycle.

# define the linear regression #
out.lm <- lm(deltaWgt ~ Cycle, data = Test50.df)

# define the function called by dlply #
# find cutpoints via bootstrapping, fit the piecewise regressions #
segmentf_df <- function(df) {
  segmented(out.lm, seg.Z = ~Cycle, psi = list(Cycle = NA),
            control = seg.control(stop.if.error = FALSE
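[Editor's note, appended for readers of the archive.] Since ldply() refuses a plain list of model objects ("Results must be all atomic, or all data frames"), one way forward is to convert each segmented fit to a small data frame of breakpoints before combining. A minimal sketch, untested against this dataset: it assumes `SP.seg` was built with arun's `segmentf_df` above, and that each fit carries a `psi` matrix of breakpoint estimates (initial value, estimate, standard error), as described in the segmented package documentation.

```r
library(plyr)
library(segmented)

# SP.seg: the list of segmented fits, one per Lot.Run
SP.seg <- dlply(df, .(Lot.Run), segmentf_df)

# Pull the psi matrix out of each fit and stack the rows; the .id column
# that ldply adds recovers the Lot.Run grouping label.
SP.out <- ldply(SP.seg, function(fit) {
  psi <- fit$psi                 # columns: initial value, estimate, std. error
  data.frame(breakpoint = seq_len(nrow(psi)),
             estimate   = psi[, 2],
             std.error  = psi[, 3])
})
head(SP.out)   # one row per (Lot.Run, breakpoint) combination
```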
Re: [R] using ddply with segmented regression
Yes, David, sorry for the confusion. I forgot arun had proposed a genericized solution, where the dataframe name was shortened from Test50.df to df. I could have replaced arun's 'df' references with 'Test50.df', but lazily changed my Test50.df to df and used arun's code verbatim. The unmatched parenthesis you caught (cut-and-paste error) is fixed here:

SP.seg <- dlply(df, .(Lot.Run), segmentf_df)
SP.out <- ldply(SP.seg)

As for your (excerpted) question --

*So what function were you intending to be used in that call to ldply ... If you are using ldply to process the models with plot.segmented, .. object returned will be an empty dataframe but the plots will be done.*

I wanted to look at the data frame first so I could understand the model output represented in row x column format. Then I planned to try

ldply(SP.seg, segmented.plot)

If the above printed out the 22 piecewise regressions individually, then the next step would be determining how they could be arranged in a 1-page lattice. I think the chemist who generated the results would profit from that. The 1-pager would require me to investigate lattice or ggplot2.

Paul Prew | Statistician
651-795-5942 | fax 651-204-7504
Ecolab Research Center | Mail Stop ESC-F4412-A
655 Lone Oak Drive | Eagan, MN 55121-1560

-----Original Message-----
From: David Winsemius [mailto:dwinsem...@comcast.net]
Sent: Monday, October 14, 2013 5:30 PM
To: Prew, Paul
Cc: arun; R help
Subject: Re: [R] using ddply with segmented regression

On Oct 14, 2013, at 2:57 PM, Prew, Paul wrote:

Hello, the code provided by arun did the trick. Thank you very much, arun. However, I'm now unsure of how to further process the results. Looking at the plyr vignette (a.k.a. split-apply-combine), it appears that I could now create a data frame from the list of results, and then run the results through the function plot.segmented to view the piecewise regressions by the grouping variable Lot.Run.
However, the list is not in the structure expected by ldply --

SP.seg <- dlply((df, .(Lot.Run), segmentf_df)

That wasn't the name of the dataframe you offered in the first post, and this code could not possibly have run without throwing an error, since there are unmatched parens.

SP.out <- ldply(SP.seg)

So what function were you intending to be used in that call to ldply ... after you fix the errors above? If you are using ldply to process the models with plot.segmented, then realize that the object returned will be an empty dataframe but the plots will be done.

Error: Results must be all atomic, or all data frames

class(SP.seg)[[1]]
[1] "list"

head(SP.seg)
$`J062431-1`
[... quoted segmented output snipped; identical to the output in the message above ...]

$`J062431-2`
[... quoted segmented output snipped; identical to the output in the message above ...]

My hope was to eventually increase my understanding enough to create
lattice plots using 'segment.plot' via ldply. Will that even work with the output object from this segmented package?

Hard to tell. You seem to be changing the names of your objects at random.

Thanks,
Paul

Paul Prew | Statistician
651-795-5942 | fax 651-204-7504
Ecolab Research Center | Mail Stop ESC-F4412-A
655 Lone Oak Drive | Eagan, MN 55121-1560

-----Original Message-----
From: arun [mailto:smartpink...@yahoo.com]
Sent: Saturday, October 12, 2013 1:42 AM
To: R help
Cc: Prew, Paul
Subject: Re: [R] using ddply with segmented regression

Hi,
Try:

segmentf_df <- function(df) {
  out.lm <- lm(deltaWgt ~ Cycle, data = df)
  segmented
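[Editor's note, appended for readers of the archive.] For the one-page display of the 22 fits, note that plot.segmented() draws to the current graphics device rather than returning a data frame, so ldply() is the wrong tool; a loop over the list is simpler. A base-graphics sketch, untested here, assuming `SP.seg` is the list of segmented fits returned by dlply:

```r
library(plyr)

# Arrange a grid big enough for 22 panels, then plot each fit into it.
op <- par(mfrow = c(5, 5), mar = c(2, 2, 2, 1))
l_ply(names(SP.seg), function(nm) {
  plot(SP.seg[[nm]], main = nm)   # dispatches to plot.segmented per group
})
par(op)
```

l_ply() is plyr's looping variant that discards return values, which suits a plotting side effect like this one.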
[R] using ddply with segmented regression
Hello, I'm unsuccessfully trying to apply piecewise linear regression over each of 22 groups. The data structure of the reproducible toy dataset is below. I'm using the 'segmented' package; it worked fine with a data set that contained only one group ("Lot.Run").

$ Cycle   : int 1 2 3 4 5 6 7 8 9 10 ...
$ Lot.Run : Factor w/ 22 levels "J062431-1","J062431-2",..: 1 1 1 1 1 1 1 1 1 1 ...
$ deltaWgt: num 38.7 42.6 41 42.3 40.6 ...

I am new to 'segmented', and also new to 'plyr', which is how I'm trying to apply this segmented regression to the 22 Lot.Run groups. Within a Lot.Run, the piecewise linear regressions are deltaWgt vs. Cycle.

# define the linear regression #
out.lm <- lm(deltaWgt ~ Cycle, data = Test50.df)

# define the function called by dlply #
# find cutpoints via bootstrapping, fit the piecewise regressions #
segmentf_df <- function(df) {
  segmented(out.lm, seg.Z = ~Cycle, psi = list(Cycle = NA),
            control = seg.control(stop.if.error = FALSE, n.boot = 0)), data = df) }

at this point, there's an error message:
ERROR: text

# repeat for each Lot.Run group #
dlply(Test50.df, .(Lot.Run), segmentf_df)

at this point, there's an error message:
ERROR: object 'segmentf_df' not found

Any suggestions?
Thanks, Paul

dput(Test50.df)
structure(list(Cycle = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L,
14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 1L, 2L, 3L, 4L, 5L,
6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L,
23L, 24L, 25L), Lot.Run = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L),
.Label = c("J062431-1", "J062431-2", "J062431-3", "J062432-1", "J062432-2",
"J062433-1", "J062433-2", "J062433-3", "Lot 1-1", "Lot 1-2", "Lot 2-1",
"Lot 2-2", "Lot 2-3", "Lot 3-1", "Lot 3-2", "Lot 3-3", "P041231-1", "P041231-2",
"P041531-1", "P041531-2", "P041531-3", "P041531-4"), class = "factor"),
deltaWgt = c(38.69, 42.58, 40.95, 42.26, 40.63, 41.61, 36.73, 41.28, 39.98,
40.63, 39.66, 39.98, 40.95, 38.36, 39.01, 39, 38.03, 39.66, 37.7, 39.66, 40.63,
38.03, 37.71, 36.73, 37.7, 45.18, 41.93, 42.59, 39.98, 40.95, 42.91, 38.03,
40.96, 39, 41.61, 39.33, 43.88, 39.98, 38.68, 38.68, 36.08, 39.99, 38.35, 40.31,
40.63, 38.68, 37.05, 38.36, 35.43, 36.73)), .Names = c("Cycle", "Lot.Run",
"deltaWgt"), row.names = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L,
13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 207L, 208L,
209L, 210L, 211L, 212L, 213L, 214L, 215L, 216L, 217L, 218L, 219L, 220L, 221L,
222L, 223L, 224L, 225L, 226L, 227L, 228L, 229L, 230L, 231L), class = "data.frame")

Paul Prew | Statistician
651-795-5942 | fax 651-204-7504
Ecolab Research Center | Mail Stop ESC-F4412-A
655 Lone Oak Drive | Eagan, MN 55121-1560

CONFIDENTIALITY NOTICE: This e-mail communication and any attachments may contain proprietary and privileged information for the use of the designated recipients named above. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message.
[[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] assign index to colnames(matrix)
Hello, I'm trying to follow the syntax of a script from a journal website. In order to create a regression formula used later in the script, the regression matrix must have column names "X1", "X2", etc. I have tried to assign these column names to my matrix ScoutRSM.mat using a for loop, but I don't know how to interpret the error message. Suggestions? Thanks, Paul

====== instructions about dimnames(X)[[2]] = X1,X2,... ======
function(binstr) {
  # makes formula for regression
  # must make sure dimnames(X)[[2]] = X1,X2,...
  ind <- which(binstr == 1)
  rht <- paste("X", ind, sep = "", collapse = "+")
  ans <- paste("y ~ ", rht)
  return(as.formula(ans))
}

###################
## my for loop   ##
###################
for (i in 1:dim(ScoutRSM.mat)[2] {
  colnames(ScoutRSM.mat)[i] <- paste("X", i, sep = ""))
}

for(i in 1:dim(ScoutRSM.mat)[2]) {
+   colnames(ScoutRSM.mat)[i] <- paste("X", i, sep = "")
+ }
Error in `colnames<-`(`*tmp*`, value = "X1") :
  length of 'dimnames' [2] not equal to array extent

~~~~ ScoutRSM.mat ~~~~
dput(ScoutRSM.mat)
structure(c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, -1, 1, -1, 0, -1, 0, 0, 0, 1, 1, 0, 1, -1, 1, -1, 1, 0, -1, -0.464, -0.516, -0.48, 1, -0.008, 0.486, 0, 0.524, -0.96, -1, -0.526, 1, 0.47, 0.49, 0, 0.53, -0.866, -0.6, 0.227, -1, -0.895, 1.289, 0.181, 0.204, -1, 0.199, -0.908, -1, -1, 1.289, 1.289, 0.204, -1, 0.213, -0.922, -1, -0.84, -1, 0.2, -1, -1, 0.05, -1, 1.136, -0.861, -0.861, -1, 1.136, 0.193, 0.047, -1, -1, -0.733, -1, -0.817, 0.14, -1, 1, 0.127, -0.05, 1, -1, -0.797, -0.797, 0.123, -1, -1, -0.04, 1, 1, -0.75, 0, 0.464, -0.516, 0.48, 0, 0.008, 0, 0, 0, -0.96, -1, 0, 1, -0.47, 0.49, 0, 0.53, 0, 0.6, -0.227, -1, 0.895, 0, -0.181, 0, 0, 0, -0.908, -1, 0, 1.289, -1.289, 0.204, 1, 0.213, 0, 1, 0.84, -1, -0.2, 0, 1, 0, 0, 0, -0.861, -0.861, 0, 1.136, -0.193, 0.047, 1, -1, 0, 1, 0.817, 0.14, 1, 0, -0.127, 0, 0, 0, -0.797, -0.797, 0, -1, 1, -0.04, -1, 1, 0, 0, -0.105, 0.516, 0.429, 1.289, -0.001, 0.099, 0, 0.104, 0.872, 1, 0.526, 1.289, 0.606, 0.1, 0, 0.113, 0.799, 0.6, 0.39, 0.516, -0.096, -1, 0.008,
0.025, 0, 0.596, 0.827, 0.861, 0.526, 1.136, 0.091, 0.023, 0, -0.53, 0.635, 0.6, 0.379, -0.072, 0.48, 1, -0.001, -0.024, 0, -0.524, 0.765, 0.797, -0.065, -1, -0.47, -0.02, 0, 0.53, 0.65, 0, -0.19, 1, -0.179, -1.289, -0.181, 0.01, 1, 0.226, 0.782, 0.861, 1, 1.464, 0.249, 0.01, 1, -0.213, 0.676, 1, -0.185, -0.14, 0.895, 1.289, 0.023, -0.01, -1, -0.199, 0.724, 0.797, -0.123, -1.289, -1.289, -0.008, -1, 0.213, 0.692, 0, 0.686, -0.14, -0.2, -1, -0.127, -0.003, -1, -1.136, 0.686, 0.686, -0.123, -1.136, -0.193, -0.002, -1, -1, 0.55, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0.215, 0.266, 0.23, 1, 0, 0.236, 0, 0.275, 0.922, 1, 0.277, 1, 0.221, 0.24, 0, 0.281, 0.75, 0.36, 0.051, 1, 0.801, 1.66, 0.033, 0.042, 1, 0.04, 0.825, 1, 1, 1.66, 1.66, 0.042, 1, 0.045, 0.85, 1, 0.705, 1, 0.04, 1, 1, 0.003, 1, 1.292, 0.742, 0.742, 1, 1.292, 0.037, 0.002, 1, 1, 0.537, 1, 0.667, 0.02, 1, 1, 0.016, 0.003, 1, 1, 0.635, 0.635, 0.015, 1, 1, 0.002, 1, 1, 0.563, 0, 0.105, 0.516, -0.429, 0, 0.001, 0, 0, 0, 0.872, 1, 0, 1.289, -0.606, 0.1, 0, 0.113, 0, -0.6, -0.39, 0.516, 0.096, 0, -0.008, 0, 0, 0, 0.827, 0.861, 0, 1.136, -0.091, 0.023, 0, -0.53, 0, -0.6, -0.379, -0.072, -0.48, 0, 0.001, 0, 0, 0, 0.765, 0.797, 0, -1, 0.47, -0.02, 0, 0.53, 0, 0, 0.19, 1, 0.179, 0, 0.181, 0, 0, 0, 0.782, 0.861, 0, 1.464, -0.249, 0.01, -1, -0.213, 0, -1, 0.185, -0.14, -0.895, 0, -0.023, 0, 0, 0, 0.724, 0.797, 0, -1.289, 1.289, -0.008, 1, 0.213, 0, 0, -0.686, -0.14, 0.2, 0, 0.127, 0, 0, 0, 0.686, 0.686, 0, -1.136, 0.193, -0.002, 1, -1, 0, 0, 0.088, -0.516, 0.086, -1.289, 0.001, 0.005, 0, 0.119, -0.751, -0.861, -0.526, 1.464, 0.117, 0.005, 0, -0.113, -0.585, -0.6, 0.086, 0.072, -0.429, 1.289, 0, -0.005, 0, -0.104, -0.695, -0.797, 0.065, -1.289, -0.606, -0.004, 0, 0.113, -0.599, 0, -0.318, 0.072, 0.096, -1, 0.001, -0.001, 0, -0.596, -0.659, -0.686, 0.065, -1.136, -0.091, -0.001, 0, -0.53, -0.476, 0, 0.155, 0.14, 0.179, -1.289, -0.023, -0.001, 1, -0.226, -0.623, -0.686, 0.123, -1.464, -0.249, 0, 1, 
-0.213, -0.507, 0, -0.464, -0.516, -0.48, 0, -0.008, 0, 0, 0, -0.96, -1, 0, 1, 0.47, 0.49, 0, 0.53, 0, -0.6, 0.227, -1, -0.895, 0, 0.181, 0, 0, 0, -0.908, -1, 0, 1.289, 1.289, 0.204, -1, 0.213, 0, -1, -0.84, -1, 0.2, 0, -1, 0, 0, 0, -0.861, -0.861, 0, 1.136, 0.193, 0.047, -1, -1, 0, -1, -0.817, 0.14, -1, 0, 0.127, 0, 0, 0, -0.797, -0.797, 0, -1, -1, -0.04, 1, 1, 0, 0, -0.215, 0.266, -0.23, 0, 0, 0, 0, 0, 0.922, 1, 0, 1, -0.221, 0.24, 0, 0.281, 0, -0.36, -0.051, 1, -0.801, 0, -0.033, 0, 0, 0, 0.825, 1, 0, 1.66, -1.66, 0.042, -1, 0.045, 0, -1, -0.705, 1, -0.04, 0, -1, 0, 0, 0, 0.742, 0.742, 0, 1.292, -0.037, 0.002, -1, 1, 0, -1, -0.667, 0.02, -1, 0, -0.016, 0,
Re: [R] assign index to colnames(matrix)
Jim, thank you, that worked great. Paul

Paul Prew | Statistician
651-795-5942 | fax 651-204-7504
Ecolab Research Center | Mail Stop ESC-F4412-A
655 Lone Oak Drive | Eagan, MN 55121-1560

-----Original Message-----
From: Jim Lemon [mailto:j...@bitwrit.com.au]
Sent: Friday, February 22, 2013 7:22 PM
To: Prew, Paul
Cc: r-help@r-project.org
Subject: Re: [R] assign index to colnames(matrix)

On 02/23/2013 11:34 AM, Prew, Paul wrote:
Hello, I'm trying to follow the syntax of a script from a journal website. In order to create a regression formula used later in the script, the regression matrix must have column names "X1", "X2", etc. I have tried to assign these column names to my matrix ScoutRSM.mat using a for loop, but I don't know how to interpret the error message. Suggestions? Thanks, Paul

====== instructions about dimnames(X)[[2]] = X1,X2,... ======
function(binstr) {
  # makes formula for regression
  # must make sure dimnames(X)[[2]] = X1,X2,...
  ind <- which(binstr == 1)
  rht <- paste("X", ind, sep = "", collapse = "+")
  ans <- paste("y ~ ", rht)
  return(as.formula(ans))
}

###################
## my for loop   ##
###################
for (i in 1:dim(ScoutRSM.mat)[2] {
  colnames(ScoutRSM.mat)[i] <- paste("X", i, sep = ""))
}

for(i in 1:dim(ScoutRSM.mat)[2]) {
+   colnames(ScoutRSM.mat)[i] <- paste("X", i, sep = "")
+ }
Error in `colnames<-`(`*tmp*`, value = "X1") :
  length of 'dimnames' [2] not equal to array extent

Hi Paul,
You don't really need the loop, try:

colnames(ScoutRSM.mat) <- paste("X", 1:dim(ScoutRSM.mat)[2], sep = "")

Jim
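[Editor's note, appended for readers of the archive.] For anyone hitting the same error: a matrix created without dimnames has dimnames equal to NULL, so the element-wise assignment colnames(m)[i] <- ... tries to install a names vector shorter than the number of columns, which is exactly the "length of 'dimnames' [2] not equal to array extent" complaint. A small illustration on a toy matrix (not Paul's data):

```r
m <- matrix(0, nrow = 2, ncol = 3)   # dimnames(m) is NULL at this point
## colnames(m)[1] <- "X1"            # reproduces the "array extent" error

# Assign the whole name vector in one step instead:
colnames(m) <- paste("X", 1:ncol(m), sep = "")
colnames(m)                          # "X1" "X2" "X3"
```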
Re: [R] Windows 7 installation of .qz package from SourceForge
Thank you Duncan, the second set of instructions worked. Both probably would have worked, but I had some code (below) that threw an error. It's designed to automatically set the internet connection to my Windows setting, so R would know the proxy server used at my worksite. I had seen this suggestion in the archives of R-help. However, this code in the R-2.14.0/etc/Rprofile.site file wasn't being recognized as a valid command, so I commented it out (the code works fine when I run it in the RGUI). Thanks again, Paul

# use the proxy settings for Windows/IE:
setInternet2(TRUE)

Paul Prew | Statistician
651-795-5942 | fax 651-204-7504
Ecolab Research Center | Mail Stop ESC-F4412-A
655 Lone Oak Drive | Eagan, MN 55121-1560

-----Original Message-----
From: Duncan Murdoch [mailto:murdoch.dun...@gmail.com]
Sent: Wednesday, February 01, 2012 11:48 AM
To: Prew, Paul
Cc: r-help@r-project.org
Subject: Re: [R] Windows 7 installation of .qz package from SourceForge

On 12-02-01 12:07 PM, Prew, Paul wrote:
Hello, I'm trying to install the package metRology from SourceForge. I save the zip file metRology_0.9-06.tar.gz to my Windows 7 machine, and try to install the package using the RGUI: Packages > Install package(s) from local zip files > metRology_0.9-06.tar.gz. There's no .zip extension, but R seems to go to work on the installation with a couple warning messages and one error.

tar.gz files are source packages. .zip is a binary image of an installed package. To install a tar.gz, you can't use the menu items. If you have the necessary tools set up properly, it should work to do

install.packages(".../path/to/metRology_0.9-06.tar.gz", type = "source", repos = NULL)

from within R. If that fails (as it likely will if the package has C or Fortran code and you haven't installed the right compilers), you need to get the R tools from <CRAN mirror>/bin/windows/Rtools. You can also install from outside R using

R CMD INSTALL .../path/to/metRology_0.9-06.tar.gz

Duncan Murdoch

The menu Packages > Load Package ...
doesn't provide metRology as one of the choices.

====== R session window ======
utils:::menuInstallLocal()
Warning in unzip(zipname, exdir = dest) : error 1 in extracting from zip file
Warning in read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :
  cannot open compressed file 'metRology_0.9-06.tar.gz/DESCRIPTION', probable reason 'No such file or directory'
Error in read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :
  cannot open the connection

Do I use a utility such as 7-zip to decompress the underlying files, then re-zip into a file with the .zip extension? Thank you, Paul

sessionInfo()
R version 2.14.0 (2011-10-31)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252

attached base packages:
[1] tcltk stats graphics grDevices utils datasets methods base

other attached packages:
[1] Rcmdr_1.8-1 car_2.0-12 nnet_7.3-1 MASS_7.3-16

loaded via a namespace (and not attached):
[1] tools_2.14.0

Paul Prew | Statistician
651-795-5942 | fax 651-204-7504
Ecolab Research Center | Mail Stop ESC-F4412-A
655 Lone Oak Drive | Eagan, MN 55121-1560
[R] Windows 7 installation of .qz package from SourceForge
Hello, I'm trying to install the package metRology from SourceForge. I save the zip file metRology_0.9-06.tar.gz to my Windows 7 machine, and try to install the package using the RGUI: Packages > Install package(s) from local zip files > metRology_0.9-06.tar.gz. There's no .zip extension, but R seems to go to work on the installation with a couple warning messages and one error. The menu Packages > Load Package ... doesn't provide metRology as one of the choices.

====== R session window ======
utils:::menuInstallLocal()
Warning in unzip(zipname, exdir = dest) : error 1 in extracting from zip file
Warning in read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :
  cannot open compressed file 'metRology_0.9-06.tar.gz/DESCRIPTION', probable reason 'No such file or directory'
Error in read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :
  cannot open the connection

Do I use a utility such as 7-zip to decompress the underlying files, then re-zip into a file with the .zip extension? Thank you, Paul

sessionInfo()
R version 2.14.0 (2011-10-31)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252

attached base packages:
[1] tcltk stats graphics grDevices utils datasets methods base

other attached packages:
[1] Rcmdr_1.8-1 car_2.0-12 nnet_7.3-1 MASS_7.3-16

loaded via a namespace (and not attached):
[1] tools_2.14.0

Paul Prew | Statistician
651-795-5942 | fax 651-204-7504
Ecolab Research Center | Mail Stop ESC-F4412-A
655 Lone Oak Drive | Eagan, MN 55121-1560
[R] quantile regression: out of memory error
Hello, I'm wondering if anyone can offer advice on the out-of-memory error I'm getting. I'm using R 2.12.2 on Windows XP, Platform: i386-pc-mingw32/i386 (32-bit). I am using the quantreg package, trying to perform a quantile regression on a dataframe that has 11,254 rows and 5 columns.

object.size(subsetAudit.dat)
450832 bytes

str(subsetAudit.dat)
'data.frame': 11253 obs. of 5 variables:
$ Satisfaction     : num 0.64 0.87 0.78 0.75 0.83 0.75 0.74 0.8 0.89 0.78 ...
$ Return           : num 0.84 0.92 0.91 0.89 0.95 0.81 0.9 0.87 0.95 0.88 ...
$ Recommend        : num 0.53 0.64 0.58 0.58 0.62 0.6 0.56 0.7 0.64 0.65 ...
$ Cust.Clean       : num 0.75 0.85 0.72 0.72 0.81 0.79 0.79 0.8 0.78 0.75 ...
$ CleanScore.Cycle1: num 96.7 83.3 93.3 86.7 96.7 96.7 90 80 81.7 86.7 ...

rq(subsetAudit.dat$Satisfaction ~ subsetAudit.dat$CleanScore.Cycle1, tau = -1)
ERROR: cannot allocate vector of size 2.8 Gb

I don't know much about computers (software, hardware, algorithms), but does this mean that the quantreg package is creating some massive intermediate vector as it performs the rq function? Quantile regression is something I'm just starting to explore, but I believe it involves ordering data prior to the regression, which could be a huge job when using 11,000 records. Does bigmemory have functionality to help me with this? Thank you, Paul

Paul Prew | Statistician
651-795-5942 | fax 651-204-7504
Ecolab Research Center | Mail Stop ESC-F4412-A
655 Lone Oak Drive | Eagan, MN 55121-1560
Re: [R] quantile regression: out of memory error
Thank you, Roger, that was my problem. Specifying tau = 1:19/20 worked fine. Regards, Paul

Paul Prew | Statistician
651-795-5942 | fax 651-204-7504
Ecolab Research Center | Mail Stop ESC-F4412-A
655 Lone Oak Drive | Eagan, MN 55121-1560

-----Original Message-----
From: Roger Koenker [mailto:rkoen...@uiuc.edu]
Sent: Monday, July 11, 2011 12:48 PM
To: Prew, Paul
Cc: r-help@r-project.org help
Subject: Re: [R] quantile regression: out of memory error

Paul,

Yours is NOT a large problem, but it becomes a large problem when you ask for ALL the distinct QR solutions by specifying tau = -1. You probably don't want to see all these solutions; I suspect that only tau = 1:19/20 or so would suffice. Try this, and see how it goes.

Roger

url: www.econ.uiuc.edu/~roger          Roger Koenker
email: rkoen...@uiuc.edu               Department of Economics
vox: 217-333-4558                      University of Illinois
fax: 217-244-6678                      Urbana, IL 61801

On Jul 11, 2011, at 12:39 PM, Prew, Paul wrote:

Hello, I'm wondering if anyone can offer advice on the out-of-memory error I'm getting. I'm using R 2.12.2 on Windows XP, Platform: i386-pc-mingw32/i386 (32-bit). I am using the quantreg package, trying to perform a quantile regression on a dataframe that has 11,254 rows and 5 columns.

object.size(subsetAudit.dat)
450832 bytes

str(subsetAudit.dat)
'data.frame': 11253 obs. of 5 variables:
$ Satisfaction     : num 0.64 0.87 0.78 0.75 0.83 0.75 0.74 0.8 0.89 0.78 ...
$ Return           : num 0.84 0.92 0.91 0.89 0.95 0.81 0.9 0.87 0.95 0.88 ...
$ Recommend        : num 0.53 0.64 0.58 0.58 0.62 0.6 0.56 0.7 0.64 0.65 ...
$ Cust.Clean       : num 0.75 0.85 0.72 0.72 0.81 0.79 0.79 0.8 0.78 0.75 ...
$ CleanScore.Cycle1: num 96.7 83.3 93.3 86.7 96.7 96.7 90 80 81.7 86.7 ...

rq(subsetAudit.dat$Satisfaction ~ subsetAudit.dat$CleanScore.Cycle1, tau = -1)
ERROR: cannot allocate vector of size 2.8 Gb

I don't know much about computers (software, hardware, algorithms), but does this mean that the quantreg package is creating some massive intermediate vector as it performs the rq function?
Quantile regression is something I'm just starting to explore, but I believe it involves ordering data prior to the regression, which could be a huge job when using 11,000 records. Does bigmemory have functionality to help me with this? Thank you, Paul

Paul Prew | Statistician
651-795-5942 | fax 651-204-7504
Ecolab Research Center | Mail Stop ESC-F4412-A
655 Lone Oak Drive | Eagan, MN 55121-1560
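[Editor's note, appended for readers of the archive.] To make Roger's suggestion concrete, here is a sketch of the fixed call, using the data= argument rather than subsetAudit.dat$ prefixes, and a grid of 19 quantiles instead of the full tau = -1 solution path. With tau = -1, rq() computes every distinct quantile-regression solution, and on 11,000+ rows that enumeration is what tries to allocate the 2.8 Gb vector.

```r
library(quantreg)

# Fit a grid of quantiles (0.05, 0.10, ..., 0.95) in one call:
fit <- rq(Satisfaction ~ CleanScore.Cycle1,
          tau = 1:19/20,
          data = subsetAudit.dat)
summary(fit)   # one set of coefficients per quantile
```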
[R] Arguments in functions
Hello, I'm not much of a programmer, and am trying to understand the workings of the function below called RStatFctn within this bootstrap procedure. RStatFctn is defined to have two arguments: x, intended to be a data vector, and d, intended to be an index (or so it looks to me). Later, rnormdat is created to be the data vector. However, when RStatFctn is called within the boot function, I don't see where rnormdat is explicitly passed to RStatFctn as the data vector x. And I don't see where any values are ever passed to the index variable d, or where any index is ever referenced or used in conjunction with RStatFctn. Any help you can offer is appreciated. Thanks, Paul

library(boot)
RStatFctn <- function(x, d) { return(mean(x[d])) }

b.basic   = matrix(data = NA, nrow = 1000, ncol = 2)
b.normal  = matrix(data = NA, nrow = 1000, ncol = 2)
b.percent = matrix(data = NA, nrow = 1000, ncol = 2)
b.bca     = matrix(data = NA, nrow = 1000, ncol = 2)

for (i in 1:1000) {
  rnormdat = rnorm(30, 0, 1)
  b <- boot(rnormdat, RStatFctn, R = 1000)
  b.ci = boot.ci(b, conf = 0.95, type = c("basic", "norm", "perc", "bca"))
  b.basic[i, ]   = b.ci$basic[, 4:5]
  b.normal[i, ]  = b.ci$normal[, 2:3]
  b.percent[i, ] = b.ci$percent[, 4:5]
  b.bca[i, ]     = b.ci$bca[, 4:5]
}

Paul Prew | Statistician
651-795-5942 | fax 651-204-7504
Ecolab Research Center | Mail Stop ESC-F4412-A
655 Lone Oak Drive | Eagan, MN 55121-1560
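[Editor's note, appended for readers of the archive.] To answer the question in the post: boot() matches rnormdat to x by position, and for each of the R replicates it generates a resampled index vector internally and passes it as the second argument d, so x[d] is the bootstrap sample. A sketch of what happens under the hood (the explicit `d` below is only for illustration; in real use boot() supplies it):

```r
library(boot)

RStatFctn <- function(x, d) mean(x[d])

set.seed(1)
rnormdat <- rnorm(30, 0, 1)

# What boot() does once per replicate: draw n indices with replacement ...
d <- sample(seq_along(rnormdat), replace = TRUE)
# ... then call the statistic with the original data and those indices:
RStatFctn(rnormdat, d)   # mean of one bootstrap resample

# In the real call, x = rnormdat (by position) and d comes from boot():
b <- boot(rnormdat, RStatFctn, R = 1000)
```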
[R] Facets in ggplot2
Hello, I'm trying to introduce myself to ggplot2. I'm using syntax from the help file, but pasting in my own data. I don't understand the error message I'm getting. Can anyone clue me in? A Google search of this error statement didn't return anything I could recognize as useful. Thanks, Paul

str(d.AD)
'data.frame': 9 obs. of 3 variables:
 $ treatment: Factor w/ 3 levels "1","2","3": 1 1 1 2 2 2 3 3 3
 $ outcome  : Factor w/ 3 levels "1","2","3": 1 2 3 1 2 3 1 2 3
 $ counts   : num 18 17 15 20 10 20 25 13 12

qplot(d.AD$outcome, d.AD$counts, data = d.AD, facets = . ~ d.AD$treatment)
Error in `[.data.frame`(plot$data, , setdiff(cond, names(df)), drop = FALSE) :
  undefined columns selected
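[Editorial note: the error comes from writing d.AD$treatment inside the facet formula. Facetting formulas expect bare column names, which are looked up in the data argument, so the usual fix is facets = . ~ treatment. The same bare-name convention holds in lattice, which ships with R; a self-contained sketch of the equivalent conditioned plot:

```r
library(lattice)

# Rebuild the d.AD data frame shown in str() above
d.AD <- data.frame(
  treatment = gl(3, 3),      # 1 1 1 2 2 2 3 3 3
  outcome   = gl(3, 1, 9),   # 1 2 3 1 2 3 1 2 3
  counts    = c(18, 17, 15, 20, 10, 20, 25, 13, 12)
)

# Bare column names in the formula; 'treatment' defines the panels,
# just as facets = . ~ treatment would in qplot()
p <- xyplot(counts ~ outcome | treatment, data = d.AD)
```

The object p is a "trellis" object; printing it draws one panel per treatment level.]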
Re: [R] Sample size for proportion, not binomial
Dear Marc, Thank you very much for the advice and the papers, it helps. Regards, Paul

From: Marc Schwartz marc_schwa...@me.com To: Prew, Paul paul.p...@ecolab.com Cc: r-help@r-project.org Subject: Re: [R] Sample size for proportion, not binomial Message-ID: c56551a0-fce8-4ffe-8857-d16dd3827...@me.com Content-Type: text/plain; charset=windows-1252

On Mar 23, 2010, at 11:05 AM, Prew, Paul wrote: Hello, I am looking for a sample size function to test proportions that are not binomial proportions. The proportions represent a ratio of (final measure) / (baseline measure) on the same experimental unit. Searches using RSeek and such bring multiple hits for binomial proportions, but that doesn't seem to fit my situation. Perhaps there's some standard terminology from a different field that would provide better hits than deeming this a 'rate' or a 'proportion'. Of course, most sample size functions assume a normal distribution, while this data will be bounded between 0 and 1. The scientist I'm working with feels it's important to make fair comparisons: any weight loss must account for the baseline weight. A logistic transformation seems appropriate, but that term also didn't yield hits I recognized as useful. Loss of weight --- compare treatments: Treatment A: 1 - Final weight / Initial weight; Treatment B: 1 - Final weight / Initial weight. This appears to be a situation that would be common, but I'm not framing it in a way that matches an R package. Any guidance is appreciated. Regards, Paul

If you and the scientist are in a position of being open to better options for analyzing change-from-baseline data, I would recommend that you both read the following two papers: Statistics notes: analysing controlled trials with baseline and follow up measurements. Vickers AJ, Altman DG. BMJ 2001;323:1123-4.
http://www.bmj.com/cgi/content/full/323/7321/1123 http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1121605/pdf/1123.pdf The use of percentage change from baseline as an outcome in a controlled trial is statistically inefficient: a simulation study. Vickers AJ. BMC Med Res Methodol 2001;1:6. http://www.biomedcentral.com/1471-2288/1/6 http://www.biomedcentral.com/content/pdf/1471-2288-1-6.pdf and review an additional web site: http://biostat.mc.vanderbilt.edu/wiki/Main/MeasureChange Once you are hopefully in a position of adopting a regression-based approach (e.g. FinalWeight ~ BaseWeight + Treatment), there are various options for calculating sample sizes. The key advantage of this approach is that you get the baseline-adjusted between-group comparison (the regression beta coefficient and confidence intervals for Treatment), which is the key outcome of interest in comparing treatments in a parallel design. The easiest, albeit conservative, approach for sample size is to use power.t.test() on your assumptions of the inter-group delta for actual weight change (not percent change), the std dev for actual change, desired power, and target alpha. I am not aware off-hand of any power/sample size functions in R for regular linear regression, though they may exist. There are third-party programs that do provide that functionality. If you are willing to code and experiment a bit, you could construct a Monte Carlo simulation with a linear model, using data generated with rnorm() based upon reasonable assumptions about the distribution of your data in each group for the baseline and final values. Once you get your actual data collected and ready for analysis, you will also need to test for a baseline*treatment interaction (FinalWeight ~ BaseWeight * Treatment), which can make the interpretation of treatment effects more complicated, since the treatment effect will be conditional upon the baseline weight, rather than being able to report a mean treatment effect.
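[Editorial note: a minimal sketch of the conservative power.t.test() route described above. The delta and sd values are placeholder assumptions, not figures from the thread:

```r
# Per-group sample size for a two-sample t test on actual weight change.
# delta = assumed between-group difference in mean weight change;
# sd = assumed standard deviation of weight change (both illustrative).
pw <- power.t.test(delta = 2,
                   sd = 5,
                   power = 0.80,
                   sig.level = 0.05)
ceiling(pw$n)   # subjects needed per group, rounded up
```

power.t.test() leaves exactly one of n, delta, sd, power, and sig.level as NULL and solves for it; here n is solved for.]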
HTH, Marc Schwartz
[R] Sample size for proportion, not binomial
Hello, I am looking for a sample size function to test proportions that are not binomial proportions. The proportions represent a ratio of (final measure) / (baseline measure) on the same experimental unit. Searches using RSeek and such bring multiple hits for binomial proportions, but that doesn't seem to fit my situation. Perhaps there's some standard terminology from a different field that would provide better hits than deeming this a 'rate' or a 'proportion'. Of course, most sample size functions assume a normal distribution, while this data will be bounded between 0 and 1. The scientist I'm working with feels it's important to make fair comparisons: any weight loss must account for the baseline weight. A logistic transformation seems appropriate, but that term also didn't yield hits I recognized as useful. Loss of weight --- compare treatments: Treatment A: 1 - Final weight / Initial weight; Treatment B: 1 - Final weight / Initial weight. This appears to be a situation that would be common, but I'm not framing it in a way that matches an R package. Any guidance is appreciated. Regards, Paul
[R] Aov: SE's for split plot
Hello, can anyone explain why the following message appears for the function model.tables, with se=TRUE? In the VR MASS text, p. 285, the se=TRUE option works for a split-plot example that seems similar to my situation. But the model.tables documentation, in the Arguments section for se, states "should standard errors be computed?".

Warning in model.tables.aovlist(Magic.aov, type = "means", se = TRUE):
  SE's for type 'means' are not implemented yet

I am trying to get the standard errors (SE's) for a split plot using the aov function. I'm not sure; it may be a split-split-plot design. There are two crossed factors: (1) Formula and (2) SwatchType.
* Formula is applied to the whole plots. WP = Load.
* Each whole plot is further divided into 3 split plots. SP = BackrNbr, 3 BackrNbrs within a single Load.
* Each split plot is the block for a completely randomized design using factor = SwatchType. All 6 SwatchTypes are applied within each BackrNbr.

Magic.aov <- aov(SoilRemoval ~ Formula*SwatchType + Error(Load/BackrNbr), data = Magic.dat)
model.tables(Magic.aov, type = "means", se = TRUE)  ### the tables of means are printed, no SE's

str(Magic.dat)
'data.frame': 162 obs. of 7 variables:
 $ BackrNbr   : Factor w/ 27 levels "1","2","3","4",..: 1 1 1 1 1 1 2 2 2 2 ...
 $ Load       : Factor w/ 9 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ Formula    : Factor w/ 3 levels "CONTROL","EXP1",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ SwatchType : Factor w/ 6 levels "DMO","Dust.Seb",..: 2 6 1 5 4 3 2 6 1 5 ...
 $ L.Initial  : num 71.2 73.9 69.3 58.6 40.6 ...
 $ L.Final    : num 89.8 75.6 70.5 69.6 58.9 ...
 $ SoilRemoval: num 75.08 7.88 4.71 29.44 33.15 ...
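[Editorial note: one common workaround, sketched here rather than taken from the thread, is to fit the split plot as a mixed model with nlme, which ships with R and reports a standard error for every fixed-effect estimate directly. Variable names follow the post, but the data below are simulated purely to make the example self-contained, and the random term is simplified to the whole-plot level only; the full design in the post would use random = ~ 1 | Load/BackrNbr.

```r
library(nlme)
set.seed(1)

# Toy stand-in for Magic.dat: 9 Loads x 6 SwatchTypes, with Formula
# assigned at the Load (whole-plot) level, 3 Loads per Formula
Magic.dat <- expand.grid(Load = factor(1:9),
                         SwatchType = factor(LETTERS[1:6]))
form <- rep(c("CONTROL", "EXP1", "EXP2"), each = 3)   # one per Load
Magic.dat$Formula <- factor(form[as.integer(Magic.dat$Load)])
Magic.dat$SoilRemoval <- rnorm(nrow(Magic.dat), mean = 30, sd = 5)

# Random intercept per Load models the whole-plot error stratum
fit <- lme(SoilRemoval ~ Formula * SwatchType,
           random = ~ 1 | Load,
           data = Magic.dat)
summary(fit)$tTable[, c("Value", "Std.Error")]
```

The tTable gives estimates and standard errors for all fixed effects; standard errors for the cell means themselves can then be built from the fitted model's variance components.]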
Thank you for your consideration, Paul
Re: [R] vcd package --- change layout of plot
Dear Achim, thank you very much for the suggestions, they work well. Agreed that an ordinal logistic regression seems a more powerful choice of analysis method, and I'm using that also. Thanks for providing the vcd package, it's proving quite helpful. Regards, Paul

-----Original Message----- From: Achim Zeileis [mailto:achim.zeil...@wu-wien.ac.at] Sent: Friday, May 22, 2009 3:06 PM To: Prew, Paul Cc: r-help@r-project.org Subject: Re: [R] vcd package --- change layout of plot

On Thu, 21 May 2009, Prew, Paul wrote: Hello, I'm trying to use the vcd package to analyze survey data. Expert judges ranked possible features for product packaging. Seven features were listed, and 19 judges split between 2 cities ranked them. The following code (1) works, but the side-by-side plots for Cities PX, SF are shrunk too much. Stacking PX on top of SF would make for a better plot. (I could switch the order of the Feature and Rank dimensions, and go with the default side-by-side, but would prefer not to).

Several comments:
- Adding keep_aspect_ratio = FALSE probably changes what you call shrunk too much. By default the aspect ratio is kept fixed for 2-way displays.
- I would switch the splitting order of Feature/Rank, because you probably want the distribution of ranks for a given feature and not the other way round. In that case, you might find setting split_vertical = TRUE aesthetically more pleasing. But it's probably a matter of taste.
- Setting gp = shading_max is not really the best choice: if you want a conditional independence test based on the maximum of residuals, you should use panel = cotab_coindep. See the JCGS paper in the references and the accompanying vignette("residual-shadings", package = "vcd") for more details.
But note that the double maximum test is not the most powerful test against ordinal alternatives (as one might expect due to the ordered Rank variable). hth, Z

(1) cotabplot(~ Rank + Feature | Cities, data = Pack.dat, gp = shading_max, rot_labels = c(90, 0, 0, 0), just_labels = c("left", "left", "left", "right"), set_varnames = c(Feature = ""))

Reading the vcd help, I got lost trying to understand the panel-generating parameters I should use. My best guess was below (2), but it gave an error message. Clearly, I don't know what the paneling is asking for. This is where I would like some advice, if anyone is familiar with vcd.

(2) Tried to change the layout of the trellis plot from horizontal to vertical
Pack.mos <- mosaic(~ Feature + Rank, data = Pack.tab, gp = shading_max, rot_labels = c(0, 0, 0, 0), just_labels = c("left", "left", "left", "right"), set_varnames = c(Feature = "")) ## attempt to create an object for the panel argument in the cotabplot function
pushViewport(viewport(layout = grid.layout(ncol = 1)))
pushViewport(viewport(layout.pos.row = 1)) ## tell vcd to change the default layout, and what to put in the top plot
cotabplot(~ Feature + Rank | Cities, data = Pack.dat, panel = Pack.mos, Pack.dat[["PX"]], gp = shading_max, rot_labels = c(0, 0, 0, 0)) ## create mosaic plot that's conditional on Cities; first plot Cities = PX ## panel argument is an attempt to modify an example in the vcd help file
popViewport() ## create the graphic
Error: Cannot pop the top-level viewport (grid and graphics output mixed?)
# no point in going on to code the plot for layout.pos.row = 2

str(Pack.tab)
Error in `[.structable`(x, i, args[[2]]) : subscript out of bounds
class(Pack.tab)
[1] "structable" "ftable"
dim(Pack.tab)
[1] 7 2 7

               Cities PX SF
Rank Feature
1    Flexible          2  0
     Integrate.Probes  1  2
     Large/heavy       1  0
     Lockout           0  1
     Recyclable        3  5
     Rigid             0  0
     Small/light       2  1
2    Flexible          1  6
     Integrate.Probes  2  0
     Large/heavy       1  1
     Lockout           1  0
     Recyclable        2  0
     Rigid             1  0
     Small/light       2  2
3    Flexible          1  1
     Integrate.Probes  3  0
     Large/heavy       1  1
     Lockout           2  1
     Recyclable        1  3
     Rigid             0  0
     Small/light       0  3
4    Flexible          3  0
     Integrate.Probes  0  2
     Large/heavy       0  0
     Lockout           2  2
     Recyclable        0  1
     Rigid             1  2
     Small/light       3  2
5    Flexible          1  1
     Integrate.Probes  1  1
     Large/heavy       0  3
     Lockout           0  2
     Recyclable        2  0
     Rigid             3  1
[R] vcd package --- change layout of plot
Hello, I'm trying to use the vcd package to analyze survey data. Expert judges ranked possible features for product packaging. Seven features were listed, and 19 judges split between 2 cities ranked them. The following code (1) works, but the side-by-side plots for Cities PX, SF are shrunk too much. Stacking PX on top of SF would make for a better plot. (I could switch the order of the Feature and Rank dimensions, and go with the default side-by-side, but would prefer not to).

(1) cotabplot(~ Rank + Feature | Cities, data = Pack.dat, gp = shading_max, rot_labels = c(90, 0, 0, 0), just_labels = c("left", "left", "left", "right"), set_varnames = c(Feature = ""))

Reading the vcd help, I got lost trying to understand the panel-generating parameters I should use. My best guess was below (2), but it gave an error message. Clearly, I don't know what the paneling is asking for. This is where I would like some advice, if anyone is familiar with vcd.

(2) Tried to change the layout of the trellis plot from horizontal to vertical
Pack.mos <- mosaic(~ Feature + Rank, data = Pack.tab, gp = shading_max, rot_labels = c(0, 0, 0, 0), just_labels = c("left", "left", "left", "right"), set_varnames = c(Feature = "")) ## attempt to create an object for the panel argument in the cotabplot function
pushViewport(viewport(layout = grid.layout(ncol = 1)))
pushViewport(viewport(layout.pos.row = 1)) ## tell vcd to change the default layout, and what to put in the top plot
cotabplot(~ Feature + Rank | Cities, data = Pack.dat, panel = Pack.mos, Pack.dat[["PX"]], gp = shading_max, rot_labels = c(0, 0, 0, 0)) ## create mosaic plot that's conditional on Cities; first plot Cities = PX ## panel argument is an attempt to modify an example in the vcd help file
popViewport() ## create the graphic
Error: Cannot pop the top-level viewport (grid and graphics output mixed?)
# no point in going on to code the plot for layout.pos.row = 2

str(Pack.tab)
Error in `[.structable`(x, i, args[[2]]) : subscript out of bounds
class(Pack.tab)
[1] "structable" "ftable"
dim(Pack.tab)
[1] 7 2 7

               Cities PX SF
Rank Feature
1    Flexible          2  0
     Integrate.Probes  1  2
     Large/heavy       1  0
     Lockout           0  1
     Recyclable        3  5
     Rigid             0  0
     Small/light       2  1
2    Flexible          1  6
     Integrate.Probes  2  0
     Large/heavy       1  1
     Lockout           1  0
     Recyclable        2  0
     Rigid             1  0
     Small/light       2  2
3    Flexible          1  1
     Integrate.Probes  3  0
     Large/heavy       1  1
     Lockout           2  1
     Recyclable        1  3
     Rigid             0  0
     Small/light       0  3
4    Flexible          3  0
     Integrate.Probes  0  2
     Large/heavy       0  0
     Lockout           2  2
     Recyclable        0  1
     Rigid             1  2
     Small/light       3  2
5    Flexible          1  1
     Integrate.Probes  1  1
     Large/heavy       0  3
     Lockout           0  2
     Recyclable        2  0
     Rigid             3  1
     Small/light       2  1
6    Flexible          0  1
     Integrate.Probes  1  3
     Large/heavy       3  2
     Lockout           3  1
     Recyclable        0  0
     Rigid             2  2
     Small/light       0  0
7    Flexible          1  0
     Integrate.Probes  1  1
     Large/heavy       3  2
     Lockout           1  2
     Recyclable        1  0
     Rigid             2  4
     Small/light       0  0

sessionInfo()
R version 2.9.0 RC (2009-04-10 r48318) i386-pc-mingw32
locale: LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
attached base packages: [1] grid tcltk stats graphics grDevices utils datasets [8] methods base
other attached packages: [1] relimp_1.0-1 vcd_1.2-4 colorspace_1.0-0 MASS_7.2-46 [5] RSiteSearch_0.1-5 brew_1.0-3 lme4_0.999375-30 Matrix_0.999375-26 [9] lattice_0.17-22 Rcmdr_1.4-9 car_1.2-13
loaded via a namespace (and not attached): [1] tools_2.9.0

If someone can give advice on stacking the two cities' plots, I would be grateful. Thanks, Paul
Re: [R] effects package --- add abline to plot
Dear John and David, thank you for your help. I apologize for not defining the analysis as an ordinal regression, or including a structure --- that could have taken some of the guesswork out for you. John --- for the ticks, I would still like to make this work for future analyses, but I'm still not sure what specifically needs changing. Before initially posting, I did read ?effect, and ran several other searches around "tick" and "at", but couldn't find a workable description or example for how to use at. The one example I did find I thought I had copied pretty closely with my command below.

plot(..., ticks=c(0.1,0.2,0.3,0.4,0.5,0.6))

I tried other combinations that probably look pretty silly...

plot(..., ticks(at=c(0.1,0.2,0.3,0.4,0.5,0.6)))
plot(..., ticks=c(0.1:0.6/0.1))

I just don't know how to properly populate this list --- ticks: a two-item list controlling the placement of tick marks on the vertical axis, with elements at and n. If at=NULL (the default), the program attempts to find 'nice' locations for the ticks, and the value of n (default, 5) gives the approximate number of tick marks desired; if at is non-NULL, then the value of n is ignored. Thanks again. Effects is a terrific package. Paul

** Model Specification
Clean.label <- polr(Clean.lbl ~ City.Abbr, method = "logistic", data = Safeway, Hess = TRUE)
str(Clean.label)
List of 18
 $ coefficients : Named num [1:8] -1.0887 -1.1449 0.6923 0.0894 -0.8229 ...
  ..- attr(*, "names")= chr [1:8] "City.Abbr[T.Dublin]" "City.Abbr[T.Englwd]" "City.Abbr[T.Fairfax]" "City.Abbr[T.Falls Ch]" ...
 $ zeta : Named num [1:4] -2.529 0.894 3.447 6.406
  ..- attr(*, "names")= chr [1:4] "Excellent|V.Good" "V.Good|Good" "Good|Fair" "Fair|Poor"
 $ deviance : num 2327
 $ fitted.values: num [1:1248, 1:5] 0.0568 0.0568 0.0568 0.0568 0.0568 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:1248] "1" "2" "3" "4" ...
  .. ..$ : chr [1:5] "Excellent" "V.Good" "Good" "Fair" ...
 $ lev : chr [1:5] "Excellent" "V.Good" "Good" "Fair" ...
 $ terms :Classes 'terms', 'formula' length 3 Clean.lbl ~ City.Abbr ..
..- attr(*, variables)= language list(Clean.lbl, City.Abbr) .. ..- attr(*, factors)= int [1:2, 1] 0 1 .. .. ..- attr(*, dimnames)=List of 2 .. .. .. ..$ : chr [1:2] Clean.lbl City.Abbr .. .. .. ..$ : chr City.Abbr .. ..- attr(*, term.labels)= chr City.Abbr .. ..- attr(*, order)= int 1 .. ..- attr(*, intercept)= int 1 .. ..- attr(*, response)= int 1 .. ..- attr(*, .Environment)=environment: R_GlobalEnv .. ..- attr(*, predvars)= language list(Clean.lbl, City.Abbr) .. ..- attr(*, dataClasses)= Named chr [1:2] ordered factor .. .. ..- attr(*, names)= chr [1:2] Clean.lbl City.Abbr $ df.residual : num 1236 $ edf : num 12 $ n: num 1248 $ nobs : num 1248 $ call : language polr(formula = Clean.lbl ~ City.Abbr, data = Safeway, Hess = TRUE, method = logistic) $ method : chr logistic $ convergence : int 0 $ niter: Named int [1:2] 62 22 ..- attr(*, names)= chr [1:2] f.evals.function g.evals.gradient $ Hessian : num [1:12, 1:12] 31.9 0 0 0 0 ... ..- attr(*, dimnames)=List of 2 .. ..$ : chr [1:12] City.Abbr[T.Dublin] City.Abbr[T.Englwd] City.Abbr[T.Fairfax] City.Abbr[T.Falls Ch] ... .. ..$ : chr [1:12] City.Abbr[T.Dublin] City.Abbr[T.Englwd] City.Abbr[T.Fairfax] City.Abbr[T.Falls Ch] ... $ model:'data.frame': 1248 obs. of 2 variables: ..$ Clean.lbl: Ord.factor w/ 5 levels ExcellentV.Good..: 1 2 3 3 3 3 3 3 2 2 ... ..$ City.Abbr: Factor w/ 9 levels DC,Dublin,..: 9 9 9 9 9 9 9 9 9 9 ... ..- attr(*, terms)=Classes 'terms', 'formula' length 3 Clean.lbl ~ City.Abbr .. .. ..- attr(*, variables)= language list(Clean.lbl, City.Abbr) .. .. ..- attr(*, factors)= int [1:2, 1] 0 1 .. .. .. ..- attr(*, dimnames)=List of 2 .. .. .. .. ..$ : chr [1:2] Clean.lbl City.Abbr .. .. .. .. ..$ : chr City.Abbr .. .. ..- attr(*, term.labels)= chr City.Abbr .. .. ..- attr(*, order)= int 1 .. .. ..- attr(*, intercept)= int 1 .. .. ..- attr(*, response)= int 1 .. .. ..- attr(*, .Environment)=environment: R_GlobalEnv .. .. ..- attr(*, predvars)= language list(Clean.lbl, City.Abbr) .. .. 
..- attr(*, dataClasses)= Named chr [1:2] ordered factor .. .. .. ..- attr(*, names)= chr [1:2] Clean.lbl City.Abbr $ contrasts:List of 1 ..$ City.Abbr: chr contr.Treatment $ xlevels :List of 1 ..$ City.Abbr: chr [1:9] DC Dublin Englwd Fairfax ... - attr(*, class)= chr polr Paul Prew | Statistician 651-795-5942 | fax 651-204-7504 Ecolab Research Center | Mail Stop ESC-F4412-A 655 Lone Oak Drive | Eagan, MN 55121-1560 -Original Message- From: John Fox [mailto:j...@mcmaster.ca] Sent: Tuesday, April 28, 2009 10:34 AM To: 'David Winsemius' Cc: r-help@r-project.org; Prew, Paul Subject: RE: [R] effects package --- add abline to plot Dear David, -Original Message- From: r-help-boun...@r-project.org [mailto:r-help
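[Editorial note: for readers without the original Safeway data, the model form above can be reproduced with MASS's built-in housing data, used here as a hypothetical stand-in; its Sat variable is an ordered factor playing the role of Clean.lbl:

```r
library(MASS)

# Ordinal (proportional-odds) logistic regression, same form as the
# post's polr() call, fit on the built-in housing data
fit <- polr(Sat ~ Infl, weights = Freq, data = housing, Hess = TRUE)

# zeta holds the cutpoints between adjacent response categories:
# one fewer than the number of levels of the ordered response
length(fit$zeta)
```

With Hess = TRUE the Hessian is stored, so summary(fit) can report standard errors, just as in the post.]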
Re: [R] effects package --- add abline to plot
John, Thank you for the code, it looks like just what's needed. I'll give it a try tomorrow. The desire for reference lines is to aid comparisons of different factor levels, to look at whether confidence intervals overlap. I've attached a graphic file of the particular allEffects plot, if you're interested in the types of comparisons. A prediction: the client is going to lay down a ruler and examine which levels don't overlap. I'd like to save them some trouble. Thanks again, Paul From: John Fox [mailto:j...@mcmaster.ca] Sent: Tue 4/28/2009 2:48 PM To: Prew, Paul Cc: r-help@r-project.org; 'David Winsemius' Subject: RE: [R] effects package --- add abline to plot Dear Paul, -Original Message- From: Prew, Paul [mailto:paul.p...@ecolab.com] Sent: April-28-09 2:19 PM To: John Fox; David Winsemius Cc: r-help@r-project.org Subject: RE: [R] effects package --- add abline to plot Dear John and David, thank you for your help. I apologize for not defining the analysis as an ordinal regression, or including a structure --- could have taken some of the guesswork out for you. John --- for the ticks, I would still like to make this work for future analyses, but still not sure what specifically needs changing. Before initially posting, I did read ?effect, and several other searches around tick and at, but couldn't find a workable description or example for how to use at. The one example I did find I thought I had copied pretty closely with my command below. plot(..., ticks=c(0.1,0.2,0.3,0.4,0.5,0.6)) I tried other combinations that probably look pretty silly... plot(..., ticks(at=c(0.1,0.2,0.3,0.4,0.5,0.6))) plot(..., ticks=c(0.1:0.6/0.1)) Just don't know how to properly populate this list --- ticks: a two-item list controlling the placement of tick marks on the vertical axis, with elements at and n. 
If at=NULL (the default), the program attempts to find `nice' locations for the ticks, and the value of n (default, 5) gives the approximate number of tick marks desired; if at is non-NULL, then the value of n is ignored. You need to specify ticks as a *list*; in your case, you just need the first element, so ticks=list(at=c(0.1,0.2,0.3,0.4,0.5,0.6)) or more compactly ticks=list(at=0.1*1:6) should do the trick. Can you tell me a bit more about why you want to draw horizontal lines on the plot? If this seems like a generally useful idea, I can put it my to-do list. I hope this helps, John Thanks again. Effects is a terrific package. Paul ** Model Specification Clean.label - polr(Clean.lbl ~ City.Abbr, method=logistic, data=Safeway, Hess=TRUE) str(Clean.label) List of 18 $ coefficients : Named num [1:8] -1.0887 -1.1449 0.6923 0.0894 -0.8229 ... ..- attr(*, names)= chr [1:8] City.Abbr[T.Dublin] City.Abbr[T.Englwd] City.Abbr[T.Fairfax] City.Abbr[T.Falls Ch] ... $ zeta : Named num [1:4] -2.529 0.894 3.447 6.406 ..- attr(*, names)= chr [1:4] Excellent|V.Good V.Good|Good Good|Fair Fair|Poor $ deviance : num 2327 $ fitted.values: num [1:1248, 1:5] 0.0568 0.0568 0.0568 0.0568 0.0568 ... ..- attr(*, dimnames)=List of 2 .. ..$ : chr [1:1248] 1 2 3 4 ... .. ..$ : chr [1:5] Excellent V.Good Good Fair ... $ lev : chr [1:5] Excellent V.Good Good Fair ... $ terms:Classes 'terms', 'formula' length 3 Clean.lbl ~ City.Abbr .. ..- attr(*, variables)= language list(Clean.lbl, City.Abbr) .. ..- attr(*, factors)= int [1:2, 1] 0 1 .. .. ..- attr(*, dimnames)=List of 2 .. .. .. ..$ : chr [1:2] Clean.lbl City.Abbr .. .. .. ..$ : chr City.Abbr .. ..- attr(*, term.labels)= chr City.Abbr .. ..- attr(*, order)= int 1 .. ..- attr(*, intercept)= int 1 .. ..- attr(*, response)= int 1 .. ..- attr(*, .Environment)=environment: R_GlobalEnv .. ..- attr(*, predvars)= language list(Clean.lbl, City.Abbr) .. ..- attr(*, dataClasses)= Named chr [1:2] ordered factor .. .. 
..- attr(*, names)= chr [1:2] Clean.lbl City.Abbr $ df.residual : num 1236 $ edf : num 12 $ n: num 1248 $ nobs : num 1248 $ call : language polr(formula = Clean.lbl ~ City.Abbr, data = Safeway, Hess = TRUE, method = logistic) $ method : chr logistic $ convergence : int 0 $ niter: Named int [1:2] 62 22 ..- attr(*, names)= chr [1:2] f.evals.function g.evals.gradient $ Hessian : num [1:12, 1:12] 31.9 0 0 0 0 ... ..- attr(*, dimnames)=List of 2 .. ..$ : chr [1:12] City.Abbr[T.Dublin] City.Abbr[T.Englwd] City.Abbr[T.Fairfax] City.Abbr[T.Falls Ch] ... .. ..$ : chr [1:12] City.Abbr[T.Dublin] City.Abbr[T.Englwd] City.Abbr[T.Fairfax] City.Abbr[T.Falls Ch] ... $ model:'data.frame':1248 obs. of 2 variables: ..$ Clean.lbl: Ord.factor w/ 5 levels ExcellentV.Good..: 1 2 3 3 3 3 3 3 2 2 ... ..$ City.Abbr: Factor w/ 9 levels DC,Dublin,..: 9 9 9 9 9 9
[R] effects package --- add abline to plot
Hello, I am not having success with a simple task. Using the effects package, I would like to add reference lines at probability values of 0.1 to 0.6 on a plot of the effects. The plot command works, but following up with an abline command produces the message "plot.new has not been called yet", and of course the reference lines are not added. Looking through past R help lists, there was a similar request for help --- trying to add an abline but "got the error plot.new has not been called yet". The help list reply was: "?abline: This function adds one or more straight lines through the current plot, i.e. the already existing *current plot*. So plot your data (e.g. with plot(x, y)) before adding a regression line." I interpreted the above to suggest the following ---

plot(allEffects(Clean.label), ask=FALSE, alternating = TRUE, ylab="Probability of Rating", xlab="City", main="Cleanliness Ratings by City", factor.names=FALSE, ticks=c(0.1,0.2,0.3,0.4,0.5,0.6))
abline(h=c(0.1,0.2,0.3,0.4,0.5,0.6))
Error in int_abline(a = a, b = b, h = h, v = v, untf = untf, ...) :
  plot.new has not been called yet

Less bothersome is the fact that the tick marks weren't modified to 0.1, 0.2, 0.3, etc. Further searching brought the panel.abline command to light, but that didn't produce any results, not even an error message.
plot(allEffects(Clean.label), ask=FALSE, alternating = TRUE,
+ ylab="Probability of Rating", xlab="City", main="Cleanliness Ratings by City",
+ factor.names=FALSE, ticks=c(0.1,0.2,0.3,0.4,0.5,0.6))
panel.abline(h=c(0.1,0.2,0.3,0.4,0.5,0.6))

sessionInfo()
R version 2.9.0 RC (2009-04-10 r48318) i386-pc-mingw32
locale: LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
attached base packages: [1] tcltk grid stats graphics grDevices utils datasets methods [9] base
other attached packages: [1] relimp_1.0-1 Rcmdr_1.4-9 car_1.2-13 effects_2.0-4 [5] colorspace_1.0-0 nnet_7.2-46 MASS_7.2-46 lattice_0.17-22
loaded via a namespace (and not attached): [1] tools_2.9.0

Thank you for any advice. Paul
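[Editorial note: the abline() failure above happens because effects plots are drawn with lattice/grid graphics, so the base-graphics abline() finds no "current plot" to draw on, and a bare panel.abline() call outside a panel function is silently ignored. Reference lines belong inside a panel function. A minimal self-contained lattice illustration of the idea, using the built-in iris data rather than the effects package:

```r
library(lattice)

# Draw the points, then add horizontal reference lines from within
# the panel function, where panel.abline() has a panel to draw on
p <- xyplot(Sepal.Length ~ Sepal.Width, data = iris,
            panel = function(...) {
              panel.xyplot(...)
              panel.abline(h = seq(5, 7, by = 0.5), lty = 2)
            })
print(p)   # trellis objects draw when printed
```

Whether plot.eff itself accepts a user panel function depends on the effects package version, so this is offered as the general lattice mechanism rather than a drop-in fix.]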
[R] Fatal Error (was Error in saveLog(currentLogFileName)
Hello, I hope someone can advise me on how to get R running on my Windows PC again. I uninstalled R 2.8.1, and also the previous version 2.8 that was on my hard drive. I reinstalled 2.8.1, but when I started the program, the message below came up, and R shut down.

Fatal error: unable to restore saved data in .RData

I uninstalled and reinstalled 2.8.1, but same results. Then I uninstalled 2.8.1 and downloaded the 2.9 beta version. Still the same fatal error message. Any advice on how I can rectify whatever I did to wreck R? Thanks, Paul

From: David Winsemius [mailto:dwinsem...@comcast.net] Sent: Thu 4/9/2009 2:40 PM To: Prew, Paul Cc: r-help@r-project.org Subject: Re: [R] Error in saveLog(currentLogFileName

On Apr 9, 2009, at 2:17 PM, Prew, Paul wrote: David, thank you for your helpful reply. a) The sessionInfo goof is actually kind of enlightening. I'm assuming that the purpose of adding the () symbols is to tell R that sessionInfo is a function to be invoked, and leaving () empty says to use default arguments? Yes. Exactly. b) In the R GUI, I end my session by using the menu File > Exit. A dialog appears, asking "Save workspace image?", and I choose Yes. In fact, I just did it and got a message I've never seen before: "Console not found", followed by a tcltk error reporting [tcl] invalid command name ".6.1". I am not an expert in the Windows GUI, but as a former user of it, I would wonder if my installation of tcltk had gotten corrupted and would try to reinstall. Caveat: that advice could be worth what you paid for it. Closed that message; another similar message was behind it, but now the command name was ".17.1". c) Yes, agreed --- John Fox has been helpful to me a couple of times. I like the R Commander (it automatically adds those pesky details like () that elude me), but I can't go to CRAN to get packages from it, so I end up back at the GUI to install a package.
John has told me that moving back and forth between the GUI and Commander is not how Commander was intended to be used. Perhaps this practice has led to some of the present confusion. Let's say I save a script or history or similar file that I would like to recall at a later time. If I save the file using R Commander, should I only open the file using R Commander? Ditto for the R GUI? Regardless of which platform I'm saving the files with, they seem to have some of the same extensions: .R, .r, .RData, etc.

I doubt that would be necessary, assuming the scripts do not have function calls that depend on R Commander being around. Files with .r extensions are just text files by another name. I could not find a saveLog function with an R-search restricted to functions, but did find it as an argument to Rcmdr. IIRC, Rcmdr uses Tcl/Tk extensively. -- David

Thanks, Paul

Paul Prew | Statistician 651-795-5942 | fax 651-204-7504 Ecolab Research Center | Mail Stop ESC-F4412-A 655 Lone Oak Drive | Eagan, MN 55121-1560

-----Original Message----- From: David Winsemius [mailto:dwinsem...@comcast.net] Sent: Thursday, April 09, 2009 12:17 PM To: Prew, Paul Cc: r-help@r-project.org Subject: Re: [R] Error in saveLog(currentLogFileName

This is just two suggestions and a guess.

a) When you desire the information from sessionInfo, you need to type:

sessionInfo() # not sessionInfo

... results come from function calls, and all you got was the sessionInfo code, produced by the implicit print function to which you gave the argument sessionInfo.

b) Tell us exactly the methods you used to "save my script, my workspace, etc."

c) As a guess, you had the Zelig package loaded under R-Commander yesterday but not today. The right person to ask this question of might be one of those package maintainers. (John Fox is very good about supporting R-Commander.)
David Winsemius, MD Heritage Laboratories West Hartford, CT CONFIDENTIALITY NOTICE: \ This e-mail communication an...{{dropped:11}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fatal Error (was Error in saveLog(currentLogFileName)
Dear Uwe and David, Thank you, deleting the .RData file did the trick. It's fantastic that you are so readily willing to help. Paul

From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de] Sent: Sat 4/11/2009 3:27 PM To: Prew, Paul Cc: r-help@r-project.org Subject: Re: [R] Fatal Error (was Error in saveLog(currentLogFileName)

Prew, Paul wrote:

Hello, I hope someone can advise me on how to get R running on my Windows PC again. I uninstalled R 2.8.1, and also the previous version 2.8 that was on my hard drive. I reinstalled 2.8.1, but when I started the program, the message below came up and R shut down:

Fatal error: unable to restore saved data in .RData

I uninstalled and reinstalled 2.8.1, but got the same result. Then I uninstalled 2.8.1 and downloaded the 2.9 beta version. Still the same fatal error message. Any advice on how I can rectify whatever I did to wreck R?

Probably your .RData file contains some objects that cannot be restored in the newer version of R. Perhaps some S4 object that is no longer valid? You could try to load the workspace with your old version of R again. Or better, if you do not need the workspace, just delete the .RData file in your working directory. Best, Uwe Ligges

thanks, Paul

From: David Winsemius [mailto:dwinsem...@comcast.net] Sent: Thu 4/9/2009 2:40 PM To: Prew, Paul Cc: r-help@r-project.org Subject: Re: [R] Error in saveLog(currentLogFileName

On Apr 9, 2009, at 2:17 PM, Prew, Paul wrote:

David, thank you for your helpful reply.

a) The sessionInfo goof is actually kind of enlightening. I'm assuming that the purpose of adding the () symbols is to tell R that sessionInfo is a function to be invoked, and leaving () empty says to use the default arguments?

Yes. Exactly.

b) In the R GUI, I end my session by using the menu File > Exit. A dialog appears asking "Save workspace image?", and I choose Yes. In fact, I just did it and got a message I'd never seen before: "Console not found."
Error in structure(.External("dotTclObjv", objv, PACKAGE = "tcltk"), class = "tclObj") : [tcl] invalid command name ".6.1"

I am not an expert in the Windows GUI, but as a former user of it, I would wonder if my installation of tcltk had gotten corrupted, and I would try to reinstall. Caveat: that advice could be worth what you paid for it.

Closed that message; another similar message was behind it, but now the command name was ".17.1".

c) Yes, agreed --- John Fox has been helpful to me a couple of times. I like R Commander (it automatically adds those pesky details like () that elude me), but I can't go to CRAN to get packages from it, so I end up back at the GUI to install a package. John has told me that moving back and forth between the GUI and Commander is not how Commander was intended to be used. Perhaps this practice has led to some of the present confusion. Let's say I save a script or history or similar file that I would like to recall at a later time. If I save the file using R Commander, should I only open the file using R Commander? Ditto for the R GUI? Regardless of which platform I'm saving the files with, they seem to have some of the same extensions: .R, .r, .RData, etc.

I doubt that would be necessary, assuming the scripts do not have function calls that depend on R Commander being around. Files with .r extensions are just text files by another name. I could not find a saveLog function with an R-search restricted to functions, but did find it as an argument to Rcmdr. IIRC, Rcmdr uses Tcl/Tk extensively. -- David

Thanks, Paul

Paul Prew | Statistician 651-795-5942 | fax 651-204-7504 Ecolab Research Center | Mail Stop ESC-F4412-A 655 Lone Oak Drive | Eagan, MN 55121-1560

-----Original Message----- From: David Winsemius [mailto:dwinsem...@comcast.net] Sent: Thursday, April 09, 2009 12:17 PM To: Prew, Paul Cc: r-help@r-project.org Subject: Re: [R] Error in saveLog(currentLogFileName

This is just two suggestions and a guess.
a) When you desire the information from sessionInfo, you need to type:

sessionInfo() # not sessionInfo

... results come from function calls, and all you got was the sessionInfo code, produced by the implicit print function to which you gave the argument sessionInfo.

b) Tell us exactly the methods you used to "save my script, my workspace, etc."

c) As a guess, you had the Zelig package loaded under R-Commander yesterday but not today. The right person to ask this question of might be one of those package maintainers. (John Fox is very good about supporting R-Commander.)

David Winsemius, MD Heritage Laboratories West Hartford, CT

CONFIDENTIALITY NOTICE: This e-mail communication an...{{dropped:11}}
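[Archive note] Uwe's fix above (delete the .RData file so R can start cleanly) can also be scripted. The sketch below is an illustration only: the helper name and the ".RData.bak" backup convention are made up, and it assumes the corrupt image is the ".RData" in R's startup directory. Renaming rather than deleting keeps the old objects recoverable from a healthy session via load(".RData.bak").

```r
## Move a (possibly corrupt) workspace image aside so R can start cleanly.
## Assumption: the image is ".RData" in the given directory; to reach a
## prompt at all, first start R with the --no-restore-data flag.
move_workspace_aside <- function(dir = getwd()) {
  f <- file.path(dir, ".RData")
  if (!file.exists(f)) return(FALSE)                 # nothing to do
  file.rename(f, file.path(dir, ".RData.bak"))       # keep a backup
}

## demo on a throwaway directory, not the real working directory
d <- tempfile(); dir.create(d)
file.create(file.path(d, ".RData"))
move_workspace_aside(d)                              # TRUE: image renamed
file.exists(file.path(d, ".RData"))                  # FALSE: startup is clean
```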
[R] Error in saveLog(currentLogFileName
Hello, very basic question from a user who is baffled by the workings of computers in general. When logging off R, a dialog box asked if I wanted to save my log; I chose yes. Then I noticed that the following message appeared in the Command Window:

Error in saveLog(currentLogFileName) : unused argument(s) ("C:/Documents and Settings/prewpj/My Documents/Data/Analyses/Healthcare/Hand Care Panel Test_KMolinaro/soapfeel_ZeligMixedLogit.R")

My attempts to find the parameters that saveLog requires haven't been successful, so I'm wondering if someone on this list could advise me. Yesterday, I saved several files (my script, my workspace, etc.) to the above filepath, but today they are not there. A Windows search of my C: drive using the names of the files came up empty. They didn't show up when I listed the objects from the default R directory, either --- the ls() command didn't list yesterday's files. Further, the sessionInfo output is like nothing I've seen before. Typically, it starts out with "R version 2.8.1 (2008-12-22) i386-pc-mingw32 ..." Thank you, Paul

?saveLog
No documentation for 'saveLog' in specified packages and libraries: you could try '??saveLog'

??saveLog
No help files found with alias or concept or title matching 'saveLog' using fuzzy matching.

help.search(saveLog)
Error in help.search(saveLog) : object 'saveLog' not found

help.search("saveLog")
No help files found with alias or concept or title matching 'saveLog' using fuzzy matching.
sessionInfo
function (package = NULL)
{
    z <- list()
    z$R.version <- R.Version()
    z$locale <- Sys.getlocale()
    if (is.null(package)) {
        package <- grep("^package:", search(), value = TRUE)
        keep <- sapply(package, function(x) x == "package:base" ||
            !is.null(attr(as.environment(x), "path")))
        package <- sub("^package:", "", package[keep])
    }
    pkgDesc <- lapply(package, packageDescription)
    if (length(package) == 0)
        stop("no valid packages were specified")
    basePkgs <- sapply(pkgDesc, function(x) !is.null(x$Priority) &&
        x$Priority == "base")
    z$basePkgs <- package[basePkgs]
    if (any(!basePkgs)) {
        z$otherPkgs <- pkgDesc[!basePkgs]
        names(z$otherPkgs) <- package[!basePkgs]
    }
    loadedOnly <- loadedNamespaces()
    loadedOnly <- loadedOnly[!(loadedOnly %in% package)]
    if (length(loadedOnly)) {
        names(loadedOnly) <- loadedOnly
        pkgDesc <- c(pkgDesc, lapply(loadedOnly, packageDescription))
        z$loadedOnly <- pkgDesc[loadedOnly]
    }
    class(z) <- "sessionInfo"
    z
}
<environment: namespace:utils>

Paul Prew | Statistician 651-795-5942 | fax 651-204-7504 Ecolab Research Center | Mail Stop ESC-F4412-A 655 Lone Oak Drive | Eagan, MN 55121-1560

CONFIDENTIALITY NOTICE: This e-mail communication and any attachments may contain proprietary and privileged information for the use of the designated recipients named above. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message.
Re: [R] Error in saveLog(currentLogFileName
David, thank you for your helpful reply.

a) The sessionInfo goof is actually kind of enlightening. I'm assuming that the purpose of adding the () symbols is to tell R that sessionInfo is a function to be invoked, and leaving () empty says to use the default arguments?

b) In the R GUI, I end my session by using the menu File > Exit. A dialog appears asking "Save workspace image?", and I choose Yes. In fact, I just did it and got a message I'd never seen before: "Console not found", then

Error in structure(.External("dotTclObjv", objv, PACKAGE = "tcltk"), class = "tclObj") : [tcl] invalid command name ".6.1"

Closed that message; another similar message was behind it, but now the command name was ".17.1".

c) Yes, agreed --- John Fox has been helpful to me a couple of times. I like R Commander (it automatically adds those pesky details like () that elude me), but I can't go to CRAN to get packages from it, so I end up back at the GUI to install a package. John has told me that moving back and forth between the GUI and Commander is not how Commander was intended to be used. Perhaps this practice has led to some of the present confusion. Let's say I save a script or history or similar file that I would like to recall at a later time. If I save the file using R Commander, should I only open the file using R Commander? Ditto for the R GUI? Regardless of which platform I'm saving the files with, they seem to have some of the same extensions: .R, .r, .RData, etc.

Thanks, Paul

Paul Prew | Statistician 651-795-5942 | fax 651-204-7504 Ecolab Research Center | Mail Stop ESC-F4412-A 655 Lone Oak Drive | Eagan, MN 55121-1560

-----Original Message----- From: David Winsemius [mailto:dwinsem...@comcast.net] Sent: Thursday, April 09, 2009 12:17 PM To: Prew, Paul Cc: r-help@r-project.org Subject: Re: [R] Error in saveLog(currentLogFileName

This is just two suggestions and a guess.

a) When you desire the information from sessionInfo, you need to type:

sessionInfo() # not sessionInfo ...
results come from function calls, and all you got was the sessionInfo code, produced by the implicit print function to which you gave the argument sessionInfo.

b) Tell us exactly the methods you used to "save my script, my workspace, etc."

c) As a guess, you had the Zelig package loaded under R-Commander yesterday but not today. The right person to ask this question of might be one of those package maintainers. (John Fox is very good about supporting R-Commander.)

On Apr 9, 2009, at 12:18 PM, Prew, Paul wrote:

Hello, very basic question from a user who is baffled by the workings of computers in general. When logging off R, a dialog box asked if I wanted to save my log; I chose yes. Then I noticed that the following message appeared in the Command Window:

Error in saveLog(currentLogFileName) : unused argument(s) ("C:/Documents and Settings/prewpj/My Documents/Data/Analyses/Healthcare/Hand Care Panel Test_KMolinaro/soapfeel_ZeligMixedLogit.R")

My attempts to find the parameters that saveLog requires haven't been successful, so I'm wondering if someone on this list could advise me. Yesterday, I saved several files (my script, my workspace, etc.) to the above filepath, but today they are not there. A Windows search of my C: drive using the names of the files came up empty. They didn't show up when I listed the objects from the default R directory, either --- the ls() command didn't list yesterday's files. Further, the sessionInfo output is like nothing I've seen before. Typically, it starts out with "R version 2.8.1 (2008-12-22) i386-pc-mingw32 ..." Thank you, Paul

?saveLog
No documentation for 'saveLog' in specified packages and libraries: you could try '??saveLog'

??saveLog
No help files found with alias or concept or title matching 'saveLog' using fuzzy matching.

help.search(saveLog)
Error in help.search(saveLog) : object 'saveLog' not found

help.search("saveLog")
No help files found with alias or concept or title matching 'saveLog' using fuzzy matching.
sessionInfo
function (package = NULL)
{
    z <- list()
    z$R.version <- R.Version()
    z$locale <- Sys.getlocale()
    if (is.null(package)) {
        package <- grep("^package:", search(), value = TRUE)
        keep <- sapply(package, function(x) x == "package:base" ||
            !is.null(attr(as.environment(x), "path")))
        package <- sub("^package:", "", package[keep])
    }
    pkgDesc <- lapply(package, packageDescription)
    if (length(package) == 0)
        stop("no valid packages were specified")
    basePkgs <- sapply(pkgDesc, function(x) !is.null(x$Priority) &&
        x$Priority == "base")
    z$basePkgs <- package[basePkgs]
    if (any(!basePkgs)) {
        z$otherPkgs <- pkgDesc[!basePkgs]
        names(z$otherPkgs) <- package[!basePkgs]
    }
    loadedOnly <- loadedNamespaces()
    loadedOnly <- loadedOnly[!(loadedOnly %in% package)]
    if (length(loadedOnly)) {
        names
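[Archive note] The source listing above is exactly what David's point a) describes: typing the bare name printed the function's definition instead of calling it. The distinction is easy to verify at any R console (nothing below is specific to this thread):

```r
## Typing a bare function name yields the function object (and prints its
## source); appending "()" invokes it with default arguments.
is.function(sessionInfo)      # TRUE: the bare name is just the function
si <- sessionInfo()           # the call: returns a classed summary object
class(si)                     # "sessionInfo", with its own print method
```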
[R] repolr output
Hello all, I am unsure of how to interpret the output from a Generalized Estimating Equation analysis of an ordinal response. I hope someone can enlighten me. The analysis was done using the package 'repolr'. The data consist of a Score on a 3-point scale from 56 Subjects after repeatedly washing their hands with soap. Two soap Products were tested; each panelist washed 10 times = 10 Applications. I'm puzzled by some aspects of the model output below from repolr:

* the correlation structure changed from AR1 to Fixed
* additional factors, cuts 1 and 2, were added to the model (is this testing for the ability of the GLM to detect cutpoints for the assumed latent effect?)
* I assume that factor(Product)[T.2] refers to the 2nd level of the Product factor (coded 869), for comparison to the baseline 1st level (coded 143)

Any insight is much appreciated. Thanks, Paul

======= Model ======= This is the model that was fit, using the repolr package:

soapfeel.mod <- repolr(formula = Score ~ factor(Product) * Application,
    subjects = "Subject", data = soap.data, categories = 3,
    times = c(1,2,3,4,5,6,7,8,9,10), corstr = "ar1", tol = 0.001,
    scalevalue = 1, alpha = 0.5, po.test = TRUE, fixed = FALSE)

======= Output =======

summary(soapfeel.mod[["gee"]])

GEE: GENERALIZED LINEAR MODELS FOR DEPENDENT DATA
gee S-function, version 4.13 modified 98/01/27 (1998)

Model:
 Link: Logit
 Variance to Mean Relation: Binomial
 Correlation Structure: Fixed

Call: ogee(formula = formula, id = exdata$exdata$subjects, data = exdata$exdata,
    R = R_mat, b = as.numeric(coeffs), maxiter = 10, family = binomial,
    corstr = "fixed", silent = TRUE, scale.fix = TRUE, scale.value = scalevalue)

Summary of Residuals:
        Min          1Q      Median          3Q         Max
-0.32791175 -0.13163165 -0.05841431 -0.02337869  0.97076089
Coefficients:
                                   Estimate Naive S.E.   Naive z Robust S.E.
factor(cuts)1                    -3.0093220 0.47829379 -6.291786  0.51860442
factor(cuts)2                    -1.3082469 0.41257539 -3.170928  0.40572065
factor(Product)[T.2]             -0.6136790 0.58810180 -1.043491  0.69995927
Application                      -0.1445904 0.07672638 -1.884494  0.09077013
factor(Product)[T.2]:Application  0.2650185 0.09867353  2.685811  0.10336124

                                   Robust z
factor(cuts)1                    -5.8027310
factor(cuts)2                    -3.2245017
factor(Product)[T.2]             -0.8767352
Application                      -1.5929293
factor(Product)[T.2]:Application  2.5640026

Estimated Scale Parameter: 1
Number of Iterations: 1

Working Correlation
         [,1]         [,2]         [,3]         [,4]         [,5]
[1,] 1.00e+00 4.271812e-01 2.375569e-01 1.014798e-01 5.643328e-02
...

======= Data frame =======

str(soap.data)
'data.frame': 560 obs. of 6 variables:
 $ Subject    : int 1 1 1 1 1 1 1 1 1 1 ...
 $ Product    : int 869 869 869 869 869 869 869 869 869 869 ...
 $ Question   : int 2 2 2 2 2 2 2 2 2 2 ...
 $ Application: int 1 2 3 4 5 6 7 8 9 10 ...
 $ Score      : int 3 3 3 3 3 3 3 3 3 3 ...

======= sessionInfo() =======
R version 2.8.1 (2008-12-22) i386-pc-mingw32

locale: LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] tcltk grid stats graphics grDevices datasets utils
[8] methods base

other attached packages:
[1] Rcmdr_1.4-7 car_1.2-12 repolr_1.0 hints_1.0.1-1
[5] gee_4.13-13 effects_2.0-3 nnet_7.2-45 MASS_7.2-45
[9] lattice_0.17-20 geepack_1.0-16

loaded via a namespace (and not attached):
[1] boot_1.2-35 lme4_0.999375-28 Matrix_0.999375-20 tools_2.8.0
[5] Zelig_3.4-1

Paul Prew Statistician Ecolab ESC F44, 655 Lone Oak Drive Eagan, MN 55123
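[Archive note] One reading aid for the output above: the "Robust z" column is just Estimate divided by Robust S.E., and since the model is on the logit scale, exponentiating an estimate gives an odds ratio. A sketch using the product-related coefficients printed above (interpretation only; this does not rerun the model):

```r
## Reproduce the "Robust z" column from the printed GEE output:
## z = Estimate / Robust S.E.  exp(Estimate) gives an odds ratio on the
## proportional-odds (logit) scale.
est <- c(`factor(Product)[T.2]`             = -0.6136790,
         Application                        = -0.1445904,
         `factor(Product)[T.2]:Application` =  0.2650185)
rse <- c(0.69995927, 0.09077013, 0.10336124)
round(est / rse, 4)   # matches the printed -0.8767, -1.5929, 2.5640
exp(est)              # odds ratios for each term
```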
Re: [R] Time-Ordered Clustering
Dear Ingmar, Thank you for your reply; I hope I answer your question. A couple of specific applications I have in mind:

* We work with customers to reduce energy consumption from the use of hot water. Baseline data was gathered at several locations by attaching a temperature sensor downstream from a hot water valve and recording the temperature every two minutes, over 10 days. Example questions: how many times was hot water used? What was the average duration? What were the average temperatures of the hot water?

* We have products that have a 3-stage lifecycle: 1) ramp-up -> 2) steady-state -> 3) end of life. Performance is different in each stage. Data is gathered by attaching sensors to the product and continuously monitoring. Example questions for each stage: What was the average duration? What was the average performance?

I'm not very familiar with Markov processes, and don't know what detail is necessary to specify a transition matrix. The processes have not been regular/predictable/cyclical enough to consider time series analyses. Thanks, Paul

======= Message: 123 Date: Fri, 13 Mar 2009 10:45:45 +0100 From: Ingmar Visser i.vis...@uva.nl Subject: Re: [R] Time-Ordered Clustering To: Prew, Paul paul.p...@ecolab.com Cc: r-help@r-project.org

Dear Paul, Could you be more specific about what you mean here? I don't know the Runger paper, so it's hard to tell what it is that you're looking for. Blatant plug: I developed a package for hidden Markov models called depmixS4 that in some sense does what you want: clustering taking dependencies over time into account by specifying a transition matrix. Similarly, there are other packages that fit similar models; searching for "hidden markov model" provides a number of them. hth, Ingmar Visser

On 12 Mar 2009, at 23:39, Prew, Paul wrote:

Hello All, Does anyone know of a package that performs constraint-based clustering?
Ideally the package could perform Time-Ordered Clustering, a technique applied in a recent journal article by Runger, Nelson, and Harnish (using MS Excel). Quote: "in our specific implementation of constrained clustering, the clustering algorithm remains agglomerative and hierarchical, but observations or clusters are constrained to only join if they are adjacent in time." CRAN searches using variants of cluster and/or constraint and/or time etc. didn't yield anything I could recognize. Thank you, Paul

Paul Prew Ecolab Eagan, MN paul.p...@ecolab.com
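[Archive note] The hot-water questions in this thread (how many uses, how long, how hot) can be answered in base R with run-length encoding once the trace is thresholded. The sketch below uses an invented 10-sample trace and an invented 40-degree cut-off purely for illustration:

```r
## Count hot-water events and their durations from a temperature trace
## sampled every 2 minutes. Trace and 40-degree threshold are made up.
temp <- c(20, 21, 45, 52, 50, 22, 20, 48, 49, 21)  # degrees C, one per 2 min
hot  <- temp > 40                                  # TRUE while hot water runs
r    <- rle(hot)                                   # runs of hot / not-hot
n_events  <- sum(r$values)                         # number of usage events
durations <- r$lengths[r$values] * 2               # event durations, minutes
mean(temp[hot])                                    # average hot-water temperature
```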
[R] Time-Ordered Clustering
Hello All, Does anyone know of a package that performs constraint-based clustering? Ideally the package could perform Time-Ordered Clustering, a technique applied in a recent journal article by Runger, Nelson, and Harnish (using MS Excel). Quote: "in our specific implementation of constrained clustering, the clustering algorithm remains agglomerative and hierarchical, but observations or clusters are constrained to only join if they are adjacent in time." CRAN searches using variants of cluster and/or constraint and/or time etc. didn't yield anything I could recognize. Thank you, Paul

Paul Prew Ecolab Eagan, MN paul.p...@ecolab.com
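[Archive note] The adjacency constraint quoted from Runger et al. is straightforward to prototype in base R. This toy sketch is not a real implementation (the function name is invented, and the merge criterion is simply the gap between cluster means rather than a proper linkage): clusters may only merge with their immediate neighbour in time.

```r
## Toy time-ordered (adjacency-constrained) agglomerative clustering:
## repeatedly merge the closest pair of *time-adjacent* clusters until
## k clusters remain. Returns a cluster label per observation.
constrained_clust <- function(x, k) {
  cl <- as.list(seq_along(x))                 # start: one cluster per point
  while (length(cl) > k) {
    means <- vapply(cl, function(i) mean(x[i]), numeric(1))
    gaps  <- abs(diff(means))                 # neighbour distances only
    j     <- which.min(gaps)                  # closest adjacent pair
    cl[[j]] <- c(cl[[j]], cl[[j + 1]])        # merge clusters j and j+1
    cl[[j + 1]] <- NULL
  }
  rep(seq_along(cl), lengths(cl))
}

x <- c(1.0, 1.1, 0.9, 5.0, 5.2, 9.1, 9.0)     # three obvious time segments
constrained_clust(x, 3)                       # labels the three segments
```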
[R] Windows version of package Ratings?
Hello, Does anyone know whether a Windows version of the Ratings package will be available? Thank you, Paul Prew
[R] compare performance behavior over a lifecycle
Dear all, I would like to characterize the behavior of a certain product over its lifetime. The purpose is to compare the behavior of experimental versions of the product vs. a baseline version. It's known that the behavior of the product follows a roughly trapezoidal pattern: a start-up effect where the performance ramps up, then steady-state performance, and finally a degradation at the end of life. A sensor captures the performance every two minutes.

Is anyone aware of a package that can be used for such a cradle-to-grave comparison? This is not quite survival analysis, because I'm interested in more than just 'length of life'. Interest is in 'length of ramp-up', 'length/amplitude of steady-state', and 'length of degradation'. I've considered Time Series with seasonality or Fourier analysis to deconstruct the behavior into periodic components. However, I'm not aware of TS or Fourier comparisons for contrasting two populations. Such comparisons are probably common in image analysis, but that's a foreign field to me. Thank you, Paul

Paul Prew Ecolab R&D Eagan, MN
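[Archive note] For the trapezoidal pattern described above, a crude first pass needs no special package: threshold the trace against its plateau and read off the three phase lengths. Everything below is illustrative: the trace is simulated, and the 90%-of-plateau cut-off is an arbitrary assumption (real data would want smoothing first, or a formal breakpoint fit, e.g. with the segmented package).

```r
## Rough sketch: phase durations of a trapezoidal performance trace.
## Simulated trace: 20 samples ramp-up, 60 steady, 20 degradation.
perf <- c(seq(0, 10, length.out = 20),   # ramp-up
          rep(10, 60),                   # steady state
          seq(10, 0, length.out = 20))   # end-of-life degradation

plateau <- max(perf)
steady  <- which(perf >= 0.9 * plateau)  # samples at >= 90% of plateau
ramp_up_len <- min(steady) - 1           # samples before reaching plateau
steady_len  <- max(steady) - min(steady) + 1
decay_len   <- length(perf) - max(steady)
c(ramp_up_len, steady_len, decay_len)    # one duration per lifecycle phase
```

Comparing experimental vs. baseline versions would then reduce to comparing these per-unit phase durations (and the plateau level) between the two groups with ordinary two-sample methods.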