Re: [R] drc results differ for different versions
Hello,

Thanks a lot, Marc, for the suggestion to explore the issue a bit more systematically. So I did, and the conclusion is that with the old drc 1.4-2 I get SE=0.003, while with the new drc 1.5-2 I get SE=0.4, irrespective of the R version or of the versions of the packages drc depends on. I hope somebody can help me further; I still have the feeling that an SE of 0.003 on an IC50 of 1.2, for a reasonably good fit, is more realistic than an SE of 0.4 on an IC50 of 1.2.

Regards,
Hans

Here are the results of the following test script:

d <- data.frame(
  dose = c(2.00e-05, 4.00e-06, 8.00e-07, 1.60e-07, 3.20e-08,
           6.40e-09, 1.28e-09, 2.56e-10, 5.10e-11, 1.00e-11,
           2.00e-05, 4.00e-06, 8.00e-07, 1.60e-07, 3.20e-08,
           6.40e-09, 1.28e-09, 2.56e-10, 5.10e-11, 1.00e-11),
  response = c(97.202, 81.670, 47.292, 16.924, 16.832,
               6.832, 11.118, 1.319, 5.495, -3.352,
               102.464, 83.114, 50.631, 22.792, 18.348,
               19.066, 27.794, 14.682, 11.992, 12.868))

m <- drm(response ~ log10(dose * 1e6), data = d,
         fct = l4(fixed = c(NA, NA, NA, NA),
                  names = c("hs", "bottom", "top", "ec50")),
         logDose = 10, control = drmc(useD = TRUE))
summary(m)

RESULTS:

sessionInfo()
R version 2.7.0 (2008-04-22), i386-pc-mingw32
locale:
LC_COLLATE=Dutch_Belgium.1252;LC_CTYPE=Dutch_Belgium.1252;LC_MONETARY=Dutch_Belgium.1252;LC_NUMERIC=C;LC_TIME=Dutch_Belgium.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] drc_1.5-2 plotrix_2.4-2 nlme_3.1-89 MASS_7.2-41 lattice_0.17-6
[6] alr3_1.1.7
loaded via a namespace (and not attached):
[1] grid_2.7.0 tools_2.7.0

RESULT: ec50:(Intercept) = 1.27447, SE = 0.43541
CONCLUSION: R 2.7.0 with recent drc 1.5-2 (older dependencies) gives SE=0.43541

===

R version 2.9.0 (2009-04-17), i386-pc-mingw32
locale:
LC_COLLATE=Dutch_Belgium.1252;LC_CTYPE=Dutch_Belgium.1252;LC_MONETARY=Dutch_Belgium.1252;LC_NUMERIC=C;LC_TIME=Dutch_Belgium.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] drc_1.4-2 plotrix_2.5-5 nlme_3.1-90 MASS_7.2-46
[5] lattice_0.17-22 alr3_1.1.7
loaded via a namespace (and not attached):
[1] grid_2.9.0 tools_2.9.0

RESULT: ec50:(Intercept) = 1.2039e+00
CONCLUSION: R 2.9.0 with old drc 1.4-2 (newer dependencies) gives SE=0.003

Hans,

You have three important factors changing here: the version of R, the version of drc, and the versions of any relevant drc dependencies (alr3, lattice, magic, MASS, nlme, plotrix).

I would first try to install the newer version of drc on the older R system (all else staying the same) and see what you get. Don't run update.packages() here, lest you change other things; just install the newer version of drc. If you get the same results as with the older version, that would point to something changing in R or in one of the package dependencies. If you get a different result, that would point to something changing in drc.

You can also install the old version of drc on your more recent R system to see what you get, which might help to confirm the behavior. The old source version of drc is available from:

http://cran.r-project.org/src/contrib/Archive/drc/

I also found a Windows binary of the old package here:

http://cran.r-project.org/bin/windows/contrib/2.6/drc_1.4-2.zip

I have also copied Christian Ritz, the drc package author/maintainer, so that he may be able to assist you further with the problem.

HTH,

Marc Schwartz

--
This e-mail and its attachment(s) (if any) may contain confidential and/or proprietary information and is intended for its addressee(s) only. Any unauthorized use of the information contained herein (including, but not limited to, alteration, reproduction, communication, distribution or any other form of dissemination) is strictly prohibited. If you are not the intended addressee, please notify the originator promptly and delete this e-mail and its attachment(s) (if any) subsequently.
Neither Galapagos nor any of its affiliates shall be liable for direct, special, indirect or consequential damages arising from alteration of the contents of this message (by a third party) or as a result of a virus being passed on.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
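Marc's archive pointer can also be scripted directly from within R, so that only drc changes and nothing else is touched. A minimal sketch under the assumption that the archived tarball name follows the usual `pkg_version.tar.gz` convention (building from source on Windows requires Rtools; the .zip binary URL above can be used with the same call otherwise):

```r
## Install one specific (archived) source version of drc without
## running update.packages(), so no other package versions change.
archive <- "http://cran.r-project.org/src/contrib/Archive/drc/"
pkg_url <- paste0(archive, "drc_1.4-2.tar.gz")

## commented out here because it downloads and compiles:
# install.packages(pkg_url, repos = NULL, type = "source")
```

Passing `repos = NULL` tells install.packages() to treat the first argument as a file path or URL rather than a package name to resolve against a repository.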
Re: [R] arrangement of crowded labels
Thomas Zumbrunn wrote: ... Thanks for your answers. This was almost what I was looking for, except that I would need something for a 2-dimensional context (my question was not specific enough).

Hi Thomas,

The thigmophobe.labels function simply works out how to place each label on the side of the labeled point that faces away from the nearest other point. The pointLabel function in the maptools package and the spread.labs function in TeachingDemos use more sophisticated algorithms and may work in situations where thigmophobe.labels wouldn't separate all the labels.

Jim
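For reference, the placement idea Jim describes can be sketched in base R alone. This is a toy re-implementation of the heuristic behind plotrix::thigmophobe.labels(), not the package function itself; `away_pos` is an illustrative name:

```r
## Toy version of the "thigmophobe" idea: put each label on the side
## of its point opposite its nearest neighbour.
## pos codes follow text(): 1 = below, 2 = left, 3 = above, 4 = right.
away_pos <- function(x, y) {
  n <- length(x)
  sapply(seq_len(n), function(i) {
    d <- (x - x[i])^2 + (y - y[i])^2
    d[i] <- Inf
    j <- which.min(d)                      # nearest neighbour of point i
    dx <- x[i] - x[j]; dy <- y[i] - y[j]
    if (abs(dx) > abs(dy)) ifelse(dx > 0, 4, 2) else ifelse(dy > 0, 3, 1)
  })
}

set.seed(1)
x <- runif(10); y <- runif(10)
pdf(NULL)                                  # null device, runs headless
plot(x, y)
text(x, y, labels = paste0("p", 1:10), pos = away_pos(x, y))
dev.off()
```

The real thigmophobe.labels(), pointLabel(), and spread.labs() handle label overlap more carefully; this only shows the nearest-neighbour direction trick.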
[R] Confirmatory factor analysis problems using sem package (works in Amos)
Hello all,

I'm trying to replicate a confirmatory factor analysis done in Amos. The idea is to compare a one-factor and a two-factor model. I get the following warning message when I run either model:

Could not compute QR decomposition of Hessian. Optimization probably did not converge.

I have no idea what to do here. I believe other posters have reported the same problem. It seems that the ability to invert the correlation matrix may have something to do with this error, but solve(correl) yields a solution. Here are my correlation matrix and model specifications:

--- R CODE BEGIN

correl <- matrix(
  c( 1.000,    -0.6657822,  0.6702089,  0.7997673, -0.7225454,  0.8992372, -0.8026491, -0.6715168, -0.5781565,
    -0.6657822,  1.000,    -0.5107568, -0.5030886,  0.6971188, -0.6306937,  0.7200848,  0.5121227,  0.4496810,
     0.6702089, -0.5107568,  1.000,     0.7276558, -0.3893792,  0.6043672, -0.7176532, -0.5247434, -0.4670362,
     0.7997673, -0.5030886,  0.7276558,  1.000,    -0.6251056,  0.8164190, -0.6728515, -0.6371453, -0.5531964,
    -0.7225454,  0.6971188, -0.3893792, -0.6251056,  1.000,    -0.7760765,  0.6175124,  0.5567924,  0.4914176,
     0.8992372, -0.6306937,  0.6043672,  0.8164190, -0.7760765,  1.000,    -0.7315507, -0.6625136, -0.5814590,
    -0.8026491,  0.7200848, -0.7176532, -0.6728515,  0.6175124, -0.7315507,  1.000,     0.5910688,  0.5096898,
    -0.6715168,  0.5121227, -0.5247434, -0.6371453,  0.5567924, -0.6625136,  0.5910688,  1.000,     0.8106496,
    -0.5781565,  0.4496810, -0.4670362, -0.5531964,  0.4914176, -0.5814590,  0.5096898,  0.8106496,  1.000),
  nrow = 9, ncol = 9)

rownames(correl) <- c("pvote", "jmposaff", "jmnegaff", "boposaff", "bonegaff",
                      "obama.therm", "mccain.therm", "oddcand.D", "evencand.D")
colnames(correl) <- rownames(correl)

# One Factor Model:
model.all <- specify.model()
  allmeasures -> pvote,        b1, NA
  allmeasures -> obama.therm,  b2, NA
  allmeasures -> mccain.therm, b3, NA
  allmeasures -> jmposaff,     b4, NA
  allmeasures -> jmnegaff,     b5, NA
  allmeasures -> boposaff,     b6, NA
  allmeasures -> bonegaff,     b7, NA
  allmeasures -> oddcand.D,    b8, NA
  allmeasures -> evencand.D,   b9, NA
  allmeasures <-> allmeasures, NA, 1
  pvote <-> pvote,               v1, NA
  obama.therm <-> obama.therm,   v2, NA
  mccain.therm <-> mccain.therm, v3, NA
  jmposaff <-> jmposaff,         v4, NA
  jmnegaff <-> jmnegaff,         v5, NA
  boposaff <-> boposaff,         v6, NA
  bonegaff <-> bonegaff,         v7, NA
  oddcand.D <-> oddcand.D,       v8, NA
  evencand.D <-> evencand.D,     v9, NA

sem.all <- sem(model.all, correl, 1100)
summary(sem.all)

# Two Factor Model:
model.vi <- specify.model()
  verbal -> pvote,        b1, NA
  verbal -> obama.therm,  b2, NA
  verbal -> mccain.therm, b3, NA
  verbal -> jmposaff,     b4, NA
  verbal -> jmnegaff,     b5, NA
  verbal -> boposaff,     b6, NA
  verbal -> bonegaff,     b7, NA
  imp -> oddcand.D,       b8, NA
  imp -> evencand.D,      b9, NA
  imp <-> imp,       NA, 1
  verbal <-> verbal, NA, 1
  pvote <-> pvote,               v1, NA
  obama.therm <-> obama.therm,   v2, NA
  mccain.therm <-> mccain.therm, v3, NA
  jmposaff <-> jmposaff,         v4, NA
  jmnegaff <-> jmnegaff,         v5, NA
  boposaff <-> boposaff,         v6, NA
  bonegaff <-> bonegaff,         v7, NA
  oddcand.D <-> oddcand.D,       v8, NA
  evencand.D <-> evencand.D,     v9, NA
  imp <-> verbal, civ, NA

sem.vi <- sem(model.vi, correl, 1100)
summary(sem.vi)

--- R CODE END

Thanks very much.

-Solomon
[R] Goodness of fit in quantile regression
Dear R users,

I've used the function qr.fit.sfn to estimate a quantile regression on a panel data set. Now I would like to compute a statistic to measure the goodness of fit of this model. Does someone know how I could do that? I could compute a pseudo-R2, but in order to do that I would need the value of the objective function at the optimum, and I don't see how to get this from the function qr.fit.sfn. If someone has any good idea about how to solve this problem I would be most grateful!

Best,
Laura
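One common pseudo-R2 for quantile regression is Koenker and Machado's R1(tau): one minus the ratio of the check-function loss of the fitted model to that of the model with only the tau-th quantile as intercept. If the fitted values are available, the objective value can be recomputed from the residuals directly. A sketch under that assumption; `rho` and `pseudo_R1` are illustrative names, not part of quantreg:

```r
## check (pinball) loss: rho_tau(u) = u * (tau - 1{u < 0})
rho <- function(u, tau) sum(u * (tau - (u < 0)))

## R1(tau) in the spirit of Koenker & Machado (1999):
## 1 - V(full model) / V(quantile-only model)
pseudo_R1 <- function(y, fitted, tau) {
  V_hat   <- rho(y - fitted, tau)             # loss of the fitted model
  V_tilde <- rho(y - quantile(y, tau), tau)   # loss of intercept-only model
  1 - V_hat / V_tilde
}

## tiny example: a perfect fit has zero loss, so R1 = 1
y <- c(1, 2, 3, 4, 5)
pseudo_R1(y, fitted = y, tau = 0.5)
```

The same `rho` sum over the residuals of the sparse fit gives the objective value at the optimum that the pseudo-R2 needs.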
Re: [R] drc results differ for different versions
Hi Hans,

I hope I can resolve your problems below (Marc, thank you very much for cc'ing me on your initial response!). Have a look at the following R lines:

## Fitting the model using drm() (from the latest version)
m1 <- drm(response ~ dose, data = d, fct = LL.4())
summary(m1)
plot(m1)

## Checking the fit by using nls()
## (we have very good guesses for the parameter estimates)
m2 <- nls(response ~ c + (d - c)/(1 + (dose/e)^b), data = d,
          start = list(b = -0.95, c = 10, d = 106, e = 1.2745e-06))
summary(m2)

The standard errors agree quite well. The minor discrepancies between the two fits are attributable to the different numerical approximations of the variance-covariance matrix used in drm() and nls(). So I would use the latest version of 'drc', especially for datasets with really small doses.

One recent change to drm() was to incorporate several layers of scaling prior to estimation (as well as subsequent back-scaling after estimation):

1) scaling of parameters with the same scale as the x axis
2) scaling of parameters with the same scale as the y axis
3) scaling of parameters in optim()

The effect of scaling is to temporarily convert the dataset (and the model) to scales that are more convenient for the estimation procedure. Any feedback on this would be much appreciated. It should therefore not be necessary to manually do any scaling prior to using drm() (like what you did). Compare, for instance, your specification of drm() to mine above.

Is this explanation useful?!

Christian
[R] likelihood
I have a problem related to measuring the likelihood between:

- an observed presence/absence dataset (containing 0 or 1)
- a predicted simulation matrix of the same dimensions (containing values from 0 to 1)

This must be a common problem, but I am struggling to find the answer in the literature. Within the simulation model I have a parameter 'm' which I can alter to find the best fit (maximum likelihood). Currently I just use a 'sum of squares of differences' to measure likelihood, i.e.

likelihood = sum((obs - pred)^2)

It is then very easy to find (using numerical optimisation techniques) the value of 'm' which gives my maximum likelihood (least sum of squares). However, I do not think my likelihood function is the correct one to be using for this purpose. Firstly, if sum of squares is the correct method, maybe I should be taking the square root (makes no difference), and possibly the 'mean' values of the datasets may need to be included in my calculations. More importantly, sum of squares suggests my data are normally distributed, which they are clearly not:

Obs: boolean (0 or 1)
Pred: beta (0 to 1)
Difference: (-1 to 1)

My guess is that I should be using a beta (or uniform) defined likelihood measure, or maybe just a simple transformation. Any help greatly appreciated.

Mark
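If the simulated values can be read as predicted probabilities of presence, a natural alternative to least squares is the Bernoulli log-likelihood (this is a suggestion consistent with the question, not from the original thread; `loglik_bernoulli` and `simulate_pred` are illustrative names):

```r
## Bernoulli log-likelihood of 0/1 observations given predicted
## probabilities; maximise this (or minimise its negative) over m.
## eps guards against log(0) when a prediction hits exactly 0 or 1.
loglik_bernoulli <- function(obs, pred, eps = 1e-12) {
  p <- pmin(pmax(pred, eps), 1 - eps)
  sum(obs * log(p) + (1 - obs) * log(1 - p))
}

obs  <- c(1, 0, 1, 1, 0)
pred <- c(0.9, 0.2, 0.8, 0.6, 0.1)
loglik_bernoulli(obs, pred)

## plugged into a 1-d optimiser over the model parameter m, e.g.
## optimise(function(m) -loglik_bernoulli(obs, simulate_pred(m)),
##          interval = c(0, 1))
## where simulate_pred() stands in for the simulation model.
```

Unlike the sum-of-squares criterion, this treats each cell as a Bernoulli trial with the simulated probability, which matches 0/1 observations directly.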
Re: [R] Behavior of seq with vector from
Hi,

Perhaps you can try this:

seq.weave <- function(froms, by, length, ...) {
  c(matrix(c(sapply(froms, seq, by = by, length = length/2, ...)),
           nrow = length(froms), byrow = TRUE))
}

seq.weave(c(2, 3), by = 3, length = 8)
seq.weave(c(2, 3, 4), by = 2, length = 8)

HTH,
baptiste

On 21 May 2009, at 21:08, Rowe, Brian Lee Yung (Portfolio Analytics) wrote:

Hello, I want to use seq with multiple from values and am getting unexpected (to me) behavior. I'm wondering if this behavior is intentional or not.

seq(2, by=3, length.out=4)
[1] 2 5 8 11
seq(3, by=3, length.out=4)
[1] 3 6 9 12

Now if I want the combined sequence, I thought I could pass in c(2,3), and I get:

seq(c(2,3), by=3, length.out=8)
[1] 2 6 8 12 14 18 20 24

However, the result is not what I expected (i.e. what I wanted):

[1] 2 3 5 6 8 9 11 12

It seems that this is a consequence of vector recycling during the summation in seq.default:

if (missing(to)) from + (0L:(length.out - 1L)) * by

To get the value I want, I am using the following code:

sort(as.vector(apply(array(c(2,3)), 1, seq, by=3, length.out=4)))
[1] 2 3 5 6 8 9 11 12

So two questions:
1. Is seq designed/intended to be used with a vector from argument, and is this the desired behavior?
2. If so, is there a cleaner way of implementing what I want?

Thanks,
Brian
Re: [R] System crash when using surv=T in cph
maxb wrote:

Can someone help me? I am very new to R. I am fitting a Cox model using Frank Harrell's cph, as I want to produce a nomogram. This is what I have done:

Srv <- Surv(time, cens)
f.cox <- cph(Srv ~ v1 + v2 + v3 + v4, x=T, y=T, surv=T)

This is not reproducible for us. Where are the data? What is cph? I presume the one from package Design?

As soon as I press enter, Windows XP crashes. If I remove surv=T, then it works. I have R version 2.9.0.

Have you updated all your packages? Which versions of survival and Design are you using? Is it really Windows that crashes, or just R? See http://cran.r-project.org/web/checks/check_results_Design.html to learn that even the newest version of Design produces WARNINGs. If you can send reproducible code that lets R crash, please send it to the maintainers of the involved packages.

Best,
Uwe Ligges

Is there a way of displaying the parameter estimates (i.e. beta coefficients) and HR when I type anova(f.cox), as this only displays the chi-squared and p-values? Any help or advice on drawing a nomogram will be appreciated.

Thanks in advance,
Max
Re: [R] Paste Strings as logical for functions?
tsunhin wong wrote:

Dear R Users,

I have some dynamic selection rules that I want to pass around for my functions:

rules <- paste("g$TrialList==1 & g$Session==2")

I guess you do not want to paste() at all:

rules <- g$TrialList == 1 & g$Session == 2

Uwe Ligges

myfunction <- function(rules) {
  index <- which(rules)
  anotherFunction(index)
}

However, I can't find a way to pass around these selection rules easily (for subset, for which, etc.). Please let me know if you have some idea. Thank you very much!

- John
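If the rule really does have to travel as a character string (e.g. because it is assembled dynamically), one way is eval(parse(text = ...)) evaluated against the data frame. This is a sketch, not from the thread; the data frame `g` below is made up, and Uwe's suggestion of passing the logical vector itself remains the simpler and safer route:

```r
g <- data.frame(TrialList = c(1, 1, 2), Session = c(2, 3, 2))

## Option 1: pass the logical vector directly (preferred)
keep <- g$TrialList == 1 & g$Session == 2
which(keep)                       # row indices matching the rule

## Option 2: keep the rule as a string and evaluate it with the
## data frame as the environment. Flexible, but eval(parse()) will
## run arbitrary code, so only use it on strings you control.
rule <- "TrialList == 1 & Session == 2"
idx <- which(eval(parse(text = rule), envir = g))
```

Option 2 also works inside functions, since the string can be passed around like any other argument and only evaluated at the point of use.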
Re: [R] step by step debugger in R?
Michael wrote:

Could anybody point me to the latest status of the most user-friendly debugger in R?

I have always been happy with ?debug, ?recover and friends cited there, although you can also find additional software like the debug package.

Best,
Uwe Ligges

How I wish I didn't have to stare at my (long) code for ages, stuck...

Thanks!
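For reference, the base-R workflow Uwe points at, sketched on a throwaway function (`f` is made up for illustration):

```r
f <- function(x) {
  y <- x^2
  log(y - 1)          # NaN whenever x^2 < 1
}

## Step through f() line by line at an interactive prompt:
## n = next statement, c = continue, Q = quit the browser
debug(f)
# f(0.5)              # interactive, so commented out here
undebug(f)

## One-off stepping without leaving the debug flag set:
# debugonce(f)

## Post-mortem: drop into the call stack right after any error
# options(error = recover)
```

`debugonce()` is often the most convenient of the three, since it clears itself after a single call.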
Re: [R] drc results differ for different versions
Yes, thanks, that's very useful.

Apart from checking the fit with nls() as you suggested, I've also used Prism, which gave the following results:

Equation 1
Best-fit values
  BOTTOM       10.96
  TOP         106.4
  LOGEC50      -5.897
  HILLSLOPE     0.9501
  EC50          1.2670e-006
Std. Error
  BOTTOM        2.196
  TOP           9.337
  LOGEC50       0.1439
  HILLSLOPE     0.2270
95% Confidence Intervals
  BOTTOM        6.301 to 15.61
  TOP          86.62 to 126.2
  LOGEC50      -6.202 to -5.592
  HILLSLOPE     0.4689 to 1.431
  EC50          6.2750e-007 to 2.5560e-006
Goodness of Fit
  Degrees of Freedom        16
  R²                         0.9622
  Absolute Sum of Squares  787.5
  Sy.x                       7.015
Data
  Number of X values        20
  Number of Y replicates     1
  Total number of values    20
  Number of missing values   0

In other words: also in line with the drc 1.6-3 and nls() results.

As for the scaling: yes, this is useful, because I can't predict whether concentrations are in molar, micromolar, ...; right now I indeed scaled the dose values manually, and it's better/more robust when the drm function takes care of that. I suppose this also means I don't have to do the log transformation anymore?

Thanks (both of you) for your swift feedback,

Hans

-----Original Message-----
From: Christian Ritz [mailto:r...@life.ku.dk]
Sent: vrijdag 22 mei 2009 11:30
To: Hans Vermeiren
Cc: r-help@r-project.org; marc_schwa...@me.com
Subject: Re: [R] drc results differ for different versions
Re: [R] Behavior of seq with vector from
Rowe, Brian Lee Yung (Portfolio Analytics) wrote:

To get the value I want, I am using the following code:
sort(as.vector(apply(array(c(2,3)), 1, seq, by=3, length.out=4)))
[1] 2 3 5 6 8 9 11 12
So two questions:
1. Is seq designed/intended to be used with a vector from argument, and is this the desired behavior?
2. If so, is there a cleaner way of implementing what I want?

1. Hmm, not really. NA.

2. I'd view it as an outer sum, strung out to a single vector, hence:

c(outer(c(2,3), seq(0,,3,4), "+"))
[1] 2 3 5 6 8 9 11 12

--
Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen
Øster Farimagsgade 5, Entr. B, PO Box 2099, 1014 Cph. K, Denmark
Ph: (+45) 35327918  FAX: (+45) 35327907  (p.dalga...@biostat.ku.dk)
Re: [R] drc results differ for different versions
Yes, you're right: taking logarithms is no longer needed!

Christian
[R] anova leads to an error
Dear R-list,

The following code had been running well over the last months:

exam <- matrix(rnorm(100, 0, 1), 10, 10)
gg <- factor(c(rep("A", 5), rep("B", 5)))
mlmfit <- lm(exam ~ 1)
mlmfitG <- lm(exam ~ gg)
result <- anova(mlmfitG, mlmfit, X = ~0, M = ~1)

Until, all of a sudden, the following error occurred:

Error in apply(abs(sapply(deltassd, function(X) diag(T %*% X %*% t(T)))), :
  dim(X) must have a positive length

I have not kept track of the changes in my R version, so it might have to do with that. Now it is R version 2.9.0 (2009-04-17). Does anybody know more about this error? It would help me a lot!

Thank you very much!

Nils
[R] Changing a legend title to bold font
Dear all,

I have seen a response from Duncan Murdoch which comes close to solving this one, but I can't quite seem to tailor it to fit my needs! I am trying to make just the title in my legend bold, with the legend 'items' in normal typeface. I've tried setting par(font=2) before the legend command, but this makes everything bold. I've also tried font.main=2 in the legend call, but I didn't have any luck with this.

Any suggestions would be most welcome,

Steve
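One approach that works in base graphics, and is consistent with the plotmath trick Duncan Murdoch has suggested in similar threads, is to pass the legend title as an expression wrapped in bold(), leaving the item labels untouched. A minimal sketch (the labels and title are made up):

```r
pdf(NULL)                  # null device, so the example runs anywhere
plot(1:10, 1:10, type = "n")

## bold() in a plotmath expression affects only the title,
## not the legend items
lg <- legend("topright",
             legend = c("group A", "group B"),
             lty = 1:2,
             title = expression(bold("Treatment")))
dev.off()
```

legend() returns (invisibly) a list describing the box and text positions, which is occasionally handy for fine-tuning placement afterwards.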
[R] Cannot Install Cairo Library
Dear All, I am running Debian testing on my box and I have R 2.9.0 installed from the standard repositories. I downloaded the package source from http://cran.r-project.org/web/packages/Cairo/index.html but when I try to install it on my system, this is what I get $ sudo R CMD INSTALL Cairo_1.4-4.tar.gz * Installing to library ‘/usr/local/lib/R/site-library’ * Installing *source* package ‘Cairo’ ... checking for gcc... gcc -std=gnu99 checking for C compiler default output file name... a.out checking whether the C compiler works... yes checking whether we are cross compiling... no checking for suffix of executables... checking for suffix of object files... o checking whether we are using the GNU C compiler... yes checking whether gcc -std=gnu99 accepts -g... yes checking for gcc -std=gnu99 option to accept ISO C89... none needed checking how to run the C preprocessor... gcc -std=gnu99 -E checking for grep that handles long lines and -e... /bin/grep checking for egrep... /bin/grep -E checking for ANSI C header files... yes checking for sys/wait.h that is POSIX.1 compatible... yes checking for sys/types.h... yes checking for sys/stat.h... yes checking for stdlib.h... yes checking for string.h... yes checking for memory.h... yes checking for strings.h... yes checking for inttypes.h... yes checking for stdint.h... yes checking for unistd.h... yes checking for string.h... (cached) yes checking sys/time.h usability... yes checking sys/time.h presence... yes checking for sys/time.h... yes checking for unistd.h... (cached) yes checking for an ANSI C-conforming const... yes checking for pkg-config... /usr/bin/pkg-config checking whether pkg-config knows about cairo... yes checking for configurable backends... 
cairo cairo-ft cairo-pdf cairo-png cairo-ps cairo-xlib cairo-xlib-xrender configure: CAIRO_CFLAGS=-D_REENTRANT -I/usr/include/cairo -I/usr/include/pixman-1 -I/usr/include/freetype2 -I/usr/include/directfb -I/usr/include/libpng12 checking if R was compiled with the RConn patch... no checking cairo.h usability... yes checking cairo.h presence... yes checking for cairo.h... yes checking for PNG support in Cairo... yes checking for ATS font support in Cairo... no configure: CAIRO_LIBS=-lfreetype -lfontconfig -lpng12 -lz -lXrender -lcairo -lX11 checking for library containing deflate... none required checking whether Cairo programs can be compiled... yes checking whether cairo_image_surface_get_format is declared... no checking for FreeType support in cairo... yes checking whether FreeType needs additional flags... no checking wheter libjpeg works... yes checking wheter libtiff works... no configure: creating ./config.status config.status: creating src/Makevars config.status: creating src/cconfig.h ** libs gcc -std=gnu99 -I/usr/share/R/include -D_REENTRANT -I/usr/include/cairo -I/usr/include/pixman-1 -I/usr/include/freetype2 -I/usr/include/directfb -I/usr/include/libpng12 -I. -Iinclude -g -O2 -fpic -g -O2 -c cairobem.c -o cairobem.o gcc -std=gnu99 -I/usr/share/R/include -D_REENTRANT -I/usr/include/cairo -I/usr/include/pixman-1 -I/usr/include/freetype2 -I/usr/include/directfb -I/usr/include/libpng12 -I. -Iinclude -g -O2 -fpic -g -O2 -c cairogd.c -o cairogd.o gcc -std=gnu99 -I/usr/share/R/include -D_REENTRANT -I/usr/include/cairo -I/usr/include/pixman-1 -I/usr/include/freetype2 -I/usr/include/directfb -I/usr/include/libpng12 -I. -Iinclude -g -O2 -fpic -g -O2 -c cairotalk.c -o cairotalk.o gcc -std=gnu99 -I/usr/share/R/include -D_REENTRANT -I/usr/include/cairo -I/usr/include/pixman-1 -I/usr/include/freetype2 -I/usr/include/directfb -I/usr/include/libpng12 -I. 
-Iinclude -g -O2 -fpic -g -O2 -c img-backend.c -o img-backend.o gcc -std=gnu99 -I/usr/share/R/include -D_REENTRANT -I/usr/include/cairo -I/usr/include/pixman-1 -I/usr/include/freetype2 -I/usr/include/directfb -I/usr/include/libpng12 -I. -Iinclude -g -O2 -fpic -g -O2 -c img-jpeg.c -o img-jpeg.o gcc -std=gnu99 -I/usr/share/R/include -D_REENTRANT -I/usr/include/cairo -I/usr/include/pixman-1 -I/usr/include/freetype2 -I/usr/include/directfb -I/usr/include/libpng12 -I. -Iinclude -g -O2 -fpic -g -O2 -c img-tiff.c -o img-tiff.o gcc -std=gnu99 -I/usr/share/R/include -D_REENTRANT -I/usr/include/cairo -I/usr/include/pixman-1 -I/usr/include/freetype2 -I/usr/include/directfb -I/usr/include/libpng12 -I. -Iinclude -g -O2 -fpic -g -O2 -c pdf-backend.c -o pdf-backend.o gcc -std=gnu99 -I/usr/share/R/include -D_REENTRANT -I/usr/include/cairo -I/usr/include/pixman-1 -I/usr/include/freetype2 -I/usr/include/directfb -I/usr/include/libpng12 -I. -Iinclude -g -O2 -fpic -g -O2 -c ps-backend.c -o ps-backend.o gcc -std=gnu99 -I/usr/share/R/include -D_REENTRANT -I/usr/include/cairo -I/usr/include/pixman-1 -I/usr/include/freetype2 -I/usr/include/directfb -I/usr/include/libpng12 -I. -Iinclude -g -O2 -fpic -g -O2 -c svg-backend.c -o svg-backend.o gcc -std=gnu99 -I/usr/share/R/include -D_REENTRANT -I/usr/include/cairo -I/usr/include/pixman-1
Re: [R] Cannot Install Cairo Library
On 22 May 2009 at 13:42, Lorenzo Isella wrote:
| Dear All,
| I am running Debian testing on my box and I have R 2.9.0 installed from
| the standard repositories. I downloaded the package source from
| http://cran.r-project.org/web/packages/Cairo/index.html
| but when I try to install it on my system, this is what I get
|
| $ sudo R CMD INSTALL Cairo_1.4-4.tar.gz
| * Installing to library ‘/usr/local/lib/R/site-library’
| * Installing *source* package ‘Cairo’ ...
| checking for gcc... gcc -std=gnu99
[...]
| gcc -std=gnu99 -I/usr/share/R/include -D_REENTRANT -I/usr/include/cairo
| -I/usr/include/pixman-1 -I/usr/include/freetype2 -I/usr/include/directfb
| -I/usr/include/libpng12 -I. -Iinclude -g -O2 -fpic -g -O2 -c
| xlib-backend.c -o xlib-backend.o
| xlib-backend.c:34:74: error: X11/Intrinsic.h: No such file or directory
                                ^
e...@ron:~$ dpkg -S `locate Intrinsic.h`
libxt-dev: /usr/include/X11/Intrinsic.h
e...@ron:~$
| xlib-backend.c: In function ‘Rcairo_init_xlib’:
| xlib-backend.c:158: warning: implicit declaration of function
| ‘XrmUniqueQuark’
| make: *** [xlib-backend.o] Error 1
| ERROR: compilation failed for package ‘Cairo’
| * Removing ‘/usr/local/lib/R/site-library/Cairo’
|
| Is there a header file missing? Or is there anything wrong with my system?
| Many thanks

Please run 'sudo apt-get install libxt-dev' and try again.

Dirk

--
Three out of two people have difficulties with fractions.
Re: [R] Cannot Install Cairo Library
Dirk Eddelbuettel wrote:

| Is there a header file missing? Or is there anything wrong with my system?

Please run 'sudo apt-get install libxt-dev' and try again.

Perfect! Now Cairo installs nicely.

Many thanks,

Lorenzo
[R] Reading inputs from adjacency matrix
Dear all, I am just wondering if there is a way to read input from an adjacency matrix. I am using the igraph package. I want to load the input directly as an adjacency matrix instead of reading it as an edge list or in Pajek format. Can anyone help?

Thanks, Pankaj Barah
Mathematical Modelling & Computational Biology Group
Centre for Cellular and Molecular Biology, Uppal Road, Hyderabad 500 007, AP, India
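igraph can build a graph straight from an adjacency matrix with graph.adjacency(), with no detour through an edge list. As a sketch under the assumption that only base R is available, the same conversion can also be done by hand with which(..., arr.ind = TRUE); the matrix `adj` below is hypothetical example data:

```r
# hypothetical 3-node directed adjacency matrix
adj <- matrix(c(0, 1, 0,
                0, 0, 1,
                1, 0, 0), nrow = 3, byrow = TRUE)

# rows of which(..., arr.ind = TRUE) are exactly the (from, to) pairs
edges <- which(adj == 1, arr.ind = TRUE)
colnames(edges) <- c("from", "to")
edges

# with igraph installed, the direct route would be:
# g <- graph.adjacency(adj, mode = "directed")
```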
Re: [R] Need a faster function to replace missing data
Tim Clark <mudiver1200 at yahoo.com> writes:
> I need some help in coming up with a function that will take two data sets, determine if a value is missing in one, find a value in the second that was taken at about the same time, and substitute the second value in for where the first should have been.

This is the type of job I would do with a database, not R (alone). The main advantage is that you have to do the cleanup job only once and can retrieve the data in a rather well-documented way later (it's possible with R, I know).

Put the 5-minute data into one table. I would add two new columns giving the delta to the next value, for easier linear interpolation, but that's secondary. Make sure to index the table. Put the 1-second data into another table, adding values rounded to 5 seconds, and giving these an index. From R/ODBC or with RSQLite, make a join of all values in table 1 that have NULL values in the coordinates. If you do not want a linear interpolation, you could even do this within the database and SQL alone. Compute the linear interpolation, and write the data back into the database. If you want to be careful, you might mark the interpolated values as computed in a separate field.

When new data come in at a later time, you can run the procedure again without penalty.

Dieter
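If a database is overkill, the nearest-in-time substitution itself is a few lines of base R. A sketch with made-up vectors (all names and data here are hypothetical, not from the original post):

```r
# coarse series with gaps, fine series taken at roughly the same times
coarse_time <- c(0, 300, 600, 900)            # seconds
coarse_val  <- c(10.1, NA, 9.8, NA)
fine_time   <- seq(0, 900, by = 1)
fine_val    <- sin(fine_time / 300) + 10

# for each missing coarse value, take the fine value nearest in time
miss <- which(is.na(coarse_val))
idx  <- sapply(coarse_time[miss],
               function(t) which.min(abs(fine_time - t)))
coarse_val[miss] <- fine_val[idx]
```

For large series, replacing the sapply() scan with findInterval() on sorted times avoids the quadratic lookup.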
[R] Step by step: Making an R package with Fortran 95
To all, I need your help. The big question is: how do I make an R package with Fortran 95 files? You may assume that I am a pretty decent programmer in both R and Fortran. I lay out a scenario below. I know how to make an ordinary R package and have dabbled with R + Fortran 95 *.dll linking, but I do not have a great handle on the whole Makevars file and whatever else I might need to get things working to build a new R package. By the way, I'm using R 2.8.1 and Windows XP (and sometimes Linux).

Let's take this simple situation. Suppose I'm using Windows XP and place the folder for the package Testme in C:\Program Files\R\R-2.8.1\src\library\Testme

Files and folders:

DESCRIPTION file -- I understand this file.

NAMESPACE file, with the contents:
  useDynLib(Testme)
  exportPattern("^[^\\.]")

man directory: I put my *.Rd files here. I understand this part.

R directory: I put my *.R files here. One of these files calls a Fortran 95 subroutine with .Fortran("MySubroutine", var1=as.double(x), var2=as.double(y), etc.). I understand this part.

src directory: I put my *.f95 files here, plus a Makevars file. Suppose that I have two *.f95 files, named CalculatorModule.f95 and ANiceSubroutine.f95. CalculatorModule.f95 contains a Fortran 95 module. ANiceSubroutine.f95 uses the functions in CalculatorModule.f95 via the USE command. To be consistent with my R file (above), ANiceSubroutine.f95 contains the subroutine MySubroutine.

Finally, from a DOS shell in C:\Program Files\R\R-2.8.1\src\gnuwin32 I issue
  make pkg-Testme
This results in errors. Here are the problems:

1. The order of compilation must be f95 -c CalculatorModule.f95 ANiceSubroutine.f95. Unfortunately, R compiles them alphabetically, so I was given errors making the *.o files. I overcame this problem by renaming the files a01.CalculatorModule.f95 and a02.ANiceSubroutine.f95. I would rather not have to name the files this way if I don't have to! When I did this, R compiled the Fortran files in the correct order, but I still have errors.

2. Here was the error output. Help?

C:\Program Files\R\R-2.8.1\src\gnuwin32> make pkg-Testme
-- Making package Testme
  adding build stamp to DESCRIPTION
  installing NAMESPACE file and metadata
  making DLL ...
g95 -O3 -c a01.CalculatorMod.f95 -o a01.CalculatorMod.o
g95 -O3 -c a02.ANiceSubroutine.f95 -o a02.ANiceSubroutine.o
windres --preprocessor=gcc -E -xc -DRC_INVOKED -I C:/PROGRA~1/R/R-28~1.1/include -i Testme_res.rc -o Testme_res.o
g95 -shared -s -o Testme.dll Testme.def a01.CalculatorMod.o a02.ANiceSubroutine.o Testme_res.o -LC:/PROGRA~1/R/R-28~1.1/bin -lR
ld: Testme.def:6: syntax error
ld: Testme.def: file format not recognized; treating as linker script
ld: Testme.def:2: syntax error
make[3]: *** [Testme.dll] Error 1
make[2]: *** [srcDynlib] Error 2
make[1]: *** [all] Error 2
make: *** [pkg-Testme] Error 2

Thanks for helping!!! :)
[R] How to compute the score-statistics from a logistic model using lrm {Design}
Greetings, is there a way to simply compute the score statistics for a logistic model generated with lrm? For example, I want to compare the Wald statistics for a given model against the score statistics in order to find the relevant predictors.

attach(iris)
model <- lrm(Species ~ ., data = iris)
anova(model)

Wald Statistics   Response: Species

Factor       Chi-Square d.f. P
Sepal.Length 1.06       1    0.3032
Sepal.Width  2.22       1    0.1358
Petal.Length 3.96       1    0.0465
Petal.Width  3.52       1    0.0605
TOTAL        5.56       4    0.2346

So what I basically want is an output of the score statistics, to verify whether they lead to the same predictors as the Wald statistics (anova) and therefore to estimate whether the score statistics would produce a better model. I hope I stated everything clearly. Thanks for your help! Christian
Re: [R] Confirmatory factor analysis problems using sem package (works in Amos)
Dear Solomon,

-----Original Message----- From: S. Messing, Sent: May-22-09 1:27 AM, To: r-help@r-project.org, Subject: [R] Confirmatory factor analysis problems using sem package (works in Amos)

> Hello all, I'm trying to replicate a confirmatory factor analysis done in Amos.

Occasionally in an ill-conditioned problem, one program will produce a solution and another won't. As a general matter, I'd expect Amos to be more robust than sem(), since Amos is written specifically for SEMs, while sem() uses nlm(), a general-purpose optimizer.

> The idea is to compare a one-factor and a two-factor model. I get the following warning message when I run either model: "Could not compute QR decomposition of Hessian. Optimization probably did not converge." I have no idea what to do here.

A general strategy is to set debug=TRUE in the call to sem() and see what happens in the optimization. Then there are several things that you can do to try to get the optimization to converge; see ?sem. In this case, however, I wasn't able to get a solution.

The one-factor model is equivalent to a one-factor exploratory FA, which can be fit by ML using factanal():

factanal(factors=1, covmat=correl, n.obs=1100)

Call: factanal(factors = 1, covmat = correl, n.obs = 1100)

Uniquenesses:
       pvote     jmposaff     jmnegaff     boposaff     bonegaff  obama.therm mccain.therm    oddcand.D   evencand.D
       0.100        0.496        0.497        0.277        0.397        0.129        0.312        0.466        0.585

Loadings:
             Factor1
pvote        -0.949
jmposaff      0.710
jmnegaff     -0.709
boposaff     -0.850
bonegaff      0.777
obama.therm  -0.934
mccain.therm  0.830
oddcand.D     0.731
evencand.D    0.645

               Factor1
SS loadings      5.744
Proportion Var   0.638

Test of the hypothesis that 1 factor is sufficient. The chi square statistic is 1710.03 on 27 degrees of freedom. The p-value is 0.

As you can see, the one-factor model fits the data very poorly (as does a two-factor EFA); I suspect, but am not sure, that this is the source of the problem in sem(). I couldn't get a solution from sem() even when I used the factanal() solution as start values. I believe other posters have reported the same problem; in almost all cases their models weren't properly specified, which is not the case here. Here, the model just doesn't fit the data.

> It seems that the ability to invert the correlation matrix may have something to do with this error, but solve(correl) yields a solution.

No, the input correlation matrix is positive-definite. sem() would have complained if it were not:

eigen(correl, only.values=TRUE)
$values
[1] 6.12561630 0.82418329 0.71616585 0.51263750 0.24467315 0.18248909 0.17024374
[8] 0.13905585 0.08493524

I'll keep your problem as a test case to see whether I can produce a solution, possibly using a different optimizer -- as I mentioned, sem() uses nlm().

Regards, John

Here are my correlation matrix and model specifications:

--- R CODE BEGIN
library(sem)
correl <- matrix(c(
   1.000,     -0.6657822,  0.6702089,  0.7997673, -0.7225454,  0.8992372, -0.8026491, -0.6715168, -0.5781565,
  -0.6657822,  1.000,     -0.5107568, -0.5030886,  0.6971188, -0.6306937,  0.7200848,  0.5121227,  0.4496810,
   0.6702089, -0.5107568,  1.000,      0.7276558, -0.3893792,  0.6043672, -0.7176532, -0.5247434, -0.4670362,
   0.7997673, -0.5030886,  0.7276558,  1.000,     -0.6251056,  0.8164190, -0.6728515, -0.6371453, -0.5531964,
  -0.7225454,  0.6971188, -0.3893792, -0.6251056,  1.000,     -0.7760765,  0.6175124,  0.5567924,  0.4914176,
   0.8992372, -0.6306937,  0.6043672,  0.8164190, -0.7760765,  1.000,     -0.7315507, -0.6625136, -0.5814590,
  -0.8026491,  0.7200848, -0.7176532, -0.6728515,  0.6175124, -0.7315507,  1.000,      0.5910688,  0.5096898,
  -0.6715168,  0.5121227, -0.5247434, -0.6371453,  0.5567924, -0.6625136,  0.5910688,  1.000,      0.8106496,
  -0.5781565,  0.4496810, -0.4670362, -0.5531964,  0.4914176, -0.5814590,  0.5096898,  0.8106496,  1.000),
  nrow=9, ncol=9)
rownames(correl) <- c("pvote", "jmposaff", "jmnegaff", "boposaff", "bonegaff",
                      "obama.therm", "mccain.therm", "oddcand.D", "evencand.D")
colnames(correl) <- c("pvote", "jmposaff", "jmnegaff", "boposaff", "bonegaff",
                      "obama.therm", "mccain.therm", "oddcand.D", "evencand.D")

# One Factor Model:
model.all <- specify.model()
allmeasures -> pvote, b1, NA
allmeasures -> obama.therm,
Re: [R] survfit, summary, and survmean (was Changelog for survival package)
> Further, I appreciate your new function survmean(). At the moment it seems to be intended as internal, and it is not documented in the help.

The computations done by print.survfit are now a part of the results returned by summary.survfit. See 'table' in the output list of ?summary.survfit. Both call an internal survmean() function to ensure that any future updates stay in synchrony. This was a perennial (and justified) complaint about print.survfit: per the standard, print(x) always returns x, so there was no way to get the results of the print as an S object.

Terry
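Terry's pointer can be exercised with the lung data that ships with the survival package. A minimal sketch (the single-group fit below is only an illustration):

```r
library(survival)

# one-sample Kaplan-Meier fit on the built-in lung data
fit <- survfit(Surv(time, status) ~ 1, data = lung)

# the numbers print.survfit used to show are now returned here
tab <- summary(fit)$table
tab["median"]   # median survival time, as a named value
```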
Re: [R] System crash when using surv=T in cph
maxb wrote:
> Can someone help me? I am very new to R. I am fitting a Cox model using Frank Harrell's cph, as I want to produce a nomogram. This is what I have done:
>
> Srv <- Surv(time, cens)
> f.cox <- cph(Srv ~ v1+v2+v3+v4, x=T, y=T, surv=T)
>
> As soon as I press enter, Windows XP crashes. If I remove surv=T, then it works. I have R version 2.9.0. Is there a way of displaying the parameter estimates (i.e. beta coefficients) and HR when I type anova(f.cox), as this only displays the chi-squared and p-values? Any help or advice on drawing a nomogram will be appreciated. Thanks in advance, Max

Until Design is updated (which will be very soon), source('http://biostat.mc.vanderbilt.edu/tmp/cphnew.s') after library(Design). anova is not supposed to display parameter estimates; for one thing, there is often more than one parameter associated with a term in the model. Use summary(fit) to get hazard ratios for sensible covariate ranges.

Frank

-- Frank E Harrell Jr, Professor and Chair, School of Medicine, Department of Biostatistics, Vanderbilt University
Re: [R] Class for time of day?
On Thu, May 21, 2009 at 8:28 PM, Gabor Grothendieck <ggrothendi...@gmail.com> wrote:
> It uses hours/minutes/seconds for values < 1 day and uses days and fractions of a day otherwise.

Yes, my examples were documenting this idiosyncrasy.

> For values and operations that it has not considered it falls back to the internal representation.

Yes, my examples were documenting this bad behavior.

> Most of your examples start to make sense once you realize this.

Of course I realize this. That was the point of my examples. I understand perfectly well what is *causing* the bad behavior. That doesn't make it less bad. What is the point of a class system if functions ignore the class and perform meaningless calculations on the internal representation?

-s
Re: [R] postscript problems (landscape orientation)
On May 21, 2009, at 6:31 PM, Ted Harding wrote:
On 21-May-09 23:02:28, David Scott wrote:
> Well most people deal with that problem by not using Acrobat to read .pdf files. On Linux you can use evince or xpdf. On Windows just use gsview32. Those readers don't lock the .pdf. I am with Peter and generally go straight to pdf these days. The only reason for going through PostScript is if you want to use psfrag. David Scott

Going off at a tangent to the original query, I would say that this is too limited a view! For one thing, PostScript (in its Encapsulated manifestation) is readily imported and re-scalable by software which does not import PDF. Also, PS is an editable plain-text file (even though there may be chunks in hexadecimal for some graphics -- but it's still ASCII). One thing which I quite often do is edit the %%BoundingBox: line to improve the framing of the graphic -- and viewing it in ghostscript with watch mode on as one edits, one can easily adjust things to a satisfactory visual effect. If you know what you are doing, you can, if you wish, move things around, or add or delete things (especially bits of text), using any plain-text editor on the PostScript file. Finally (though this may be a symptom of serious masochism on my part), if I download a PDF in which the author has plotted the data, after I print to file in PostScript from Acrobat Reader I can usually obtain a very close approximation to the original data values by extracting the PS coordinates of the plotted points (and axis tick-marks) from the PostScript file.

The only reason for going through PostScript is if you want to use psfrag -- or psnup and/or psbook or ... PSTricks (http://www.tug.org/PSTricks/), which I use for creating flow-chart types of figures, such as subject disposition charts in clinical trials. In Sweave, I can then fill in text labels in the various boxes using \Sexpr with counts, etc.
Examples of use here: http://www.tug.org/PSTricks/main.cgi?file=examples

On OS X I find that OS X's Preview works quite well in place of Adobe Reader, save for certain animations in PDF files. Also, for those still reading this thread who are on OS X, if you are not aware, there are additional QuickLook plugins for EPS files and other such things: http://www.quicklookplugins.com/

HTH, Marc Schwartz
Re: [R] Step by step: Making an R package with Fortran 95
On 5/22/2009 8:18 AM, sjnovick wrote:
> [...] Suppose I'm using Windows XP and place the folder for the package Testme in C:\Program Files\R\R-2.8.1\src\library\Testme

I'd put your own files in their own directory, rather than in the R source tree. But this shouldn't have caused your error...

> src directory: I put my *.f95 files here. Also, I put a Makevars file here.

What is in Makevars? You don't need this unless you're doing something special. In your case, you might be; see below.

> 1. The order of compilation must be f95 -c CalculatorModule.f95 ANiceSubroutine.f95. Unfortunately, R compiles them alphabetically.

You could put the dependency

  ANiceSubroutine.o: ANiceSubroutine.f95 CalculatorModule.o

into your Makevars file, but see the instructions in Writing R Extensions if you add targets there.

> I overcame this problem by renaming the files a01.CalculatorModule.f95 and a02.ANiceSubroutine.f95. I would rather not have to name the files this way if I don't have to! When I did this, R compiled the Fortran files in the correct order, but I still have errors.
>
> 2. Here was the error output. Help?
> [g95/windres build transcript as above, ending in:]
> ld: Testme.def:6: syntax error
> ld: Testme.def: file format not recognized; treating as linker script
> ld: Testme.def:2: syntax error
> make[3]: *** [Testme.dll] Error 1

Did you create the Testme.def file? What is in it?

Duncan Murdoch
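The dependency line Duncan suggests would live in src/Makevars. A sketch, using the file names from the thread (whether extra targets interact with R's own rules depends on the R version -- check Writing R Extensions before relying on it):

```make
# src/Makevars -- sketch only; file names taken from the thread above.
# Make sure the Fortran module is compiled before the file that USEs it.
ANiceSubroutine.o: ANiceSubroutine.f95 CalculatorModule.o
```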
[R] Query regarding na.omit function
Hi friends, I have a query regarding the na.omit function. Please, someone help me. I have a function

xyz_function <- function(arguments)
{
  # some code
  return(list(matrix=dataset))
}
xyz_function_returnvalue <- xyz_function(passed arguments)

*Case I*
xyz_function_returnvalue_deletingNArows <- na.omit(xyz_function_returnvalue)

*Case II*
xyz_function_returnvalue_dataframe_deletingNArows <- na.omit(as.data.frame(pair_raw_data$matrix))

Neither Case I nor Case II works. My dataset has rows with NA values, that's for sure. Whereas this simple code works fine -- I mean, the rows containing NA values are easily deleted from the data frame:

DF <- data.frame(x = c(1, 2, 3), y = c(0, 10, NA))
DF <- na.omit(DF)
print(DF)

-- Thanks, Moumita
[R] Forcing a variable into a model using stepAIC
Dear All, I am attempting to use forward and/or backward selection to determine the best model for the variables I have. Unfortunately, because I am dealing with patients and every patient is receiving treatment, I need to force the variable for treatment into the model. Is there a way to do this using R? (Additionally, the model is stratified by randomisation period.) I know that SAS can be used to do this, but my SAS coding is poor and consequently it would be easier for me to use R, especially given the fractional polynomial transformations! Currently the model is as follows (without treatment):

coxfita <- coxph(Surv(rem.Remtime, rem.Rcens) ~ sind(nearma) + fsh(nearma) + fdr(nearma) + th1(nearma) + th2(nearma) + fp(cage) + fp(fint) + fp(tsb) + strata(rpa), data = nearma)

Thank you for your help, Laura
Re: [R] Query regarding na.omit function
We have no idea of the structure of 'dataset' that is being returned by your function. This example works fine for data frames:

x <- list(matrix=data.frame(x = c(1, 2, 3), y = c(0, 10, NA)))
x
$matrix
  x  y
1 1  0
2 2 10
3 3 NA

na.omit(x$matrix)
  x  y
1 1  0
2 2 10

na.omit(as.data.frame(x$matrix))
  x  y
1 1  0
2 2 10

You might also look at 'complete.cases', but a real example would be helpful in providing information.

On Fri, May 22, 2009 at 9:41 AM, Moumita Das <das.moumita.onl...@gmail.com> wrote:
> [...]

-- Jim Holtman, Cincinnati, OH, +1 513 646 9390
What is the problem that you are trying to solve?
Re: [R] Class for time of day?
You could create a subclass of "times" with its own print and format methods that print and format hours/minutes/seconds even for values greater than one day, if that is the main item you need. Regarding division, you could contribute that to the chron package; I've contributed a few missing items and they were incorporated.

Giving an error when it does not understand something would be dangerous, as it could break much existing code, so that would probably not be possible at this stage. The idea of defaulting to internal representations is based on the idea that you get many features for free, since the way the internal representations work gives the right answer in many cases. It's best to stick with the implicit philosophy that underlies a package. If you want a different philosophy, then it's really tantamount to creating a new package. I don't think that one is right and the other wrong; they simply represent different viewpoints.

On Fri, May 22, 2009 at 9:33 AM, Stavros Macrakis <macra...@alum.mit.edu> wrote:
> [...]
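Gabor's first suggestion -- a subclass with its own format method -- can be sketched in plain S3, without chron. The class name "hms" and the helpers below are hypothetical, not part of any package:

```r
# hypothetical "hms" class: a number of days, always shown as H:M:S,
# even when the value exceeds one day
as.hms <- function(days) structure(days, class = "hms")

format.hms <- function(x, ...) {
  secs <- round(unclass(x) * 86400)
  sprintf("%02d:%02d:%02d",
          secs %/% 3600, (secs %% 3600) %/% 60, secs %% 60)
}

print.hms <- function(x, ...) { print(format(x)); invisible(x) }

format(as.hms(1.5))   # 36 hours, not "1.5 days"
```

For chron itself the same idea would mean subclassing "times" and writing format.<yourclass> so dispatch reaches it before the chron methods.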
Re: [R] Forcing a variable into a model using stepAIC
On 5/22/2009 9:58 AM, Laura Bonnett wrote:
> [...]

See the scope argument to stepAIC in the MASS package. You can specify a formula in the 'lower' component of scope which includes the treatment variable. That will force the treatment variable to remain in every model examined in the stepwise search.

-- Chuck Cleland, Ph.D., NDRI, Inc. (www.ndri.org), 71 West 23rd Street, 8th floor, New York, NY 10010; tel: (212) 845-4495 (Tu, Th); tel: (732) 512-0171 (M, W, F); fax: (917) 438-0894
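The scope mechanism Chuck describes can be sketched with an ordinary lm fit (MASS ships with R; the data frame and variable names below are made up, not from Laura's model):

```r
library(MASS)

set.seed(1)
# hypothetical trial data: 'treat' must stay in every candidate model
d <- data.frame(y  = rnorm(100),
                treat = gl(2, 50),
                x1 = rnorm(100),
                x2 = rnorm(100))

full <- lm(y ~ treat + x1 + x2, data = d)

# 'lower' forces treat to survive the stepwise search
fit <- stepAIC(full,
               scope = list(lower = ~ treat,
                            upper = ~ treat + x1 + x2),
               trace = FALSE)
```

The same scope argument should carry over to a coxph fit, since stepAIC works from the model's terms.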
Re: [R] Query regarding na.omit function
Hi Jim,

xyz_function <- function(arguments)
{
  # some code
  pair_raw_data <- changeMissingValuestoNA(pair_raw_data$matrix, pair_raw_data$dim, missing_values)
  return(list(matrix=pair_raw_data))
}

The type of my dataset was a list. I tried checking that, changing my dataset to some other types.

Thanks, Moumita
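A likely explanation for Case I in this thread is that na.omit() is being applied to the returned list itself rather than to the data frame stored in its `matrix` component. A sketch with made-up data:

```r
# what the function returns: a list wrapping a data frame
res <- list(matrix = data.frame(x = 1:3, y = c(0, 10, NA)))

# na.omit() on the list does nothing useful here;
# it must be applied to the data-frame component
cleaned <- na.omit(res$matrix)
nrow(cleaned)   # the NA row is gone
```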
[R] EM algorithm: mixture of multivariate Gaussians
Hi, I would like to know if someone has ever written code to estimate the parameters (mixing proportion, means, var/cov matrices) of a mixture of two multivariate normal distributions. I wrote it and it works (it can find the means and mixing proportion if I fix the var/cov matrices), while if I fix nothing, it doesn't work. My suspicion is that something goes wrong when the algorithm iterates the var/cov matrices, but I don't know how to solve this problem. I enclose my script; if someone would give it a look, I would appreciate it very much. Thank you, Daniele

#
# Start Script
#
library(mvtnorm)
library(scatterplot3d)
library(MASS)

n <- 100
m1 <- c(5, -5)
m2 <- c(-3, 3)
s1 <- matrix(c(2, 1, 1, 3), 2, 2)
s2 <- matrix(c(4, 1, 1, 6), 2, 2)
alpha <- 0.3
c1 <- mvrnorm(round(n * alpha), m1, s1)
c2 <- mvrnorm(round(n * (1 - alpha)), m2, s2)
allval <- rbind(c1, c2)
x <- allval[sample(n, n), ]

mixm <- function(x, m1, m2, s1, s2, alpha) {
  f1 <- dmvnorm(x, m1, s1, log = FALSE)
  f2 <- dmvnorm(x, m2, s2, log = FALSE)
  alpha * f1 + (1 - alpha) * f2
}

plot(x)
den <- mixm(x, m1, m2, s1, s2, alpha)
scatterplot3d(x[, 1], x[, 2], den, highlight.3d = TRUE, col.axis = "blue",
              col.grid = "lightblue", angle = 120, pch = 20)

em2mn <- function(y, itermax = 200) {
  n <- length(y[, 1])
  tau <- matrix(0, n, 2)
  eps <- 0.0001
  mu01 <- c(0, 0)
  mu02 <- c(0, 0)
  # 1. initialize the values
  alpha0 <- runif(1, 0, 1)
  for (j in 1:2) {
    mu01[j] <- runif(1, min = quantile(y[, j], probs = 0.25),
                        max = quantile(y[, j], probs = 0.75))
    mu02[j] <- runif(1, min = quantile(y[, j], probs = 0.25),
                        max = quantile(y[, j], probs = 0.75))
  }
  sd01 <- var(y[1:round(n / 2), ])
  sd02 <- var(y[round(n / 2):n, ])
  # first iteration
  f1 <- dmvnorm(y, mu01, sd01, log = FALSE)
  f2 <- dmvnorm(y, mu02, sd02, log = FALSE)
  p <- alpha0 * f1 + (1 - alpha0) * f2
  scatterplot3d(y[, 1], y[, 2], p, highlight.3d = TRUE, col.axis = "blue",
                col.grid = "lightblue", main = "initial mixture density",
                angle = 120, pch = 20)
  # initial log-likelihood
  L1 <- sum(log(p))
  alpha <- alpha0
  mu1 <- mu01; mu2 <- mu02
  sd1 <- sd01; sd2 <- sd02
  for (iter in 1:itermax) {
    # E-step
    tau[, 1] <- alpha * f1 / p
    tau[, 2] <- (1 - alpha) * f2 / p
    # M-step
    alpha <- mean(tau[, 1])
    mu1 <- colSums(tau[, 1] * y) / sum(tau[, 1])
    mu2 <- colSums(tau[, 2] * y) / sum(tau[, 2])
    # note: y - mu1 would recycle mu1 down the columns;
    # sweep() subtracts the mean from each column correctly
    ycen1 <- sweep(y, 2, mu1)
    ycen2 <- sweep(y, 2, mu2)
    cov1 <- matrix(0, 2, 2)
    cov2 <- matrix(0, 2, 2)
    for (i in 1:n) {
      cov1 <- cov1 + tau[i, 1] * (ycen1[i, ] %*% t(ycen1[i, ]))
      cov2 <- cov2 + tau[i, 2] * (ycen2[i, ] %*% t(ycen2[i, ]))
    }
    sd1 <- cov1 / sum(tau[, 1])
    sd2 <- cov2 / sum(tau[, 2])
    f1 <- dmvnorm(y, mu1, sd1, log = FALSE)
    f2 <- dmvnorm(y, mu2, sd2, log = FALSE)
    p <- alpha * f1 + (1 - alpha) * f2
    L <- sum(log(p))
    cat(iter, L, "\t", alpha, "\t", mu1, "\t", mu2, "\n")
    if (abs(L1 - L) < eps) break
    L1 <- L
  }
  denfin <- mixm(y, mu1, mu2, sd1, sd2, alpha)
  scatterplot3d(y[, 1], y[, 2], denfin, highlight.3d = TRUE, col.axis = "blue",
                col.grid = "lightblue", main = "final mixture density",
                angle = 120, pch = 20)
  list(alpha0 = alpha0, alpha = alpha, mu1 = mu1, mu2 = mu2, sd1 = sd1, sd2 = sd2)
}

em2mn(x)
## end script ##

-- Dr. Daniele Riggi, PhD student, University of Milano-Bicocca, Department of Statistics, Building U7, Via Bicocca degli Arcimboldi, 8, 20126 Milano, Italy. cell. +39 328 3380690, mailto: daniele.ri...@gmail.com
[R] bug in rpart?
Greetings, I checked the Indian diabetes data again and got one tree for the data with reordered columns and another tree for the original data. I compared these two trees: the split points are exactly the same, but the fitted classes differ for some cases, and the misclassification errors differ too. I know how CART deals with ties --- even using the same data, the subjects sent left and right need not be the same if we merely rearrange the order of covariates. But the problem is that the fitted trees have exactly the same split points. Shouldn't we get the same fitted values if the decisions are the same at each step? Why do trees with the same structure have different observations in the nodes? The source code for running the diabetes data example and the output of the trees are attached. Your professional opinion is very much appreciated.

library(mlbench)
data(PimaIndiansDiabetes2)
mydata <- PimaIndiansDiabetes2
library(rpart)
fit2 <- rpart(diabetes ~ ., data = mydata, method = "class")
plot(fit2, uniform = TRUE, main = "CART for original data")
text(fit2, use.n = TRUE, cex = 0.6)
printcp(fit2)
table(predict(fit2, type = "class"), mydata$diabetes)
## misclassification table: rows are fitted class
##      neg pos
##  neg 437  68
##  pos  63 200
pmydata <- data.frame(mydata[, c(1, 6, 3, 4, 5, 2, 7, 8, 9)])
fit3 <- rpart(diabetes ~ ., data = pmydata, method = "class")
plot(fit3, uniform = TRUE, main = "CART after exchanging mass and glucose")
text(fit3, use.n = TRUE, cex = 0.6)
printcp(fit3)
table(predict(fit3, type = "class"), pmydata$diabetes)
## after exchanging the order of body mass and plasma glucose
##      neg pos
##  neg 436  64
##  pos  64 204

Best, -- -- Yuanyuan Huang Email: sunnyua...@gmail.com ReorderedTree.pdf Description: Adobe PDF document OriginalTree.pdf Description: Adobe PDF document __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
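A quick way to pursue the question in the post above is to compare the two sets of fitted classes case by case. This is a sketch, not code from the post; it assumes fit2 and fit3 from the script above have already been fitted.

```r
# Locate the observations on which the two structurally identical trees disagree.
pred2 <- predict(fit2, type = "class")
pred3 <- predict(fit3, type = "class")
disagree <- which(pred2 != pred3)  # row indices of differing fitted classes
length(disagree)                   # how many observations differ
mydata[disagree, ]                 # inspect them directly
```

Listing the disagreeing rows makes it possible to check whether they share something (for example missing values or observations that sit exactly on a split point), which is where tie-breaking differences would show up.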
[R] Returning only a file path on Windows
I am choosing a file like this:

#Bring up file selection box
fn <- file.choose()
fp <- file.path(fn, fsep = '\\')

Unfortunately, the file path contains the short file name and extension as well. I had hoped to get only the path, so I could make my own long file names (for output graphs) by concatenation with this file path. Of course I can split the string and assemble the components from the returned list:

fp <- strsplit(fp, '\\', fixed = TRUE)

But there must be a better way? Thanks in advance, Alex van der Spek
Re: [R] Behavior of seq with vector from
So if I want to concatenate the output of multiple seq calls, there's no clear way to do this? For background, I have a number of data.frames with the same structure in a list. I want to 'collapse' the list into a single data.frame but only keeping certain columns from each underlying data.frame. In my example below, I want to keep columns 2 and 3 in each underlying data.frame. I'm using do.call('cbind', my.list) and then using the statement below to extract only the columns I need (other details omitted for brevity). If there's a built-in or pre-built function to do this, I'm all eyes. Brian PS if this is unclear, flame away, and I'll post some code

-----Original Message----- From: Peter Dalgaard [mailto:p.dalga...@biostat.ku.dk] Sent: Friday, May 22, 2009 6:20 AM To: Rowe, Brian Lee Yung (Portfolio Analytics) Cc: r-help@r-project.org Subject: Re: [R] Behavior of seq with vector from

Rowe, Brian Lee Yung (Portfolio Analytics) wrote: To get the value I want, I am using the following code:

sort(as.vector(apply(array(c(2,3)), 1, seq, by = 3, length.out = 4)))
[1] 2 3 5 6 8 9 11 12

So two questions: 1. Is seq designed/intended to be used with a vector 'from' argument, and is this the desired behavior? 2. If so, is there a cleaner way of implementing what I want?

1. Hmm, not really. NA. 2. I'd view it as an outer sum, strung out to a single vector, hence:

c(outer(c(2,3), seq(0, , 3, 4), "+"))
[1] 2 3 5 6 8 9 11 12

-- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - (p.dalga...@biostat.ku.dk) FAX: (+45) 35327907 -- This message w/attachments (message) may be privileged, confidential or proprietary, and if you are not an intended recipient, please notify the sender, do not use or share it and delete it.
Unless specifically indicated, this message is not an offer to sell or a solicitation of any investment products or other financial product or service, an official confirmation of any transaction, or an official statement of Merrill Lynch. Subject to applicable law, Merrill Lynch may monitor, review and retain e-communications (EC) traveling through its networks/systems. The laws of the country of each sender/recipient may impact the handling of EC, and EC may be archived, supervised and produced in countries other than the country in which you are located. This message cannot be guaranteed to be secure or error-free. References to Merrill Lynch are references to any company in the Merrill Lynch Co., Inc. group of companies, which are wholly-owned by Bank of America Corporation. Securities and Insurance Products: * Are Not FDIC Insured * Are Not Bank Guaranteed * May Lose Value * Are Not a Bank Deposit * Are Not a Condition to Any Banking Service or Activity * Are Not Insured by Any Federal Government Agency. Attachments that are part of this E-communication may have additional important disclosures and disclaimers, which you should read. This message is subject to terms available at the following link: http://www.ml.com/e-communications_terms/. By messaging with Merrill Lynch you consent to the foregoing. --
[R] fitting Autoregressive Conditional Duration and Cox Proportional Hazard model in R
Hi all, Could anybody point me to some existing code in R for fitting the Autoregressive Conditional Duration (ACD) model and the Cox proportional hazards model, with model selection and model specification tests? Thank you!
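For the Cox part of the question, the survival package covers both fitting and a standard specification check. The snippet below is ordinary survival-package usage offered as a pointer, not code from the post (I am not aware of a base-R routine for ACD models).

```r
# Minimal Cox proportional hazards example with the survival package.
library(survival)
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)
summary(fit)    # coefficients, hazard ratios, Wald and likelihood-ratio tests
cox.zph(fit)    # Schoenfeld-residual test of the proportional-hazards assumption
```

Model selection can then proceed via AIC (e.g. step() on the coxph fit) or likelihood-ratio comparisons of nested models with anova().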
Re: [R] step by step debugger in R?
Really, I think that if there were a Visual Studio-strength debugger, our collective time spent developing R code would be greatly reduced. 2009/5/22 Uwe Ligges lig...@statistik.tu-dortmund.de: Michael wrote: Could anybody point me to the latest status of the most user-friendly debugger in R? I have always been happy with ?debug, ?recover and friends cited there, although you can also find additional software like the debug package ... Best, Uwe Ligges How I wish I didn't have to stare at my (long) code for ages, stuck... Thanks!
[R] EM algorithm mixture of multivariate
Hi, I would like to know whether someone has ever written code to estimate the parameters (mixing proportion, means, var/cov matrices) of a mixture of two multivariate normal distributions. I wrote it and it works (it can find the means and the mixing proportion if I fix the var/cov matrices), but if I fix nothing it doesn't work. My suspicion is that something goes wrong when the algorithm iterates the var/cov matrices, but I don't know how to solve this problem. I enclose my script; if someone would give it a look, I would appreciate it very much. thank you daniele

#
# Start Script
#
library(mvtnorm)
library(scatterplot3d)
library(MASS)

n <- 100
m1 <- c(5, -5)
m2 <- c(-3, 3)
s1 <- matrix(c(2, 1, 1, 3), 2, 2)
s2 <- matrix(c(4, 1, 1, 6), 2, 2)
alpha <- 0.3
c1 <- mvrnorm(round(n * alpha), m1, s1)
c2 <- mvrnorm(round(n * (1 - alpha)), m2, s2)
allval <- rbind(c1, c2)
x <- allval[sample(n, n), ]

mixm <- function(x, m1, m2, s1, s2, alpha) {
  f1 <- dmvnorm(x, m1, s1, log = FALSE)
  f2 <- dmvnorm(x, m2, s2, log = FALSE)
  alpha * f1 + (1 - alpha) * f2
}

plot(x)
den <- mixm(x, m1, m2, s1, s2, alpha)
scatterplot3d(x[, 1], x[, 2], den, highlight.3d = TRUE, col.axis = "blue",
              col.grid = "lightblue", angle = 120, pch = 20)

em2mn <- function(y, itermax = 200) {  # itermax was undefined in the original
  n <- length(y[, 1])
  p <- matrix(0, n, 1)
  f1 <- matrix(0, n, 1)
  f2 <- matrix(0, n, 1)
  tau <- matrix(0, n, 2)
  eps <- 0.0001
  mu01 <- c(0, 0)
  mu02 <- c(0, 0)
  sd01 <- matrix(0, 2, 2)
  sd02 <- matrix(0, 2, 2)
  cov1 <- matrix(0, 2, 2)
  cov2 <- matrix(0, 2, 2)
  # 1: initialise the values
  alpha0 <- runif(1, 0, 1)
  for (j in 1:2) {
    mu01[j] <- runif(1, min = quantile(y[, j], probs = 0.25),
                        max = quantile(y[, j], probs = 0.75))
    mu02[j] <- runif(1, min = quantile(y[, j], probs = 0.25),
                        max = quantile(y[, j], probs = 0.75))
  }
  sd01 <- var(y[1:round(n / 2), ])
  sd02 <- var(y[round(n / 2):n, ])
  # sd01 <- s1
  # sd02 <- s2
  # first iteration
  iter <- 0
  f1 <- dmvnorm(y, mu01, sd01, log = FALSE)
  f2 <- dmvnorm(y, mu02, sd02, log = FALSE)
  p <- alpha0 * f1 + (1 - alpha0) * f2
  scatterplot3d(y[, 1], y[, 2], p, highlight.3d = TRUE, col.axis = "blue",
                col.grid = "lightblue",
                main = "initial values, multivariate normal mixture",
                angle = 120, pch = 20)
  # initial log-likelihood
  l0 <- sum(log(p))
  L1 <- l0
  alpha <- alpha0
  mu1 <- mu01
  mu2 <- mu02
  sd1 <- sd01
  sd2 <- sd02
  for (iter in 1:itermax) {
    # E step
    for (i in 1:n) {
      tau[i, 1] <- (alpha * f1[i]) / p[i]
      tau[i, 2] <- ((1 - alpha) * f2[i]) / p[i]
    }
    # M step
    alpha <- mean(tau[, 1])
    mu1 <- colSums(tau[, 1] * y) / sum(tau[, 1])
    mu2 <- colSums(tau[, 2] * y) / sum(tau[, 2])
    # note: the original used y - mu1, but that recycles mu1 down the columns
    # of the n x 2 matrix instead of centring each row; sweep() centres by column
    ycen1 <- sweep(y, 2, mu1)
    ycen2 <- sweep(y, 2, mu2)
    cov1 <- matrix(0, 2, 2)
    cov2 <- matrix(0, 2, 2)
    for (i in 1:n) {
      cov1 <- cov1 + tau[i, 1] * (ycen1[i, ] %*% t(ycen1[i, ]))
      cov2 <- cov2 + tau[i, 2] * (ycen2[i, ] %*% t(ycen2[i, ]))
    }
    sd1 <- cov1 / sum(tau[, 1])
    sd2 <- cov2 / sum(tau[, 2])
    f1 <- dmvnorm(y, mu1, sd1, log = FALSE)
    f2 <- dmvnorm(y, mu2, sd2, log = FALSE)
    p <- alpha * f1 + (1 - alpha) * f2
    L <- sum(log(p))
    cat(iter, L, "\t", alpha, "\t", mu1, "\t", mu2, "\n")
    if (abs(L1 - L) < eps) break
    L1 <- L
  }
  denfin <- mixm(x, mu1, mu2, sd1, sd2, alpha)
  scatterplot3d(x[, 1], x[, 2], denfin, highlight.3d = TRUE, col.axis = "blue",
                col.grid = "lightblue", main = "final-values density",
                angle = 120, pch = 20)
  return(list(alpha0 = alpha0, alpha = alpha, mu1 = mu1, mu2 = mu2,
              sd1 = sd1, sd2 = sd2))
}

em2mn(x)
## end script ##

-- Dr. Daniele Riggi, PhD student University of Milano-Bicocca Department of Statistics Building U7, Via Bicocca degli Arcimboldi, 8 20126 Milano, Italy cell. +39 328 3380690 mailto: daniele.ri...@gmail.com
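One hedged suggestion for the part the poster suspects: the M-step covariance update can be done with stats::cov.wt(), which accepts the responsibilities directly as weights and avoids the explicit loop over observations. A sketch, assuming y and tau as defined in the script above:

```r
# Weighted ML covariance for component 1 (component 2 is analogous).
w1  <- tau[, 1] / sum(tau[, 1])            # normalise the responsibilities
cw1 <- cov.wt(y, wt = w1, method = "ML")   # maximum-likelihood denominator
sd1 <- cw1$cov                             # M-step covariance estimate
mu1 <- cw1$center                          # the weighted mean comes for free
```

Besides being shorter, this sidesteps the row-centring pitfall entirely, since cov.wt() handles the centring internally.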
[R] axis(): disable prevention of overlapping tick labels for pgfSweave()
Dear R Users, I'm using pgfSweave() from R-Forge. How can I disable the prevention of overlapping tick labels in axis()? If I use axis() with Sweave("Test1", driver = pgfSweaveDriver), R treats the LaTeX formatting as long strings and therefore tries not to draw them overlapping. Gerhard

\documentclass{article}
\usepackage{pgf}
\begin{document}
<<Test, echo = FALSE, fig = TRUE, pgf = TRUE>>=
plot(1:15, axes = FALSE, xlab = "")
axis(1, at = 1:2, tick = FALSE, line = 0, labels = 1:2)
axis(1, at = 1:7, tick = FALSE, line = 1, labels = 1:7)
axis(1, at = 1:2, tick = FALSE, line = 2,
     labels = paste("{\\textit{", 1:2, "}}", sep = ""))
axis(1, at = 1:7, tick = FALSE, line = 3,
     labels = paste("{\\textit{", 1:7, "}}", sep = ""))
@
\end{document}

-- Gerhard Schön Department of Medical Biometry and Epidemiology University Medical Center Hamburg-Eppendorf Martinistr. 52, building W34 Phone: +49(40) 7410 57458 Fax: +49(40) 7410 57790 Web: www.uke.de/imbe
Re: [R] Class for time of day?
On Fri, May 22, 2009 at 10:03 AM, Gabor Grothendieck ggrothendi...@gmail.com wrote: Regarding division, you could contribute that to the chron package. I've contributed a few missing items and they were incorporated. Good to know. Maybe I'll do that. Giving an error when it does not understand something would be dangerous, as it could break much existing code, so that would probably not be possible at this stage. But would it break any existing *correct* code? I find it hard to imagine any case where adding 1 hour of difftime to times("12:00:00") should return 1.5 days rather than 13:00:00. The idea of defaulting to internal representations is based on the idea that you get many features for free, since the way the internal representations work gives the right answer in many cases. I must admit I am rather shocked by this approach. Getting something for free is a bad bargain if what you get is nonsense. It's best to stick with the implicit philosophy that underlies a package. If you want a different philosophy, then it's really tantamount to creating a new package. I don't think that one is right and the other wrong; they simply represent different viewpoints. So you would defend the viewpoint that 1 hour is the same thing as 1 day? -s
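For readers following the thread: chron's "times" objects are stored as fractions of a day, which is exactly the internal-representation behaviour being debated. A small illustration (requires the chron package):

```r
library(chron)
t0 <- times("12:00:00")
t0 + 1/24   # one hour later, because 1 hour = 1/24 of a day internally
```

Adding a difftime of 1 hour, by contrast, would be coerced to the number 1 and so add a whole day, which is the surprise the poster objects to.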
Re: [R] anova leads to an error
Skotara wrote: Dear R-list, the following code has been running well over the last months:

exam <- matrix(rnorm(100, 0, 1), 10, 10)
gg <- factor(c(rep("A", 5), rep("B", 5)))
mlmfit <- lm(exam ~ 1); mlmfitG <- lm(exam ~ gg)
result <- anova(mlmfitG, mlmfit, X = ~0, M = ~1)

Until, all of a sudden, the following error occurred: Fehler in apply(abs(sapply(deltassd, function(X) diag(T %*% X %*% t(T)))), : dim(X) must have a positive length. I have not kept track of the changes in my R version, so it might have to do with that. Now it is R version 2.9.0 (2009-04-17). Does anybody know more about this error? It would help me a lot!

Hmm, it didn't work in 2.8.1 either. Anyway, the direct cause of the problem is that the construction sapply(deltassd, function(X) diag(T %*% X %*% t(T))) will return a vector, not a matrix, if T is 1 x p. This happens inside apply(abs(sapply(deltassd, function(X) diag(T %*% X %*% t(T)))), 1, max) and obviously apply() is confused. We need an as.matrix(), which we do have at other points in the function. For a quick workaround, use test = "Spherical". It is really a univariate problem, so all the tests are equivalent (in fact, anova(lm(rowMeans(exam) ~ gg)) also works).

Thank you very much! Nils

-- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - (p.dalga...@biostat.ku.dk) FAX: (+45) 35327907
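The workaround described in the reply, written out against the model objects defined at the top of the thread:

```r
# Quick workaround until the missing as.matrix() is fixed: request the
# Spherical test, which avoids the code path where a 1 x p transformation
# matrix collapses the sapply() result to a vector.
result <- anova(mlmfitG, mlmfit, X = ~0, M = ~1, test = "Spherical")
# Equivalent here, since the contrast is univariate:
anova(lm(rowMeans(exam) ~ gg))
```

Because the problem is univariate in this case, all of the multivariate test statistics coincide, so nothing is lost by switching tests.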
Re: [R] Returning only a file path on Windows
On 5/22/2009 10:45 AM, am...@xs4all.nl wrote: I am choosing a file like this: #Bring up file selection box fn <- file.choose() fp <- file.path(fn, fsep = '\\') file.path() constructs a path from component parts; it doesn't extract the path from a filename. You want dirname(fn). You may also want to look at normalizePath, to convert short names to long ones, or shortPathName, for the reverse. Duncan Murdoch Unfortunately, the file path contains the short file name and extension as well. I had hoped to get only the path so I could make my own long file names (for output graphs) by concatenation with this file path. Of course I can split the string and assemble the components from the returned list: fp <- strsplit(fp, '\\', fixed = TRUE) But there must be a better way? Thanks in advance, Alex van der Spek
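A small illustration of the suggestion above; the path below is made up for the example.

```r
fn <- "C:/Users/alex/data/results.csv"         # hypothetical file name
dirname(fn)                                    # directory part only
basename(fn)                                   # "results.csv"
# Build a long output name in the same directory:
file.path(dirname(fn), "my_output_graph.png")
```

This keeps the directory/file split in base R functions rather than hand-rolled string splitting, and it works regardless of which separator the platform returned.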
Re: [R] Behavior of seq with vector from
Try it this way:

# test list of data frames
L <- list(anscombe[1:4], anscombe[5:8], anscombe[1:4], anscombe[5:8])
# get columns 2 and 3 from each component; cbind those together
do.call(cbind, lapply(L, "[", 2:3))

On Fri, May 22, 2009 at 11:01 AM, Rowe, Brian Lee Yung (Portfolio Analytics) b_r...@ml.com wrote: So if I want to concatenate the output of multiple seq calls, there's no clear way to do this? For background, I have a number of data.frames with the same structure in a list. I want to 'collapse' the list into a single data.frame but only keeping certain columns from each underlying data.frame. In my example below, I want to keep columns 2 and 3 in each underlying data.frame. I'm using do.call('cbind', my.list) and then using the statement below to extract only the columns I need (other details omitted for brevity). If there's a built-in or pre-built function to do this, I'm all eyes. Brian PS if this is unclear, flame away, and I'll post some code -----Original Message----- From: Peter Dalgaard [mailto:p.dalga...@biostat.ku.dk] Sent: Friday, May 22, 2009 6:20 AM To: Rowe, Brian Lee Yung (Portfolio Analytics) Cc: r-help@r-project.org Subject: Re: [R] Behavior of seq with vector from Rowe, Brian Lee Yung (Portfolio Analytics) wrote: To get the value I want, I am using the following code: sort(as.vector(apply(array(c(2,3)), 1, seq, by = 3, length.out = 4))) [1] 2 3 5 6 8 9 11 12 So two questions: 1. Is seq designed/intended to be used with a vector 'from' argument, and is this the desired behavior? 2. If so, is there a cleaner way of implementing what I want? 1. Hmm, not really. NA. 2. I'd view it as an outer sum, strung out to a single vector, hence: c(outer(c(2,3), seq(0, , 3, 4), "+")) [1] 2 3 5 6 8 9 11 12 -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph.
K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - (p.dalga...@biostat.ku.dk) FAX: (+45) 35327907
Re: [R] Returning only a file path on Windows
Are you looking for choose.dir()? Or basename() and dirname()? Uwe Ligges am...@xs4all.nl wrote: I am choosing a file like this: #Bring up file selection box fn <- file.choose() fp <- file.path(fn, fsep = '\\') Unfortunately, the file path contains the short file name and extension as well. I had hoped to get only the path so I could make my own long file names (for output graphs) by concatenation with this file path. Of course I can split the string and assemble the components from the returned list: fp <- strsplit(fp, '\\', fixed = TRUE) But there must be a better way? Thanks in advance, Alex van der Spek
Re: [R] step by step debugger in R?
On 5/22/2009 10:59 AM, Michael wrote: Really, I think that if there were a Visual Studio-strength debugger, our collective time spent developing R code would be greatly reduced. If someone who knows how to write a debugger plugin for Eclipse wants to help, we could have that fairly easily. All the infrastructure is there; it's the UI part that's missing. Duncan Murdoch 2009/5/22 Uwe Ligges lig...@statistik.tu-dortmund.de: Michael wrote: Could anybody point me to the latest status of the most user-friendly debugger in R? I have always been happy with ?debug, ?recover and friends cited there, although you can also find additional software like the debug package ... Best, Uwe Ligges How I wish I didn't have to stare at my (long) code for ages, stuck... Thanks!
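For readers landing on this thread, the built-in tools Uwe mentions already give step-by-step execution; a minimal illustration of the usual workflow:

```r
f <- function(x) {
  y <- x^2
  stop("something went wrong")   # deliberate error for the demonstration
}
debug(f)                  # the next call to f() opens the single-step browser
# f(3)                    # step with 'n', print variables by name, 'Q' to quit
undebug(f)                # turn single-stepping off again
options(error = recover)  # on any error, browse the call frames post mortem
```

Within the browser you get the function's environment interactively, which covers much of what an IDE debugger provides, minus the graphical breakpoints.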
Re: [R] survfit, summary, and survmean (was Changelog for survival package)
Dear Terry, sorry that I did not see this change, and thank you for it. It is very useful. Heinz At 15:28 22.05.2009, Terry Therneau wrote: Further, I appreciate your new function survmean(). At the moment it seems to be intended as internal and is not documented in the help. The computations done by print.survfit are now a part of the results returned by summary.survfit. See 'table' in the output list of ?summary.survfit. Both call an internal survmean() function to ensure that any future updates stay in synchrony. This was a perennial (and justified) complaint about print.survfit. Per the standard, print(x) always returns x, so there was no way to get the results of the print as an S object. Terry
Re: [R] Behavior of seq with vector from
## Is this what you mean?
do.call(rbind, lapply(yourlist, "[", , seq(from = 1, by = 3, length = 4)))
## note the ',,' that omits the row argument in the call to "["

-- Bert Gunter Genentech Nonclinical Biostatistics

-----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Rowe, Brian Lee Yung (Portfolio Analytics) Sent: Friday, May 22, 2009 8:02 AM To: Peter Dalgaard Cc: r-help@r-project.org Subject: Re: [R] Behavior of seq with vector from So if I want to concatenate the output of multiple seq calls, there's no clear way to do this? For background, I have a number of data.frames with the same structure in a list. I want to 'collapse' the list into a single data.frame but only keeping certain columns from each underlying data.frame. In my example below, I want to keep columns 2 and 3 in each underlying data.frame. I'm using do.call('cbind', my.list) and then using the statement below to extract only the columns I need (other details omitted for brevity). If there's a built-in or pre-built function to do this, I'm all eyes. Brian PS if this is unclear, flame away, and I'll post some code -----Original Message----- From: Peter Dalgaard [mailto:p.dalga...@biostat.ku.dk] Sent: Friday, May 22, 2009 6:20 AM To: Rowe, Brian Lee Yung (Portfolio Analytics) Cc: r-help@r-project.org Subject: Re: [R] Behavior of seq with vector from Rowe, Brian Lee Yung (Portfolio Analytics) wrote: To get the value I want, I am using the following code: sort(as.vector(apply(array(c(2,3)), 1, seq, by = 3, length.out = 4))) [1] 2 3 5 6 8 9 11 12 So two questions: 1. Is seq designed/intended to be used with a vector 'from' argument, and is this the desired behavior? 2. If so, is there a cleaner way of implementing what I want? 1. Hmm, not really. NA. 2. I'd view it as an outer sum, strung out to a single vector, hence: c(outer(c(2,3), seq(0, , 3, 4), "+")) [1] 2 3 5 6 8 9 11 12 -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph.
K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - (p.dalga...@biostat.ku.dk) FAX: (+45) 35327907
Re: [R] Behavior of seq with vector from
Yet another way of doing it:

x <- c(2, 3)
x + rep(seq(0, by = 3, length = 4), each = length(x))
[1] 2 3 5 6 8 9 11 12

On Fri, May 22, 2009 at 11:01 AM, Rowe, Brian Lee Yung (Portfolio Analytics) b_r...@ml.com wrote: So if I want to concatenate the output of multiple seq calls, there's no clear way to do this? For background, I have a number of data.frames with the same structure in a list. I want to 'collapse' the list into a single data.frame but only keeping certain columns from each underlying data.frame. In my example below, I want to keep columns 2 and 3 in each underlying data.frame. I'm using do.call('cbind', my.list) and then using the statement below to extract only the columns I need (other details omitted for brevity). If there's a built-in or pre-built function to do this, I'm all eyes. Brian PS if this is unclear, flame away, and I'll post some code -----Original Message----- From: Peter Dalgaard [mailto:p.dalga...@biostat.ku.dk] Sent: Friday, May 22, 2009 6:20 AM To: Rowe, Brian Lee Yung (Portfolio Analytics) Cc: r-help@r-project.org Subject: Re: [R] Behavior of seq with vector from Rowe, Brian Lee Yung (Portfolio Analytics) wrote: To get the value I want, I am using the following code: sort(as.vector(apply(array(c(2,3)), 1, seq, by = 3, length.out = 4))) [1] 2 3 5 6 8 9 11 12 So two questions: 1. Is seq designed/intended to be used with a vector 'from' argument, and is this the desired behavior? 2. If so, is there a cleaner way of implementing what I want? 1. Hmm, not really. NA. 2. I'd view it as an outer sum, strung out to a single vector, hence: c(outer(c(2,3), seq(0, , 3, 4), "+")) [1] 2 3 5 6 8 9 11 12 -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph.
K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - (p.dalga...@biostat.ku.dk) FAX: (+45) 35327907
-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
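For the record, the three one-liners traded in this thread are interchangeable; a quick side-by-side check using the thread's own numbers confirms they produce the same vector:

```r
# Three equivalent ways to interleave seq(2, by = 3, length.out = 4)
# and seq(3, by = 3, length.out = 4), as discussed in the thread
x <- c(2, 3)
step <- seq(0, by = 3, length.out = 4)

r1 <- sort(as.vector(apply(array(x), 1, seq, by = 3, length.out = 4)))
r2 <- c(outer(x, step, "+"))           # Peter Dalgaard's outer sum
r3 <- x + rep(step, each = length(x))  # Jim Holtman's rep/recycling version

r1  # 2 3 5 6 8 9 11 12 (r2 and r3 are identical)
```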
[R] levelplot in combination with xyplot
With levelplot I would like to combine fitted values and raw values in one plot. The surface should be based on the fitted values, and on top of it I would like distinguishable points based on the raw data, with the same colour scheme as the surface.

library(lattice)

# raw data
x1 = rep(1:10, times = 10)
x2 = rep(1:10, each = 10)
y = 2 + (x1 + rnorm(100, mean = 0, sd = 4)) + (x2 + rnorm(100, mean = 0, sd = 2))
levelplot(y ~ x1 + x2)

# predicted data
fit = lm(y ~ x1 + x2)
x1.mesh = do.breaks(range(x1), 50)
x2.mesh = do.breaks(range(x2), 50)
grid = expand.grid(x1 = x1.mesh, x2 = x2.mesh)
grid[["fit"]] = predict(fit, newdata = grid)
levelplot(fit ~ x1 + x2, data = grid)

So, instead of plotting both levelplots (like here) I want the first on top of the second. I tried a bunch of things (including panel.xyplot within panel) but am now doubting whether it is possible. Thanks in advance, Eelke
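One route (an untested sketch, not from the thread) is to draw the fitted surface and then add the raw points from inside a custom panel function, colouring them with lattice's level.colors() so they share the surface's breaks. Variable names follow the code above:

```r
library(lattice)

# shared colour breaks for surface and raw points
at <- do.breaks(range(c(grid$fit, y)), 20)

levelplot(fit ~ x1 + x2, data = grid, at = at,
          panel = function(...) {
            panel.levelplot(...)                   # fitted surface underneath
            panel.points(x1, x2, pch = 21, col = "black",
                         fill = level.colors(y, at = at))  # raw data on top
          })
```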
Re: [R] Behavior of seq with vector from
Brilliant! Thanks, Brian

-----Original Message----- From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] Sent: Friday, May 22, 2009 11:20 AM To: Rowe, Brian Lee Yung (Portfolio Analytics) Cc: Peter Dalgaard; r-help@r-project.org Subject: Re: [R] Behavior of seq with vector from

Try it this way:

# test list of data frames
L <- list(anscombe[1:4], anscombe[5:8], anscombe[1:4], anscombe[5:8])
# get columns 2 and 3 from each component; cbind those together
do.call(cbind, lapply(L, "[", 2:3))
Re: [R] Axis Limits in Scatterplot3d
Alan wrote: Hi, How do you obtain the limits of the plotting region in a scatterplot3d plot? `par('usr')' does not seem to give sensible values, and that vector only has 4 elements (not the expected 6).

Well, I never designed anything to do that, but it is possible with the following function, which digs the limits out of the environment of the returned points3d function:

library(scatterplot3d)

gets3dusr <- function(s3dobject) {
  es3d <- environment(s3dobject$points3d)
  g3d <- function(name) get(name, envir = es3d)
  c(c(g3d("x.min"), g3d("x.max")) * g3d("x.scal"),
    c(0, g3d("y.max")) * g3d("y.scal") + g3d("y.add"),
    c(g3d("z.min"), g3d("z.max")) * g3d("z.scal"))
}

# Example:
s3d <- scatterplot3d(rnorm(5), rnorm(5), rnorm(5))
gets3dusr(s3d)

Uwe Ligges
[R] lattice strip argument check
Dear R-Users, Is there a way to check, within the following dummy function, whether the strip argument is different from the default set in the function declaration? FYI, this argument will be used downstream in an xyplot call in my actual function.

dummyfunction <- function(...,
    strip = function(...) strip.default(..., strip.names = c(TRUE, TRUE))) {}

Thanks for your help -- Sebastien Bihorel, PharmD, PhD, PKPD Scientist, Cognigen Corp. Email: sebastien.biho...@cognigencorp.com Phone: (716) 633-3463 ext. 323
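One possibility (a sketch, not a canonical answer): missing() reports whether the caller actually supplied an argument, even when the argument has a default, so the default can be moved into the body and the flag kept for later use:

```r
# Hypothetical variant of the dummy function: detect whether the caller
# supplied 'strip', then fill in the default ourselves if not.
dummyfunction <- function(..., strip) {
  user.supplied <- !missing(strip)
  if (!user.supplied)
    strip <- function(...) strip.default(..., strip.names = c(TRUE, TRUE))
  # pass 'strip' on to xyplot() here, branching on 'user.supplied' as needed
  user.supplied
}
```

Note strip.default is never evaluated unless the strip function is actually called, so this definition does not itself require lattice to be attached.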
Re: [R] Class for time of day?
On Fri, May 22, 2009 at 11:01 AM, Stavros Macrakis macra...@alum.mit.edu wrote: On Fri, May 22, 2009 at 10:03 AM, Gabor Grothendieck ggrothendi...@gmail.com wrote: Regarding division, you could contribute that to the chron package. I've contributed a few missing items and they were incorporated.

Good to know. Maybe I'll do that.

Giving an error when it does not understand something would be dangerous, as it could break much existing code, so that would probably not be possible at this stage.

But would it break any existing *correct* code? I find it hard to imagine any cases where adding 1 hour of difftime to times("12:00:00") should return 1.5 days rather than 13:00:00.

The idea of defaulting to internal representations is based on the idea that you get many features for free, since the way the internal representations work gives the right answer in many cases.

I must admit I am rather shocked by this approach. Getting something for free is a bad bargain if what you get is nonsense.

It's best to stick with the implicit philosophy that underlies a package. If you want a different philosophy then it's really tantamount to creating a new package. I don't think that one is right and the other wrong; they simply represent different viewpoints.

So you would defend the viewpoint that 1 hour is the same thing as 1 day?

The way this might appear in code is if someone wanted to calculate the number of one-hour intervals in 18 hours. One could write:

t18 <- times("18:00:00")
t1 <- times("1:00:00")
as.numeric(t18) / as.numeric(t1)

but since we all know that it uses internal representations unless it indicates otherwise, a typical code snippet might shorten it to:

as.numeric(t18 / t1)

and all such code would break if one were to cause that to generate an error. (I think it would be ok if it generated a numeric automatically, and that was the enhancement I had suggested to you.)
You can try to argue that the user should not code that way, but a huge amount of chron code exists by now (note that chron may pre-date R). If it were a new package one could consider raising new errors, but at this point in time it would be unwise.
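The disagreement is easy to reproduce (a sketch assuming the chron package is installed; the printed form of the last line depends on chron's print method):

```r
library(chron)
t18 <- times("18:00:00")
t1  <- times("1:00:00")

# both sides agree on the explicit form: a dimensionless ratio
as.numeric(t18) / as.numeric(t1)   # 18

# the disputed shorthand: '/' keeps the times class, so the 18 is
# displayed as a time span rather than as a plain number
t18 / t1
```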
Re: [R] good numerical optimization to use in R?
Have you tried the maxLik package? If that is not adequate, you can check the CRAN Task View: Optimization and Mathematical Programming (http://cran.fhcrc.org/web/views/Optimization.html), or try the RSiteSearch.function in the RSiteSearch package. If none of these is adequate, please provide another post with more detail about what you've tried and why it does not seem adequate, preferably providing commented, minimal, self-contained, reproducible code, as suggested in the posting guide http://www.R-project.org/posting-guide.html. Hope this helps. Spencer

Michael wrote: Hi all, Could anybody point me to a good/robust numerical optimization program to use in R? I am doing some MLE fitting. Thanks!
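For straightforward MLE work, the base-R workhorse is optim(); a minimal, hypothetical example (my illustration, not from the thread) fitting a normal distribution by direct likelihood maximisation:

```r
# Fit mean and sd of a normal sample by minimising the negative
# log-likelihood; sd is optimised on the log scale to keep it positive.
set.seed(1)
x <- rnorm(200, mean = 5, sd = 2)

negll <- function(p) -sum(dnorm(x, mean = p[1], sd = exp(p[2]), log = TRUE))

fit <- optim(c(0, 0), negll, method = "BFGS", hessian = TRUE)
c(mean = fit$par[1], sd = exp(fit$par[2]))  # close to the true (5, 2)
sqrt(diag(solve(fit$hessian)))              # asymptotic standard errors
```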
[R] Error in FUN with tapply and by
A subset of my raw data looks like this:

Grip,Technique,Baseline.integrated,Task,Stroke..direction.,Engag,Disen,PenDG,PenUG,PenDS,PenUS,Duration
Tripod,Barrel,Integrated,7,S70,230,510,270,510,781,1011,1011
Tripod,Barrel,Integrated,7,S71,na,na,na,na,na,na,na
Round,NonPrefHand,Baseline,0,S00,na,na,110,250,380,520,520
Round,NonPrefHand,Baseline,0,S01,na,na,220,360,460,620,620

I computed some values (times) from the raw data:

t_p1 = PenDG
t_c1 = PenUG - PenDG
t_p2 = PenDS - PenUG
t_c2 = PenUS - PenDS

And I put those times in a data frame called times. For each of these times, I want to subtract the average for Baseline trials from the average for Integrated trials within the Grip and Technique factors. Call these differences the true cost of mode selection.

truecost <- function(time) {
  as.numeric(tapply(time, Baseline.integrated, mean, na.rm = TRUE)[2] -
             tapply(time, Baseline.integrated, mean, na.rm = TRUE)[1])
}

To help explain what the truecost function does:

tapply(t_p1, Baseline.integrated, mean, na.rm = TRUE)
  Baseline Integrated
  212.8000   252.8402
truecost(t_p1)
[1] 40.04021

Then I try to create a table of average truecost as a function of the levels of a factor. I think this is the same error with tapply and by.

tapply(t_p1, list(Grip, Technique), truecost, na.rm = TRUE)
Error in FUN(X[[1L]], ...) : unused argument(s) (na.rm = TRUE)
by(times, list(Grip, Technique), truecost, na.rm = TRUE)
Error in FUN(data[x, , drop = FALSE], ...) : unused argument(s) (na.rm = TRUE)

Any ideas? Thomas Levine!
[R] sciplot question
Hi, I would like lineplot.CI and bargraph.CI to actually plot confidence intervals, instead of standard errors. I understand I have to use the ci.fun option, but I'm not quite sure how. Like this:

qt(0.975, df = n - 1) * s / sqrt(n)

but how can I apply it to visualize the length of the Student's t confidence intervals rather than the standard error of the plotted means?

-- Best regards, Jarle Bjørgeengen. Mob: +47 9155 7978. http://www.uio.no/sok?person=jb
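sciplot's ci.fun argument expects a function that maps the raw responses in each group to the lower and upper limits of the error bars. An untested sketch that plugs in the t-based 95% interval from the formula above, using R's built-in ToothGrowth data purely for illustration:

```r
library(sciplot)

# lower and upper limits of a t-based 95% confidence interval
t.ci <- function(x) {
  n    <- length(x)
  half <- qt(0.975, df = n - 1) * sd(x) / sqrt(n)
  c(mean(x) - half, mean(x) + half)
}

lineplot.CI(dose, len, data = ToothGrowth, ci.fun = t.ci)
bargraph.CI(dose, len, data = ToothGrowth, ci.fun = t.ci)
```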
[R] regrouping factor levels
Hi all, I had some trouble regrouping factor levels for a variable. After some experiments, I figured out how to recode to modify the factor levels. I would now like some help understanding why some methods work and others don't. Here's my code:

rm(list = ls())
### some trials in recoding factor levels
char <- letters[1:10]
fac <- factor(char)
levels(fac)
print(fac)

## first method of recoding factors
fac1 <- fac
levels(fac1)[c("a", "b", "c")] <- "A"
levels(fac1)[c("d", "e", "f")] <- "B"
levels(fac1)[c("g", "h", "i", "j")] <- "C"
levels(fac1)
print(fac1)

## second method
fac2 <- fac
levels(fac2)[c(1, 2, 3)] <- "A"
levels(fac2)[c(2, 3, 4)] <- "B"     # not c(4,5,6)
levels(fac2)[c(3, 4, 5, 6)] <- "C"  # not c(7,8,9,10)
levels(fac2)
print(fac2)

## third method
fac3 <- fac
levels(fac3) <- list(A = c("a", "b", "c"), B = c("d", "e", "f"), C = c("g", "h", "i", "j"))
levels(fac3)
print(fac3)

I first tried method 1 and had no luck with it at all. The levels A, B, and C just got added to the existing levels without affecting the fac variable. After some time, I was able to figure out how I should use method 2. After reading the help documentation, I arrived at method 3. I would appreciate help in understanding why the first method does not work. In my application, I had long factor names and Tinn-R just would not accept statements running to several lines. Partial substitution was desirable then. Having spent a considerable amount of time on this, I would like to understand the underlying problem with method 1 as it is. The deeper understanding could be useful for me later. Thanking you, Ravi
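For what it is worth, a small variation makes method 1's behaviour visible (my sketch, not from the post): levels(fac1) is an unnamed character vector, so a character subscript like c("a","b","c") matches nothing and assignment appends new elements instead of replacing. Matching on the level values instead does work:

```r
# Name-based regrouping that behaves like method 2, without positional
# indices: replace the levels whose values match, merging duplicates.
fac1 <- factor(letters[1:10])
levels(fac1)[levels(fac1) %in% c("a", "b", "c")] <- "A"
levels(fac1)[levels(fac1) %in% c("d", "e", "f")] <- "B"
levels(fac1)[levels(fac1) %in% c("g", "h", "i", "j")] <- "C"
levels(fac1)  # "A" "B" "C"
```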
Re: [R] Goodness of fit in quantile regression
Laura: Part of the issue may depend on what you mean by goodness-of-fit. If you are looking for some global measure like a pseudo R2 or AIC to select among models, you ought to be able to make those calculations from the objective function that was minimized, as you recognized. If qr.fit.sfn() does not return the objective function the way rq.fit.br(), the simplex routine, does, you still ought to be able to do the calculations by performing the asymmetric weighting of the residuals from the model (see Roger Koenker's 2005 book). Now, if by goodness-of-fit you mean how the model fits in local regions of the predictor space, then you might want to check out Stef van Buuren's work on worm plots to diagnose fit in quantile regression. I don't remember where he published this, but his email is stef.vanbuu...@tno.nl. Brian

Brian S. Cade, PhD, U. S. Geological Survey, Fort Collins Science Center, 2150 Centre Ave., Bldg. C, Fort Collins, CO 80526-8818. Email: brian_c...@usgs.gov, tel: 970 226-9326

From: laura m. mayorala...@gmail.com To: r-help@r-project.org Date: 05/22/2009 03:29 AM Subject: [R] Goodness of fit in quantile regression

Dear R users, I've used the function qr.fit.sfn to estimate a quantile regression on a panel data set. Now I would like to compute a statistic to measure the goodness of fit of this model. Does someone know how I could do that? I could compute a pseudo R2, but in order to do that I would need the value of the objective function at the optimum, and I don't see how to get this from the function qr.fit.sfn. If someone has any good idea about how to solve this problem I would be most grateful! Best, Laura
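For the global pseudo-R2 route, the asymmetric weighting of the residuals can be coded directly; a sketch assuming you can extract the fitted values and the quantile tau from your qr.fit.sfn fit:

```r
# Koenker & Machado-style goodness of fit for quantile regression:
# R1(tau) = 1 - V1/V0, where V is the summed asymmetric (check) loss.
rho <- function(u, tau) sum(u * (tau - (u < 0)))

pseudoR1 <- function(y, fitted, tau) {
  V1 <- rho(y - fitted, tau)             # loss of the full model
  V0 <- rho(y - quantile(y, tau), tau)   # loss of the tau-quantile-only model
  1 - V1 / V0
}
```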
Re: [R] Error in FUN with tapply and by
The error message is self-explanatory: there is an unused argument, 'na.rm=TRUE'. You are calling your function 'truecost', which has only a single parameter, 'time', and you are attempting to pass in 'na.rm=TRUE', which it will not accept. You don't need it (truecost already uses na.rm=TRUE in its own tapply calls).

On Fri, May 22, 2009 at 12:36 PM, Thomas Levine thomas.lev...@gmail.com wrote: [...] tapply(t_p1, list(Grip, Technique), truecost, na.rm=T) Error in FUN(X[[1L]], ...) : unused argument(s) (na.rm = TRUE) [...]

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
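Concretely, the failing calls from the original post would become the following (untested, since they depend on the poster's objects; note also that by() hands its FUN a data frame, so the column must be extracted first):

```r
# tapply: just drop the na.rm argument -- truecost handles NAs itself
tapply(t_p1, list(Grip, Technique), truecost)

# by: FUN receives a data frame subset, so wrap truecost accordingly
by(times, list(Grip, Technique), function(d) truecost(d$t_p1))
```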
Re: [R] Need a faster function to replace missing data
Here are 2 functions, which.just.above and which.just.below, which may help you. They tell you which element in a reference dataset is the first just above (or just below) each element in the main dataset (x). They return NA if there is no reference element above (or below) an element of x. The strict argument lets you say whether the inequalities are strict or whether equality is acceptable. They are vectorized, so are pretty quick. E.g.,

which.just.below(c(14, 14.5), 11:15, strict=TRUE)
[1] 3 4
which.just.above(c(14, 14.5), 11:15, strict=FALSE)
[1] 4 5

They should work with any class of data that order() and sort() work on. In particular, POSIXct times work. The attached file has a demonstration function called 'test' with some examples. In your case, the 'reference' data would be the times at which your backup measurements were taken and the 'x' data would be the times of the pings. You can look at the elements of 'reference' just before and just after each ping (or just the pings that are missing locations) and decide how to combine the data from the bracketing reference elements to impute a location for the ping. Here are the functions, in case the attachment doesn't make it through. I'm sure some mailer will throw in some newlines so it will be corrupted.

which.just.above <- function(x, reference, strict = TRUE) {
    # output[k] will be index of smallest value in reference vector
    # larger than x[k].  If strict=FALSE, replace 'larger than' by
    # 'larger than or equal to'.
    # We should allow NA's in x (but we don't).  NA's in reference
    # should not be allowed.
    if (any(is.na(x)) || any(is.na(reference)))
        stop("NA's in input")
    if (strict)
        i <- c(rep(TRUE, length(reference)), rep(FALSE, length(x)))[order(c(reference, x))]
    else
        i <- c(rep(FALSE, length(x)), rep(TRUE, length(reference)))[order(c(x, reference))]
    i <- cumsum(i)[!i] + 1
    i[i > length(reference)] <- NA
    # i is length of x and has values in range 1:length(reference) or NA
    # following needed if reference is not sorted
    i <- order(reference)[i]
    # following needed if x is not sorted
    i[order(order(x))]
}

which.just.below <- function(x, reference, strict = TRUE) {
    # output[k] will be index of largest value in reference vector
    # less than x[k].  If strict=FALSE, replace 'less than' by
    # 'less than or equal to'.  Neither x nor reference need be
    # sorted, although they should not have NA's (in theory, NA's
    # in x are ok, but not in reference).
    if (any(is.na(x)) || any(is.na(reference)))
        stop("NA's in input")
    if (!strict)
        i <- c(rep(TRUE, length(reference)), rep(FALSE, length(x)))[order(c(reference, x))]
    else
        i <- c(rep(FALSE, length(x)), rep(TRUE, length(reference)))[order(c(x, reference))]
    i <- cumsum(i)[!i]
    i[i <= 0] <- NA
    # i is length of x and has values in range 1:length(reference) or NA
    # following needed if reference is not sorted
    i <- order(reference)[i]
    # following needed if x is not sorted
    i[order(order(x))]
}

Bill Dunlap, TIBCO Software Inc - Spotfire Division, wdunlap tibco.com

-----Original Message----- From: Tim Clark Sent: Thursday, May 21, 2009 9:45 PM To: r-help@r-project.org Subject: [R] Need a faster function to replace missing data

Dear List, I need some help in coming up with a function that will take two data sets, determine if a value is missing in one, find a value in the second that was taken at about the same time, and substitute the second value in for where the first should have been. My problem is from a fish tracking study. We put acoustic tags in fish and track them for several days. Location data is supposed to be automatically recorded every time we detect a ping from the fish. Unfortunately the GPS had some problems, and sometimes the fish's depth was recorded but not its location. I fortunately had a back-up GPS that was taking location data every five minutes. I would like to merge the two files, replacing the missing value in the vscan (automatic) file with the location from the garmin file. Since we were getting vscan records every 1-2 seconds and garmin records every 5 minutes, I need to find the right place in the vscan file to place the garmin record - i.e. the closest in time, but not greater than 5 minutes. I have written a function that does this. However, it works with my test data but locks up my computer with my real data. I have several million vscan records and several thousand garmin records. Is there a better way to do this? My function and test data:

myvscan <- data.frame(c(1, NA, 1.5), times(c("12:00:00", "12:14:00", "12:20:00")))
Re: [R] Need a faster function to replace missing data
I think this does what you want. It uses 'findInterval' to determine where a possible match is:

myvscan <- data.frame(c(1, NA, 1.5),
                      as.POSIXct(c("12:00:00", "12:14:00", "12:20:00"), format = "%H:%M:%S"))
names(myvscan) <- c("Latitude", "DateTime")
myvscan$tn <- as.numeric(myvscan$DateTime)   # numeric for findInterval
mygarmin <- data.frame(c(20, 30, 40),
                       as.POSIXct(c("12:00:00", "12:10:00", "12:15:00"), format = "%H:%M:%S"))
names(mygarmin) <- c("Latitude", "DateTime")
mygarmin$tn <- as.numeric(mygarmin$DateTime)

# use 'findInterval'
na.indx <- which(is.na(myvscan$Latitude))    # find NAs
# replace with garmin latitude
myvscan$Latitude[na.indx] <- mygarmin$Latitude[findInterval(myvscan$tn[na.indx], mygarmin$tn)]
myvscan
  Latitude            DateTime         tn
1      1.0 2009-05-22 12:00:00 1243008000
2     30.0 2009-05-22 12:14:00 1243008840
3      1.5 2009-05-22 12:20:00 1243009200

On Fri, May 22, 2009 at 12:45 AM, Tim Clark mudiver1...@yahoo.com wrote: [...] I have written a function that does this. However, it works with my test data but locks up my computer with my real data. I have several million vscan records and several thousand garmin records. Is there a better way to do this? My function and test data:

myvscan <- data.frame(c(1, NA, 1.5), times(c("12:00:00", "12:14:00", "12:20:00")))
names(myvscan) <- c("Latitude", "DateTime")
mygarmin <- data.frame(c(20, 30, 40), times(c("12:00:00", "12:10:00", "12:15:00")))
names(mygarmin) <- c("Latitude", "DateTime")
minute.diff <- 1/24/12   # time diff is in days, so this is 5 minutes

for (k in 1:nrow(myvscan)) {
  if (is.na(myvscan$Latitude[k])) {
    if (min(abs(mygarmin$DateTime - myvscan$DateTime[k])) < minute.diff) {
      index.min.date <- which.min(abs(mygarmin$DateTime - myvscan$DateTime[k]))
      myvscan$Latitude[k] <- mygarmin$Latitude[index.min.date]
    }
  }
}

I appreciate your help and advice. Aloha, Tim. Tim Clark, Department of Zoology, University of Hawaii

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
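One caveat about the findInterval approach: it takes the last garmin fix at or before each ping, with no bound on how old that fix may be. The original requirement of "closest in time, but not greater than 5 minutes" could be restored along these lines (untested sketch, reusing the myvscan/mygarmin objects with the numeric tn columns from above):

```r
idx <- findInterval(myvscan$tn[na.indx], mygarmin$tn)
age <- myvscan$tn[na.indx] - mygarmin$tn[pmax(idx, 1)]  # seconds since last fix
ok  <- idx > 0 & age <= 5 * 60                          # within five minutes only
myvscan$Latitude[na.indx[ok]] <- mygarmin$Latitude[idx[ok]]
```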
Re: [R] step by step debugger in R?
Duncan Murdoch wrote: On 5/22/2009 10:59 AM, Michael wrote: Really, I think if there were a Visual Studio-strength debugger, our collective time spent developing R code would be greatly reduced.

If someone who knows how to write a debugger plugin for Eclipse wants to help, we could have that fairly easily. All the infrastructure is there; it's the UI part that's missing. Duncan Murdoch

[I've copied Mark Bravington and Robert Gentleman to the list as they are likely to have views here, and I am not sure they monitor R-help]

Hello, Making a front-end to debugging was one of the proposed Google Summer of Code projects for this year [1]. It was not retained eventually, but I am still motivated. Pretty much all the infrastructure is there, and some work has been done __very recently__ in R's debugging internals (the ability to step up). As I see it, the ability to call some sort of hook each time the debugger waits for input would make it much easier for someone to write front-ends. A recent post of mine (patch included) [2] on R-devel suggested a custom prompt for browser which would do the trick, but I now think that a hook would be more appropriate. Without something similar to that, there is no way that I know of to make a front-end, unless maybe you embed R ... (please let me know how I am wrong). There is also the debug package [3,4], which does __not__ work with R internals but rather works with instrumenting tricks at the R level. debug provides a tcl/tk front-end. It is my understanding that it does not use R internals (do_browser, ...) because it was not possible at the time, and I believe this is still not possible today, but I might be wrong. I'd prefer to be wrong, actually.

Romain

[1] : http://www.r-project.org/soc09/ideas.html#p5
[2] : https://stat.ethz.ch/pipermail/r-devel/2009-April/053128.html
[3] : http://cran.r-project.org/web/packages/debug/index.html
[4] : http://cran.r-project.org/doc/Rnews/Rnews_2003-3.pdf

-- Romain Francois, Independent R Consultant, +33(0) 6 28 91 30 30, http://romainfrancois.blog.free.fr
Re: [R] bug in rpart?
Yuanyuan wrote: Greetings, I checked the Indian diabetes data again and get one tree for the data with reordered columns and another tree for the original data. I compared these two trees, the split points for these two trees are exactly the same but the fitted classes are not the same for some cases. And the misclassification errors are different too. I know how CART deal with ties --- even we are using the same data, the subjects to the left and right would not be the same if we just rearrange the order of covariates. But the problem is, the fitted trees are exactly the same on the split points. Shouldn't we get the same fitted values if the decisions are the same at each step? Why the same structured trees have different observations on the nodes? Because they may use different surrogate variables. Note that your data contain missing values that are handled by surrogates. Best, Uwe Ligges The source code for running the diabetes data example and the output of trees are attached. Your professional opinion is very much appreciated. library(mlbench) data(PimaIndiansDiabetes2) mydata-PimaIndiansDiabetes2 library(rpart) fit2-rpart(diabetes~., data=mydata,method=class) plot(fit2,uniform=T,main=CART for original data) text(fit2,use.n=T,cex=0.6) printcp(fit2) table(predict(fit2,type=class),mydata$diabetes) ## misclassifcation table: rows are fitted class neg pos neg 437 68 pos 63 200 pmydata-data.frame(mydata[,c(1,6,3,4,5,2,7,8,9)]) fit3-rpart(diabetes~., data=pmydata,method=class) plot(fit3,uniform=T,main=CART after exchaging mass glucose) text(fit3,use.n=T,cex=0.6) printcp(fit3) table(predict(fit3,type=class),pmydata$diabetes) ##after exchage the order of BODY mass and PLASMA glucose neg pos neg 436 64 pos 64 204 Best, __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
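One way to see Uwe's explanation concretely (a sketch, not part of the original exchange; it reuses the thread's own mlbench/rpart example and assumes those packages are installed): compare the surrogate splits that summary() reports for the two fits.

```r
## Sketch: the split points of the two trees agree, but summary() also
## prints, per node, the surrogate splits used to route observations
## with missing values -- comparing these shows why fitted classes differ.
library(mlbench)
library(rpart)
data(PimaIndiansDiabetes2)

fit.orig <- rpart(diabetes ~ ., data = PimaIndiansDiabetes2,
                  method = "class")
fit.perm <- rpart(diabetes ~ .,
                  data = PimaIndiansDiabetes2[, c(1, 6, 3, 4, 5, 2, 7, 8, 9)],
                  method = "class")

## Look at the "Surrogate splits:" sections; ties among candidate
## surrogates are broken by column order, so reordering the columns can
## change how missing-value cases are routed:
summary(fit.orig)
summary(fit.perm)
```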
Re: [R] Class for time of day?
On Fri, May 22, 2009 at 12:28 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote:

...The way this might appear in code is if someone wanted to calculate the number of one-hour intervals in 18 hours. One could write:

t18 <- times("18:00:00")
t1 <- times("1:00:00")
as.numeric(t18) / as.numeric(t1)

but since we all know that it uses internal representations unless it indicates otherwise

Um, yes, I suppose that was the attitude in the '60s and '70s, but I think we have moved on from there. cf. http://en.wikipedia.org/wiki/Data_abstraction

a typical code snippet might shorten it to:

as.numeric(t18 / t1)

and all such code would break if one were to cause that to generate an error.

(18/24 day)/(1/24 day) is the perfectly meaningful dimensionless number 18, so this code should not break with a correct implementation of '/'. (cf. http://en.wikipedia.org/wiki/Dimensional_analysis). Alas, chron gives the nonsense result of 18 days.

-s
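For readers following along, the two idioms under discussion can be run directly (a sketch assuming the chron package; the numeric comments reflect chron's fraction-of-a-day representation):

```r
library(chron)

t18 <- times("18:00:00")   # stored internally as 0.75 (fraction of a day)
t1  <- times("01:00:00")   # stored internally as 1/24

## Dividing the underlying numerics gives the dimensionless count of
## one-hour intervals in 18 hours:
as.numeric(t18) / as.numeric(t1)   # 18

## Dividing the "times" objects first also yields the value 18, but
## chron keeps the "times" class on the result -- the behaviour the
## thread objects to:
t18 / t1
```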
Re: [R] How can I estimate a Box-Cox function with R?
Thanks Gregory. I see that with boxcox() the optimal lambda is obtained and plotted against the log-likelihood:

library(MASS)
boxcox(Volume ~ log(Height) + log(Girth), data = trees,
       lambda = seq(-0.25, 0.25, length = 10))

But how can I see the fit of the same linear model with the optimal Box-Cox transformation, the standard error for lambda, etc.? Best. Ikerne.

Have you looked at the boxcox function in the MASS package? That may do what you want.

-- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111

-----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Ikerne del Valle Sent: Thursday, May 21, 2009 4:29 AM To: fernando.tus...@ehu.es Cc: r-help@r-project.org Subject: [R] How can I estimate a Box-Cox function with R?

Dear Fernando and all: Thanks for your help. Now it works. This is a training example to learn how to estimate a Box-Cox model (right- and/or left-side transformations) with R (as LIMDEP does), in order to compare these estimates with the ones derived by applying NLS once the dependent variable has been divided by its geometric mean (see below), as suggested by Zarembka (1974) and Spitzer (1984). However, the example of the demand for money seems not to work. Any idea how to deal with the error messages, or how to estimate a Box-Cox function with R? Best regards, Ikerne

library(nlrwr)
r <- c(4.50, 4.19, 5.16, 5.87, 5.95, 4.88, 4.50, 6.44, 7.83, 6.25, 5.50,
       5.46, 7.46, 10.28, 11.77, 13.42, 11.02, 8.50, 8.80, 7.69)
Lr <- log(r)
M <- c(480.00, 524.30, 566.30, 589.50, 628.20, 712.80, 805.20, 861.00,
       908.40, 1023.10, 1163.60, 1286.60, 1388.90, 1497.90, 1631.40,
       1794.40, 1954.90, 2188.80, 2371.70, 2563.60)
LM <- log(M)
Y <- c(2208.30, 2271.40, 2365.60, 2423.30, 2416.20, 2484.80, 2608.50,
       2744.10, 2729.30, 2695.00, 2826.70, 2958.60, 3115.20, 3192.40,
       3187.10, 3248.80, 3166.00, 3277.70, 3492.00, 3573.50)
LY <- log(Y)
gmM <- exp((1/20) * sum(LM))
GM <- M/gmM
Gr <- r/gmM
GY <- Y/gmM
money <- data.frame(r, M, Y, Lr, LM, LY, GM, Gr, GY)
attach(money)
ols1 <- lm(GM ~ r + Y)
output1 <- summary(ols1)
coef1 <- ols1$coefficients
a1 <- coef1[[1]]
b11 <- coef1[[2]]
b21 <- coef1[[3]]
ols2 <- lm(GM ~ Gr + GY)
output2 <- summary(ols2)
coef2 <- ols2$coefficients
a2 <- coef2[[1]]
b12 <- coef2[[2]]
b22 <- coef2[[3]]
money.m1 <- nls(GM ~ a + b*r^g + c*Y^g, data = money,
                start = list(a = a1, b = b11, g = 1, c = b21))
money.m2 <- nls(GM ~ a + b*Gr^g + c*GY^g, data = money,
                start = list(a = a2, b = b12, g = 1, c = b22))

Ikerne del Valle Erkiaga Department of Applied Economics V Faculty of Economic and Business Sciences University of the Basque Country Avda. Lehendakari Agirre, Nº 83 48015 Bilbao (Bizkaia) Spain
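For completeness, a sketch of one common answer to the first question above (not from the original thread): with plotit = FALSE, boxcox() returns the profile log-likelihood grid, from which the optimal lambda and an approximate confidence interval can be read off before refitting.

```r
library(MASS)

## Profile log-likelihood over a grid of lambda values:
bc <- boxcox(Volume ~ log(Height) + log(Girth), data = trees,
             lambda = seq(-0.25, 0.25, length = 100), plotit = FALSE)
lambda.hat <- bc$x[which.max(bc$y)]

## Approximate 95% CI for lambda: values whose log-likelihood lies
## within qchisq(0.95, 1)/2 of the maximum.
ci <- range(bc$x[bc$y > max(bc$y) - qchisq(0.95, 1) / 2])

## Refit the linear model with the Box-Cox-transformed response:
fit <- lm((Volume^lambda.hat - 1) / lambda.hat ~ log(Height) + log(Girth),
          data = trees)
summary(fit)
```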
Re: [R] Class for time of day?
On Fri, May 22, 2009 at 1:55 PM, Stavros Macrakis macra...@alum.mit.edu wrote: [...]

Your point was that otherwise undefined operations should produce an error, and I was illustrating why that could not be introduced at this stage. I had already suggested that you implement division if you found it important, and that was not the source of any disagreement.
Re: [R] Naming a random effect in lmer
The first example on the lmer help page uses the formula Reaction ~ Days + (Days | Subject). Here, Subject is the name of a column in the data frame sleepstudy, with levels 308, 309, ... . Does this answer your question? If not, please provide commented, minimal, self-contained, reproducible code, as requested in the posting guide http://www.R-project.org/posting-guide.html. Your example is not self-contained and reproducible. Hope this helps. Spencer

Leigh Ann Starcevich wrote: Dear guRus: I am using lmer for a mixed model that includes a random intercept for a set of effects that have the same distribution, Normal(0, sig2b). This set of effects is of variable size, so I am using an as.formula statement to create the formula for lmer. For example, if the set of random effects has dimension 8, then the lmer call is:

Zs <- paste("Z", 1:mb, sep = "")
Trendformula <- as.formula(paste("LogY ~ WYear + (1+WYear|Site) + (1|",
                                 paste(Zs, collapse = "+"), ")"))
fit2.4a <- lmer(Trendformula, data = testsamp)

which, for mb = 8, expands to:

fit1 <- lmer(LogY ~ WYear + (1 | Site) + (1 | Year) + (1 | Z1 + Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8), data = testsamp)

I have no problems with this. However, if the set of random effects has a dimension of 30, then the lmer call is:

fit2 <- lmer(LogY ~ WYear + (1 | Site) + (1 | Year) + (1 | Z1 + Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9 + Z10 + Z11 + Z12 + Z13 + Z14 + Z15 + Z16 + Z17 + Z18 + Z19 + Z20 + Z21 + Z22 + Z23 + Z24 + Z25 + Z26 + Z27 + Z28 + Z29 + Z30), data = testsamp)

In this case, I get an error because the name "Z1 + Z2 + ... + Z30" is too long to print in the output. Is there any way to name the random effect in lmer so that a shorter (and more descriptive) name may be used and the error avoided? Or is there a way to combine these into a single variable prior to the lmer function call?
In SAS, I am able to parameterize these as a Toeplitz structure with bandwidth 1. Thanks for any help.

Leigh Ann Starcevich Doctoral student Oregon State University Corvallis, Oregon
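One possible answer to the last question (a sketch, not from the original thread, using the thread's hypothetical Z1, ..., Z30 column names): lmer evaluates the grouping expression inside `data`, so the sum can be precomputed into a column with a short name; this should be equivalent to the long (1 | Z1 + ... + Z30) term while avoiding the over-long deparsed name.

```r
library(lme4)

## Precompute the grouping expression Z1 + Z2 + ... + Z30 once
## (assumes numeric columns Z1..Z30 in the data frame testsamp):
Zs <- paste("Z", 1:30, sep = "")
testsamp$Zgroup <- rowSums(testsamp[, Zs])

## Short, descriptive name in the formula instead of the 30-term sum:
fit <- lmer(LogY ~ WYear + (1 | Site) + (1 | Year) + (1 | Zgroup),
            data = testsamp)
```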
[R] Object not found
Hello, I have run into a problem:

ftable(table(Fire, Standard, StoAll), col.vars = c("Fire", "Standard"))
Error in table(Fire, Standard, StoAll) : object 'Fire' not found

I do not understand this, because when I read the table everything seems correct.

Stocking_all <- read.table("P://Benchmark//analysis//r//stocking_10//stocking_all.csv",
                           header = TRUE, sep = ",")
names(Stocking_all)
 [1] "ID_basic_tallesttree" "FID_plot"       "Fire"
 [4] "Time_Fire"            "Standard"       "Fire_Standard"
 [7] "ESR"                  "Fire_Stand"     "StoARS2000"
[10] "StoWAS2008"           "StoAll"

And I also get a summary for the variables. I have no clue and would very much appreciate your help. Stefanie
Re: [R] Object not found
You read them into a data frame, so you need to do:

ftable(table(Fire = Stocking_all$Fire, Standard = Stocking_all$Standard,
             StoAll = Stocking_all$StoAll),
       col.vars = c("Fire", "Standard"))

On Fri, May 22, 2009 at 2:33 PM, Gaertner, Stefanie stefanie.gaert...@ales.ualberta.ca wrote: [...]

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
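Two further idioms that avoid retyping the data-frame prefix for every column (a sketch, not from the original thread, reusing the column names above):

```r
## with() evaluates the expression using the data frame's columns:
with(Stocking_all,
     ftable(table(Fire = Fire, Standard = Standard, StoAll = StoAll),
            col.vars = c("Fire", "Standard")))

## xtabs() takes a formula plus a data argument directly:
ftable(xtabs(~ Fire + Standard + StoAll, data = Stocking_all),
       col.vars = c("Fire", "Standard"))
```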
Re: [R] Error in FUN with tapply and by
str(time)
function (x, ...)
str(t_p1)
 num [1:576] 190 180 190 200 210 200 220 190 230 230 ...
str(Baseline.integrated)
 Factor w/ 2 levels "Baseline","Integrated": 1 1 1 1 1 1 1 1 1 1 ...
str(Technique)
 Factor w/ 2 levels "Barrel","NonPrefHand": 1 1 1 1 1 1 1 1 1 1 ...
str(Grip)
 Factor w/ 2 levels "Round","Tripod": 1 1 1 1 1 1 1 1 1 1 ...

On Fri, May 22, 2009 at 2:46 PM, jim holtman jholt...@gmail.com wrote: You need to supply str for the original arguments; the error message had a different set of parameters.

On Fri, May 22, 2009 at 2:36 PM, Thomas Levine thomas.lev...@gmail.com wrote: That produces the following error:

tapply(t_p1, list(Grip, Technique), truecost)
Error in tapply(time, Baseline.integrated, mean, na.rm = T) :
  arguments must have same length

On Fri, May 22, 2009 at 1:06 PM, jim holtman jholt...@gmail.com wrote: The error message is self-explanatory: there is an unused parameter 'na.rm=TRUE'. You are calling your function 'truecost', which has only a single parameter 'time', and you are attempting to pass in 'na.rm=TRUE', which it will not accept. You don't need it.

On Fri, May 22, 2009 at 12:36 PM, Thomas Levine thomas.lev...@gmail.com wrote: A subset of my raw data looks like this:

Grip,Technique,Baseline.integrated,Task,Stroke..direction.,Engag,Disen,PenDG,PenUG,PenDS,PenUS,Duration
Tripod,Barrel,Integrated,7,S70,230,510,270,510,781,1011,1011
Tripod,Barrel,Integrated,7,S71,na,na,na,na,na,na,na
Round,NonPrefHand,Baseline,0,S00,na,na,110,250,380,520,520
Round,NonPrefHand,Baseline,0,S01,na,na,220,360,460,620,620

I computed some values (times) from the raw data:

t_p1 = PenDG
t_c1 = PenUG - PenDG
t_p2 = PenDS - PenUG
t_c2 = PenUS - PenDS

And I put those times in a data frame called times. For each of these times, I want to subtract the average for Baseline trials from the average for Integrated trials within the Grip and Technique factors. Call these differences the true cost of mode selection.
truecost <- function(time){
  as.numeric(tapply(time, Baseline.integrated, mean, na.rm = TRUE)[2] -
             tapply(time, Baseline.integrated, mean, na.rm = TRUE)[1])
}

To help explain what the truecost function does:

tapply(t_p1, Baseline.integrated, mean, na.rm = TRUE)
  Baseline Integrated
  212.8000   252.8402
truecost(t_p1)
[1] 40.04021

Then I try to create a table of average truecost as a function of levels of a factor. I think this is the same error with tapply and by:

tapply(t_p1, list(Grip, Technique), truecost, na.rm = TRUE)
Error in FUN(X[[1L]], ...) : unused argument(s) (na.rm = TRUE)
by(times, list(Grip, Technique), truecost, na.rm = TRUE)
Error in FUN(data[x, , drop = FALSE], ...) : unused argument(s) (na.rm = TRUE)

Any ideas?

Thomas Levine!

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
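A note on the remaining error (not from the original thread): dropping na.rm = TRUE fixes the "unused argument" message, but the "arguments must have same length" error persists because truecost() refers to the full-length Baseline.integrated while tapply() hands it only a subset of t_p1. A sketch of one way to keep the grouping variable aligned with the data, assuming the thread's variable names:

```r
## Put the measurement and all grouping factors in one data frame so
## that by() subsets them together:
times.df <- data.frame(t_p1, Grip, Technique, Baseline.integrated)

truecost <- function(d) {
  m <- tapply(d$t_p1, d$Baseline.integrated, mean, na.rm = TRUE)
  as.numeric(m["Integrated"] - m["Baseline"])
}

## True cost of mode selection for each Grip x Technique cell:
by(times.df, list(times.df$Grip, times.df$Technique), truecost)
```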
Re: [R] Barchart in lattice - wrong order of groups, data labels on top of each other, and a legend question
Thank you very much, Gabor!

On Thu, May 21, 2009 at 9:37 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote: If you are willing to go down one level and work at the grid level, then you can do it without modifying the panel function. Below, gg.ls$name lists the grid object names. Within that list look at the ones that have "rect" in their name. Among those are 5 in succession (the 2nd through 6th rect objects). Similarly look for a text object in that vicinity; that would be the third of the objects with "text" in their name. We want to reset the positioning of the text in the text object using the info from the rect objects that form the bars. Thus:

# first run your code as posted, then run this
library(grid)
gg <- grid.grab()
gg.ls <- grid.ls(gg)
# based on gg.ls$name we make the observations above
rect.names <- grep("rect", gg.ls$name, value = TRUE)[2:6]
rect.x <- c(sapply(rect.names, function(x) grid.get(x)$x))
rect.height <- c(sapply(rect.names, function(x) grid.get(x)$height))
text.name <- grep("text", gg.ls$name, value = TRUE)[3]
grid.edit(text.name, x = unit(rect.x, "native"),
          y = unit(rect.height + 1, "native"))

On Thu, May 21, 2009 at 3:58 PM, Dimitri Liakhovitski ld7...@gmail.com wrote: Deepayan, thank you very much for your response. I have a general question -- and please remember, I am really just a beginner in R. Is it truly the case that in order to build a quite basic bar chart with value labels attached to it, I have to be a true R graphics guru, because the only way to achieve what I am trying to achieve is to modify the underlying R function (panel.barchart)? Really? Dimitri

On Tue, May 19, 2009 at 7:53 PM, Deepayan Sarkar deepayan.sar...@gmail.com wrote: On Mon, May 18, 2009 at 11:47 AM, Dimitri Liakhovitski ld7...@gmail.com wrote: Hello! I have a question about my lattice barchart that I am trying to build in Section 3 below. I can't figure out a couple of things: 1.
When I look at the dataframe test that I am trying to plot, it looks right to me (the group Total is always the first out of 5). However, in the chart it is the last. Why? 2. How can I make sure the value labels (on y) are not sitting on top of each other but on top of the respective bar? 3. Is there any way to make the legend group items horizontally, as opposed to now (vertically -- taking up too much space)?

For 1 and 3, use

auto.key = list(points = FALSE, rectangles = TRUE, reverse.rows = TRUE,
                columns = 2, space = "bottom")

From ?xyplot (under 'key'):

'reverse.rows' logical, defaulting to 'FALSE'. If 'TRUE', all components are reversed _after_ being replicated (the details of which may depend on the value of 'rep'). This is useful in certain situations, e.g. with a grouped 'barchart' with 'stack = FALSE' with the categorical variable on the vertical axis, where the bars in the plot will usually be ordered from bottom to top, but the corresponding legend will have the levels from top to bottom (unless, of course, 'reverse.rows = TRUE'). Note that in this case, unless all columns have the same number of rows, they will no longer be aligned.

'columns' the number of column-blocks the key is to be divided into, which are drawn side by side.

2 is hard with a simple custom panel function, because you need to replicate some fairly involved calculations that are performed in panel.barchart. Your best bet is to start with a copy of panel.barchart, and then add calls to panel.text at suitable places.

-Deepayan

Thanks a lot!
Dimitri

### Section 1: generates my data set "data" - just run: #
N <- 100
myset1 <- c(1, 2, 3, 4, 5)
probs1 <- c(.05, .10, .15, .40, .30)
myset2 <- c(0, 1)
probs2 <- c(.65, .30)
myset3 <- c(1, 2, 3, 4, 5, 6, 7)
probs3 <- c(.02, .03, .10, .15, .20, .30, .20)
group <- unlist(lapply(1:4, function(x){
  out <- rep(x, 25)
  return(out)
}))
set.seed(1)
a <- sample(myset1, N, replace = TRUE, probs1)
a[which(rbinom(100, 2, .01) == 1)] <- NA
set.seed(12)
b <- sample(myset1, N, replace = TRUE, probs1)
b[which(rbinom(100, 2, .01) == 1)] <- NA
set.seed(123)
c <- sample(myset2, N, replace = TRUE, probs2)
set.seed(1234)
d <- sample(myset2, N, replace = TRUE, probs2)
set.seed(12345)
e <- sample(myset3, N, replace = TRUE, probs3)
e[which(rbinom(100, 2, .01) == 1)] <- NA
set.seed(123456)
f <- sample(myset3, N, replace = TRUE, probs3)
f[which(rbinom(100, 2, .01) == 1)] <- NA
data <- data.frame(group, a = a, b = b, c = c, d = d, e = e, f = f)
data["group"] <- lapply(data["group"], function(x) {
  x[x %in% 1] <- "Group 1"
  x[x %in% 2] <- "Group 2"
  x[x %in% 3] <- "Group 3"
Re: [R] step by step debugger in R?
Hi, Romain Francois wrote: Duncan Murdoch wrote: On 5/22/2009 10:59 AM, Michael wrote: Really, I think if there were a Visual Studio-strength debugger, our collective time spent developing R code would be greatly reduced. If someone who knows how to write a debugger plugin for Eclipse wants to help, we could have that fairly easily. All the infrastructure is there; it's the UI part that's missing. Duncan Murdoch [I've copied Mark Bravington and Robert Gentleman to the list as they are likely to have views here, and I am not sure they monitor R-help] Hello, Making a front-end to debugging was one of the proposed Google Summer of Code projects for this year [1]; it was not retained in the end, but I am still motivated. Pretty much all the infrastructure is there, and some work has been done __very recently__ in R's debugging internals (ability to step up). As I see it, the ability to call some sort of hook each time the debugger waits for input would make it much easier for someone to write I have still not come to an understanding of what this is supposed to do? When you have the browser prompt you can call any function or code you want to. There is no need for something special to allow you to do that. front-ends. A recent post of mine (patch included) [2] on R-devel suggested a custom prompt for browser which would do the trick, but I now think that a hook would be more appropriate. Without something similar to that, there is no way that I know of for making a front-end, unless maybe if you embed R ... (please let me know how I am wrong)
These are things the internal debugger can't do. There is also the debug package [3,4] which does __not__ work with R internals but rather works with instrumenting tricks at the R level. debug provides a tcl/tk front-end. It is my understanding that it does not work using R internals (do_browser, ...) because it was not possible at the time, and I believe this is still not possible today, but I might be wrong. I'd prefer to be wrong, actually. I don't understand this statement. It has always been possible to work with the internal version - but one can also take the approach of rewriting code. There are some difficulties supporting all the operations that one would like by rewriting code, and I think a combination of external controls and the internal debugger will get most of the functionality that anyone wants. There are some things that are hard, and once I have a more complete list I will be adding this to the appropriate manual. I will also be documenting the changes that I have been making, but that project is in flux and won't be done until the end of August, so people who want to look at it are welcome (it is in R-devel), but it is in development and could change pretty much without notice. Romain noted that we now support stepping out from one place to another function. We also have a debugonce flag that lets you get close to step in, but step in is very hard in R. I am mostly interested in writing tools in R that can be used by anyone that wants to write an external debugger, and am not that interested in any particular external debugger. I would be happy to listen to feature requests or issues that arise - but the discussion should probably be on the R-devel mailing list.
best wishes, Robert

Romain

[1]: http://www.r-project.org/soc09/ideas.html#p5
[2]: https://stat.ethz.ch/pipermail/r-devel/2009-April/053128.html
[3]: http://cran.r-project.org/web/packages/debug/index.html
[4]: http://cran.r-project.org/doc/Rnews/Rnews_2003-3.pdf

-- Robert Gentleman, PhD Program in Computational Biology Division of Public Health Sciences Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N, M1-B514 PO Box 19024 Seattle, Washington 98109-1024 206-667-7700 rgent...@fhcrc.org
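For readers who want to try the built-in hooks Robert mentions, a small sketch (not from the original thread) of debugonce() and of trace() rewriting a function to stop at a chosen step:

```r
f <- function(x) {
  y <- x^2
  z <- y + 1
  z
}

## Arm the debugger for a single call; subsequent calls run normally:
debugonce(f)
# f(2)   # enters the browser

## trace() rewrites the function to invoke browser() before step 2 of
## its body (the `z <- y + 1` line), without editing the source:
trace(f, tracer = browser, at = 2)
# f(2)
untrace(f)
```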
Re: [R] step by step debugger in R?
As a newbie, I'm trying to figure out how much more than what RKWard does is wanted. The code turns colors as syntax is checked, and errors are noted. It seems like a reasonable IDE. Maybe someone is looking for the same in Windows?

John F Lindsey 803-790-5006 Home, 803-790-5008 Cell, 919-439-9088 Home Office 3850 Northshore Rd., Columbia, SC 29206-3350 john.f.lind...@gmail.com, john-lind...@sc.rr.com http://www.linkedin.com/in/johnlindseysc

Robert Gentleman wrote: [...]
Re: [R] vcd package --- change layout of plot
On Thu, 21 May 2009, Prew, Paul wrote: Hello, I'm trying to use the vcd package to analyze survey data. Expert judges ranked possible features for product packaging. Seven features were listed, and 19 judges split between 2 cities ranked them. The following code (1) works, but the side-by-side plots for Cities PX, SF are shrunk too much. Stacking PX on top of SF would make for a better plot. (I could switch the order of the Feature and Rank dimensions and go with the default side-by-side, but would prefer not to.)

Several comments:

- Adding keep_aspect_ratio = FALSE will probably change what you call "shrunk too much". By default the aspect ratio is kept fixed for 2-way displays.
- I would switch the splitting order of Feature/Rank because you probably want the distribution of ranks for a given feature and not the other way round. In that case, you might find setting split_vertical = TRUE aesthetically more pleasing. But it's probably a matter of taste.
- Setting gp = shading_max is not really the best choice: if you want a conditional independence test based on the maximum of residuals, you should use panel = cotab_coindep. See the JCGS paper in the references and the accompanying vignette("residual-shadings", package = "vcd") for more details. But note that the double maximum test is not the most powerful test against ordinal alternatives (as one might expect given the ordered Rank variable).

hth, Z

(1)
cotabplot(~ Rank + Feature | Cities, data = Pack.dat, gp = shading_max,
          rot_labels = c(90, 0, 0, 0),
          just_labels = c("left", "left", "left", "right"),
          set_varnames = c(Feature = ""))

Reading the vcd help, I got lost trying to understand the panel-generating parameters I should use. My best guess was below (2), but it gave an error message. Clearly, I don't know what the paneling is asking for. This is where I would like some advice, if anyone is familiar with vcd.
(2) Tried to change the layout of the trellis plot from horizontal to vertical:

Pack.mos <- mosaic(~ Feature + Rank, data = Pack.tab, gp = shading_max,
                   rot_labels = c(0, 0, 0, 0),
                   just_labels = c("left", "left", "left", "right"),
                   set_varnames = c(Feature = ""))
## attempt to create an object for the panel argument in the cotabplot function
pushViewport(viewport(layout = grid.layout(ncol = 1)))
pushViewport(viewport(layout.pos.row = 1))
## tell vcd to change the default layout, and what to put in the top plot
cotabplot(~ Feature + Rank | Cities, data = Pack.dat, panel = Pack.mos,
          Pack.dat[["PX"]], gp = shading_max, rot_labels = c(0, 0, 0, 0))
## create mosaic plot that's conditional on Cities; first plot Cities = PX
## panel argument is an attempt to modify an example in the vcd help file
popViewport()
## create the graphic
Error: Cannot pop the top-level viewport (grid and graphics output mixed?)
# no point in going on to code the plot for layout.pos.row = 2

str(Pack.tab)
Error in `[.structable`(x, i, args[[2]]) : subscript out of bounds
class(Pack.tab)
[1] "structable" "ftable"
dim(Pack.tab)
[1] 7 2 7

              Cities PX SF
Rank Feature
1    Flexible          2  0
     Integrate.Probes  1  2
     Large/heavy       1  0
     Lockout           0  1
     Recyclable        3  5
     Rigid             0  0
     Small/light       2  1
2    Flexible          1  6
     Integrate.Probes  2  0
     Large/heavy       1  1
     Lockout           1  0
     Recyclable        2  0
     Rigid             1  0
     Small/light       2  2
3    Flexible          1  1
     Integrate.Probes  3  0
     Large/heavy       1  1
     Lockout           2  1
     Recyclable        1  3
     Rigid             0  0
     Small/light       0  3
4    Flexible          3  0
     Integrate.Probes  0  2
     Large/heavy       0  0
     Lockout           2  2
     Recyclable        0  1
     Rigid             1  2
     Small/light       3  2
5    Flexible          1  1
     Integrate.Probes  1  1
     Large/heavy       0  3
     Lockout           0  2
     Recyclable        2  0
     Rigid             3  1
     Small/light       2  1
6    Flexible          0  1
     Integrate.Probes  1  3
     Large/heavy       3  2
     Lockout           3  1
     Recyclable        0  0
     Rigid             2  2
     Small/light       0  0
7    Flexible          1  0
     Integrate.Probes  1  1
     Large/heavy       3  2
     Lockout           1  2
     Recyclable        1  0
     Rigid             2  4
     Small/light       0  0

sessionInfo()
R version 2.9.0 RC (2009-04-10 r48318) i386-pc-mingw32
locale: LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United
Re: [R] Naming a random effect in lmer
Here is a test data set and code. I am including the data set after the code and discussion to make reading easier. Apologies for the size of the data set, but my problem occurs when there are a lot of Z variables. Thanks for your time. # Enter data below
# Sample code
library(lme4)
mb <- length(unique(testsamp$WYear))
# Create the formula for the set of identically distributed random effects
Zs <- paste("Z", 2:(mb-1), sep="")
randommodel <- paste(Zs, collapse="+")
Trendformula <- as.formula(paste("LogY ~ WYear + (1+WYear|Site) + (1|", randommodel, ")"))
fittest <- lmer(Trendformula, data = testsamp)
summary(fittest)
# Here I get an error because the name of the random effect is too long to print
# in the random effects output (I think).
# The error message is: Error in names(bars) <- unlist(lapply(bars, function(x)
# deparse(x[[3]]))) : 'names' attribute [3] must be the same length as the vector [2]
# However, when fewer Z variables are used in the random portion of the model,
# there is no error.
# Using only Z2 + ... + Z9 for the random intercept
Zs2 <- paste("Z", 2:9, sep="")
Trendformula2 <- as.formula(paste("LogY ~ WYear + (1+WYear|Site) + (1|", paste(Zs2, collapse="+"), ")"))
fittest2 <- lmer(Trendformula2, data = testsamp)
summary(fittest2)
# Is there a way to either name the set of iid random effects something else or
# to define a random variable that could be used in the model to create a
# random intercept?
# I have had some success in lme, but it would be helpful for my simulation if I
# could conduct this analysis with lmer. My model in lme is not correctly
# estimating one of the variance components (random Site intercept).
# I am using:
detach(package:lme4)
library(nlme)
random.model.lme <- as.formula(paste("~-1+", paste(paste("Z", 2:(mb-1), sep=""), collapse="+")))
n <- dim(testsamp)[1]
testsampgroup <- rep(1, n)
testsamp.lme <- cbind(testsamp, testsampgroup)
testgroupSamp <- groupedData(LogY ~ WYearCen | testsampgroup, inner = ~Site, data = data.frame(testsamp.lme))
fittest3 <- lme(LogY ~ WYearCen, random = pdBlocked(list(pdIdent(~-1+WYearCen:as.factor(Site)), pdIdent(~-1+as.factor(Site)), pdIdent(random.model.lme))), data = testgroupSamp)
summary(fittest3)
VarCorr(fittest3)
# Data
testsamp <- structure(list(Year = c(2008, 2008, 2008, 2008, 2008, 2008, 2008,
2009, 2009, 2009, 2009, 2009, 2009, 2009, 2010, 2010, 2010, 2010, 2010, 2010, 2010,
2011, 2011, 2011, 2011, 2011, 2011, 2011, 2012, 2012, 2012, 2012, 2012, 2012, 2012,
2013, 2013, 2013, 2013, 2013, 2013, 2013, 2014, 2014, 2014, 2014, 2014, 2014, 2014,
2015, 2015, 2015, 2015, 2015, 2015, 2015, 2016, 2016, 2016, 2016, 2016, 2016, 2016,
2017, 2017, 2017, 2017, 2017, 2017, 2017, 2018, 2018, 2018, 2018, 2018, 2018, 2018,
2019, 2019, 2019, 2019, 2019, 2019, 2019, 2020, 2020, 2020, 2020, 2020, 2020, 2020,
2021, 2021, 2021, 2021, 2021, 2021, 2021, 2022, 2022, 2022, 2022, 2022, 2022, 2022),
WYear = c(0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2,
3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5,
6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8,
9, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, 11, 11,
12, 12, 12, 12, 12, 12, 12, 13, 13, 13, 13, 13, 13, 13, 14, 14, 14, 14, 14, 14, 14),
Site = c(4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94,
4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94,
4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94,
4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94,
4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94), LogY =
c(0.848648866298552, 0.809143925760456, 0.734173952725014, 1.46749967704437, 0.716106254860468, 0.843871512951468, 1.09120092433378, 0.800893809796851, 0.996674977596997, 0.917613604481207, 1.71928772722884, 0.797853604855215, 0.922298691760041, 0.964654422529188, 0.782903180421921, 1.13457969553106, 1.21628917384868, 2.03084776647495, 0.872954085910578, 1.02192794559856, 0.746307251774509, 0.439812203188778, 1.11164109549224, 1.18414357836729, 2.00157711459358, 0.66577753155877, 0.856374433428581, 0.343862060001402, 0.278505653147057, 1.20632152691478, 1.32289150679746, 2.19814430598707, 0.538363941164496, 0.820038163290321, 0.070054765524828, 0.0479738684639024, 1.29137364087568, 1.52436357249377, 2.32150525025777, 0.595507040392793, 0.851417610550757, -0.115908144193410, 0.0118306018140099, 1.39448009350962, 1.71677754106603, 2.59146662837284, 0.595750060671620, 0.855387479311679, -0.430729785591898, 0.0178423104900579, 1.60964246000316, 1.99184029256509, 2.86865842252168, 0.69512483409, 0.96175860396451, -0.600991172113926, -0.174420349224615, 1.73794158380868, 2.06718359946362, 3.04502112974038, 0.730730638403177, 0.961110819398807, -0.856693722990918,
[R] stripchart and ordering of plot
Take for example the following stripchart that's created:
b <- 1:5
a <- 11:15
e <- 21:25
f <- -11:-15
foo <- rbind(b, a, e, f)
stripchart(foo ~ rownames(foo))
In this case, I would like the bottom part of the plot to be the f vector, followed by the e vector, etc. However, R reorders the matrix and plots according to the alphabetical order of the rownames of foo. Is there a way to plot the matrix in the current order of the rownames? Thanks, Andrew __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
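One way to get the requested ordering (a sketch, not from the thread): plot from a value vector plus a grouping factor whose levels are listed in the desired order, since stripchart draws the first factor level at the bottom.

```r
# Sketch: control strip order via factor levels instead of relying on
# the default alphabetical ordering of rownames.
b <- 1:5; a <- 11:15; e <- 21:25; f <- -11:-15
foo <- rbind(b, a, e, f)
# as.vector(foo) unrolls column-major, so each column contributes one
# value per row name, matching rep(rownames(foo), ncol(foo)).
grp <- factor(rep(rownames(foo), ncol(foo)), levels = rev(rownames(foo)))
stripchart(as.vector(foo) ~ grp)  # f at the bottom, then e, a, b
```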
Re: [R] vcd package --- change layout of plot
Dear Achim, thank you very much for the suggestions, they work well. Agreed that an ordinal logistic regression seems a more powerful choice of analysis method, and I'm using that also. Thanks for providing the vcd package, it's proving quite helpful. Regards, Paul Paul Prew | Statistician 651-795-5942 | fax 651-204-7504 Ecolab Research Center | Mail Stop ESC-F4412-A 655 Lone Oak Drive | Eagan, MN 55121-1560 -Original Message- From: Achim Zeileis [mailto:achim.zeil...@wu-wien.ac.at] Sent: Friday, May 22, 2009 3:06 PM To: Prew, Paul Cc: r-help@r-project.org Subject: Re: [R] vcd package --- change layout of plot On Thu, 21 May 2009, Prew, Paul wrote: Hello, I'm trying to use the vcd package to analyze survey data. Expert judges ranked possible features for product packaging. Seven features were listed, and 19 judges split between 2 cities ranked them. The following code (1) works, but the side-by-side plots for Cities PX, SF are shrunk too much. Stacking PX on top of SF would make for a better plot. (I could switch the order of Feature and Rank dimensions, and go with the default side-by-side, but would prefer not to). Several comments: - Adding keep_aspect_ratio = TRUE probably changes what you call shrunk too much. By default the aspect ratio is kept fixed for 2-way displays. - I would switch the splitting order of Feature/Rank because you probably want the distribution of ranks for a given feature and not the other way round. In that case, you might find setting split_vertical = TRUE aesthetically more pleasing. But it's probably a matter of taste. - Setting gp = shading_max is not really the best choice: If you want a conditional independence test based on the maximum of residuals, you should use panel = cotab_coindep. See the JCGS paper in the references and the accompanying vignette("residual-shadings", package = "vcd") for more details.
But note that the double maximum test is not the most powerful test against ordinal alternatives (as one might expect due to the ordered Rank variable). hth, Z (1) cotabplot(~ Rank + Feature | Cities, data = Pack.dat, gp = shading_max, rot_labels = c(90, 0, 0, 0), just_labels = c("left", "left", "left", "right"), set_varnames = c(Feature = )) Reading the vcd help, I got lost trying to understand the panel-generating parameters I should use. My best guess was below (2), but gave an error message. Clearly, I don't know what the paneling is asking for. This is where I would like some advice, if anyone is familiar with vcd. (2) Tried to change the layout of trellis plot from horizontal to vertical
Pack.mos <- mosaic(~ Feature + Rank, data = Pack.tab, gp = shading_max, rot_labels = c(0, 0, 0, 0), just_labels = c("left", "left", "left", "right"), set_varnames = c(Feature = )) ## attempt to create an object for panel argument in cotabplot function
pushViewport(viewport(layout = grid.layout(ncol = 1)))
pushViewport(viewport(layout.pos.row = 1)) ## tell vcd to change the default layout, and what to put in the top plot
cotabplot(~ Feature + Rank | Cities, data = Pack.dat, panel = Pack.mos, Pack.dat[["PX"]], gp = shading_max, rot_labels = c(0, 0, 0, 0)) ## create mosaic plot that's conditional on Cities; first plot Cities = PX ## panel argument is an attempt to modify an example in the vcd help file
popViewport() ## create the graphic
Error: Cannot pop the top-level viewport (grid and graphics output mixed?)
# no point in going on to code the plot for layout.pos.row = 2
str(Pack.tab)
Error in `[.structable`(x, i, args[[2]]) : subscript out of bounds
class(Pack.tab)
[1] "structable" "ftable"
dim(Pack.tab)
[1] 7 2 7
[ ... quoted Pack.tab table elided; see the full table earlier in this thread ... ]
[R] Goodness of fit for MLE?
Hi all, How do I evaluate how good my MLE fit is? Moreover, suppose I have 30 data points in one case and 50 in another; how do I compare which one gives the better goodness of fit? What are the common test procedures to assess the goodness of fit for MLE? Thanks!
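As a sketch of common checks (the post names no model, so assume a normal distribution fitted by maximum likelihood via stats4::mle): AIC/log-likelihood compares candidate models fitted to the same data, while a one-sample Kolmogorov-Smirnov test gives a rough absolute check of fit.

```r
library(stats4)

set.seed(1)
x <- rnorm(50, mean = 2, sd = 1)          # illustrative data

# Negative log-likelihood of a normal model
nll <- function(mu, sigma) -sum(dnorm(x, mu, sigma, log = TRUE))
fit <- mle(nll, start = list(mu = 0, sigma = 1),
           method = "L-BFGS-B", lower = c(-Inf, 1e-6))  # keep sigma > 0

AIC(fit)   # lower is better; only comparable across models on the SAME data
ks.test(x, "pnorm", coef(fit)[["mu"]], coef(fit)[["sigma"]])
```

Two caveats worth noting: AIC values are not comparable between a 30-point and a 50-point data set, so an absolute measure (KS statistic, QQ plot) is needed there; and the KS p-value is anti-conservative when the parameters were estimated from the same data (the Lilliefors correction addresses this for the normal case).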
Re: [R] Naming a random effect in lmer
I miscommunicated: In every application I've seen with large numbers of parameters to estimate, most of those parameters are specific instances of different levels of a random effect. For example, a colleague recently did a fixed-effects analysis of a longitudinal abundance survey of a large number of species of wildlife. This colleague claimed that treating species as a random effect was inappropriate, because the species were not selected by use of random numbers. With only two or three species, if the scientists were only concerned with those two or three species, this may be reasonable. However, most scientists -- and people who study their work -- will want to generalize beyond only the species in the study. Even if scientists were only interested in two or three species for which they had data, Stein's phenomenon suggests that even a rabid Bayesophobe would likely get lower mean square error, at the expense of some bias, from pretending the species were randomly selected from some larger population (http://en.wikipedia.org/wiki/James-Stein_estimator). If I were asked to referee a paper or serve on a thesis committee for a study estimating 30 random effects, I would not be happy unless there were a convincing discussion of why it was appropriate to estimate all those random effects separately; right now, I cannot imagine an application where that would be appropriate. If you and your advisor still feel that what you are doing makes sense, I suggest you first get the source code via svn checkout svn://svn.r-forge.r-project.org/svnroot/lme4 (or by downloading lme4_0.999375-30.tar.gz from http://cran.fhcrc.org/web/packages/lme4/index.html), then walk through the code line by line using the debug function (or browser or the debug package). From this, you will likely see either (a) how you can do what you want differently to achieve the same result or (b) how to modify the code so it does what you want. Hope this helps.
Spencer
Leigh Ann Starcevich wrote: [ ... quoted message elided; see the original post earlier in this thread ... ]
[R] mle() question
Is there a way to code the mle() function in library stats4 such that it switches optimization methods midstream (i.e. BFGS to Newton and back to BFGS, etc.)? Thanks, Stephen Collins, MPP | Analyst Health Benefits | Aon Consulting
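stats4::mle cannot switch optimizers within a single call, but a common workaround (a sketch, not from the thread) is to chain fits, restarting each run from the previous estimates via coef():

```r
library(stats4)

set.seed(1)
x <- rnorm(100, mean = 5, sd = 2)         # illustrative data

# Guard against sigma <= 0 so derivative-free search stays well-defined
nll <- function(mu, sigma) {
  if (sigma <= 0) return(Inf)
  -sum(dnorm(x, mu, sigma, log = TRUE))
}

# First pass with a derivative-free method...
fit1 <- mle(nll, start = list(mu = 0, sigma = 1), method = "Nelder-Mead")
# ...then polish from those estimates with a gradient-based method
fit2 <- mle(nll, start = as.list(coef(fit1)),
            method = "L-BFGS-B", lower = c(-Inf, 1e-6))
```

Note that optim() (which mle wraps) has no Newton method; for a true Newton-type step one would have to call nlm() directly between mle() runs and pass the estimates back and forth.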
Re: [R] Naming a random effect in lmer
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of spencerg Sent: Friday, May 22, 2009 3:01 PM To: Leigh Ann Starcevich Cc: r-help@r-project.org Subject: Re: [R] Naming a random effect in lmer [ ... elided statistical advice ... ] If you and your advisor still feel that what you are doing makes sense, I suggest you first get the source code via svn checkout svn://svn.r-forge.r-project.org/svnroot/lme4 (or by downloading lme4_0.999375-30.tar.gz from http://cran.fhcrc.org/web/packages/lme4/index.html), then walk through the code line by line using the debug function (or browser or the debug package). From this, you will likely see either (a) how you can do what you want differently to achieve the same result or (b) how to modify the code so it does what you want. The coding error is right in the error message: Error in names(bars) <- unlist(lapply(bars, function(x) deparse(x[[3]]))) and I suspect that traceback() would tell you that it came from a call to lmerFactorList. That code implicitly assumes that deparse() will produce a scalar character vector, but it doesn't if the input expression is complicated enough. Changing the deparse(x[[3]]) to deparse(x[[3]])[1] or to paste(collapse=" ", deparse(x[[3]])) would fix it. The first truncates the name and the second may make a very long name. There is at least one other use of that idiom in the lme4 code, and your dataset and analysis may require that all of them be fixed. Hope this helps. Spencer Leigh Ann Starcevich wrote: Here is a test data set and code. I am including the data set after the code and discussion to make reading easier. Apologies for the size of the data set, but my problem occurs when there are a lot of Z variables. Thanks for your time.
[ ... quoted code and data elided; see the original post earlier in this thread ... ]
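The failure mode described above is easy to reproduce (illustrative code, not lme4's internals): deparse() returns one string per output line, so a long one-sided bar term deparses to a vector of length greater than one, and pasting with collapse restores the scalar that the names<- assignment expects.

```r
# Build a long "(1 | Z2 + ... + Z40)" expression like the one in the
# failing formula (the Z names here are just illustrative).
long_bar <- parse(text = paste("(1 |",
                               paste(paste("Z", 2:40, sep = ""), collapse = " + "),
                               ")"))[[1]]

length(deparse(long_bar))                         # > 1: one string per line
length(paste(deparse(long_bar), collapse = " "))  # 1: the proposed fix
```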
[R] how to insert NULLs in lists?
I'm an experienced programmer, but learning R is making me lose the little hair I have left...
> list(NULL)
[[1]]
NULL
> length(list(NULL))
[1] 1
> x <- list()
> x[[1]] <- NULL
> x
list()
> length(x)
[1] 0
From the above experiment, it is clear that, although one can create a one-element list consisting of a NULL element, one can't get the same result by assigning NULL to the first element of an empty list. And it gets worse:
> x <- list(1, 2, 3)
> length(x)
[1] 3
> x[[2]] <- NULL
> length(x)
[1] 2
I just could NOT believe my eyes! Am I going crazy??? What I'm trying to do is so simple and straightforward: I want to be able to append NULL to a list, and, after the appending, have the last element of the list be NULL. Is that so unreasonable? How can it be done? TIA! Kynn
Re: [R] how to insert NULLs in lists?
Continuing your example below, x[4] <- list(NULL) I won't try to defend the semantics, which have been complained about before. However, note that with lists, x[i] is the list which consists of one member, the ith component of x, which is not the same as x[[i]], the ith component itself; so the assignment above sets the list that contains the 4th component of x to the list that contains NULL, doing what you desire. For similar reasons, x <- c(x, list(NULL)) would also work. I believe all of this is documented in the man page for [ ; but I grant that it takes some close reading and experimentation to get it. -- Bert Gunter Genentech Nonclinical Statistics -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Kynn Jones Sent: Friday, May 22, 2009 3:52 PM To: r-help@r-project.org Subject: [R] how to insert NULLs in lists? [ ... quoted message elided; see the original post earlier in this thread ... ]
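Putting the thread's answers together, the semantics can be summarized in a few lines:

```r
x <- list(1, 2, 3)
x[4] <- list(NULL)      # [<- with list(NULL) appends a NULL element
length(x)               # 4, and x[[4]] is NULL

x <- c(x, list(NULL))   # c() with list(NULL) also appends NULL
length(x)               # 5

x[[2]] <- NULL          # [[<- with NULL *removes* element 2
length(x)               # 4
```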
Re: [R] how to insert NULLs in lists?
Hi Kynn: this oddity is discussed in Patrick Burns's document called The R Inferno. I don't recall the fix, so I'm not sure if the code below is the same as what his book says to do, but it seems to do what you want.
x <- list()
x[[1]] <- 2
x
length(x)
print(str(x))
x[2] <- list(NULL)
x
length(x)
print(str(x))
Still, I would look at that document rather than just using the above, because I'm not an expeRt, so there might be a better way. Also, if you can't find Pat's website, let me know and I'll find it. I'm pretty sure that, if you google Patrick Burns, his site should be up at the top, and then his R Inferno is easy to find from there. It's quite a useful document, so I highly recommend it. On May 22, 2009, Kynn Jones kyn...@gmail.com wrote: I'm an experienced programmer, but learning R is making me lose the little hair I have left... list(NULL) [[1]] NULL length(list(NULL)) [1] 1 x <- list() x[[1]] <- NULL x list() length(x) [1] 0 From the above experiment, it is clear that, although one can create a one-element list consisting of a NULL element, one can't get the same result by assigning NULL to the first element of an empty list. And it gets worse: x <- list(1, 2, 3) length(x) [1] 3 x[[2]] <- NULL length(x) [1] 2 I just could NOT believe my eyes! Am I going crazy??? What I'm trying to do is so simple and straightforward: I want to be able to append NULL to a list, and, after the appending, have the last element of the list be NULL. Is that so unreasonable? How can it be done? TIA! Kynn __ [1]r-h...@r-project.org mailing list [2]https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide [3]http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. References 1. mailto:R-help@r-project.org 2. https://stat.ethz.ch/mailman/listinfo/r-help 3.
http://www.R-project.org/posting-guide.html
Re: [R] Need a faster function to replace missing data
Jim, Thanks! I like the way you use indexing instead of the loops. However, the findInterval function does not give the right result. I have been playing with it, and it seems to give the closest number that is less than the one of interest. In this case, the correct replacement should have been 40, not 30, since 12:15 from mygarmin is closer to 12:14 in myvscan than 12:10. Is there a way to get the function to find the closest in value instead of the next smaller value? I was trying to use which.min to get the closest date but can't seem to get it to work right either. Aloha, Tim Tim Clark Department of Zoology University of Hawaii --- On Fri, 5/22/09, jim holtman jholt...@gmail.com wrote: From: jim holtman jholt...@gmail.com Subject: Re: [R] Need a faster function to replace missing data To: Tim Clark mudiver1...@yahoo.com Cc: r-help@r-project.org Date: Friday, May 22, 2009, 7:24 AM I think this does what you want. It uses 'findInterval' to determine where a possible match is:
myvscan <- data.frame(c(1, NA, 1.5), as.POSIXct(c("12:00:00", "12:14:00", "12:20:00"), format="%H:%M:%S"))
names(myvscan) <- c("Latitude", "DateTime")
myvscan$tn <- as.numeric(myvscan$DateTime) # numeric for findInterval
mygarmin <- data.frame(c(20, 30, 40), as.POSIXct(c("12:00:00", "12:10:00", "12:15:00"), format="%H:%M:%S"))
names(mygarmin) <- c("Latitude", "DateTime")
mygarmin$tn <- as.numeric(mygarmin$DateTime)
# use 'findInterval'
na.indx <- which(is.na(myvscan$Latitude)) # find NAs
# replace with garmin latitude
myvscan$Latitude[na.indx] <- mygarmin$Latitude[findInterval(myvscan$tn[na.indx], mygarmin$tn)]
myvscan
  Latitude            DateTime         tn
1      1.0 2009-05-22 12:00:00 1243008000
2     30.0 2009-05-22 12:14:00 1243008840
3      1.5 2009-05-22 12:20:00 1243009200
On Fri, May 22, 2009 at 12:45 AM, Tim Clark mudiver1...@yahoo.com wrote: Dear List, I need some help in coming up with a function that will take two data sets, determine if a value is missing in one, find a value in the second that was taken at about the same time, and
substitute the second value in for where the first should have been. My problem is from a fish tracking study. We put acoustic tags in fish and track them for several days. Location data is supposed to be automatically recorded every time we detect a ping from the fish. Unfortunately the GPS had some problems, and sometimes the fish's depth was recorded but not its location. I fortunately had a back-up GPS that was taking location data every five minutes. I would like to merge the two files, replacing the missing value in the vscan (automatic) file with the location from the garmin file. Since we were getting vscan records every 1-2 seconds and garmin records every 5 minutes, I need to find the right place in the vscan file to place the garmin record - i.e. the closest in time, but not greater than 5 minutes. I have written a function that does this. However, it works with my test data but locks up my computer with my real data. I have several million vscan records and several thousand garmin records. Is there a better way to do this? My function and test data:
myvscan <- data.frame(c(1, NA, 1.5), times(c("12:00:00", "12:14:00", "12:20:00")))
names(myvscan) <- c("Latitude", "DateTime")
mygarmin <- data.frame(c(20, 30, 40), times(c("12:00:00", "12:10:00", "12:15:00")))
names(mygarmin) <- c("Latitude", "DateTime")
minute.diff <- 1/24/12 # Time diff is in days, so this is 5 minutes
for (k in 1:nrow(myvscan)) {
  if (is.na(myvscan$Latitude[k])) {
    if (min(abs(mygarmin$DateTime - myvscan$DateTime[k])) < minute.diff) {
      index.min.date <- which.min(abs(mygarmin$DateTime - myvscan$DateTime[k]))
      myvscan$Latitude[k] <- mygarmin$Latitude[index.min.date]
}}}
I appreciate your help and advice. Aloha, Tim Tim Clark Department of Zoology University of Hawaii
-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
[R] rank reduction method in time series analysis?
Hi all, Suppose I have 100 simultaneous time series. What's the best statistical procedure to figure out a transformation of the data, to see whether we can squeeze most of the information into a few transformed time series? Thanks!
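Principal component analysis is the usual first pass at this kind of rank reduction (a sketch with simulated data; the question itself names no method): the leading principal components are the "few transformed series" that capture most of the variance.

```r
set.seed(1)
n  <- 200                                  # time points
f1 <- cumsum(rnorm(n))                     # one shared latent driver
# 100 series, each a scaled copy of the driver plus noise
X  <- sapply(1:100, function(i) f1 * runif(1, 0.5, 2) + rnorm(n, sd = 0.3))

pc   <- prcomp(X, center = TRUE, scale. = TRUE)
prop <- summary(pc)$importance["Proportion of Variance", ]
head(prop, 3)            # PC1 dominates when one driver explains most series
scores <- pc$x[, 1:3]    # the few transformed time series
```

Note that plain PCA ignores serial dependence; if lead/lag structure matters, dynamic factor models (or factanal for a classical factor-analysis view) are the natural next step.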
Re: [R] Need a faster function to replace missing data
Here is a modification that should now find the closest:
myvscan <- data.frame(c(1, NA, 1.5), as.POSIXct(c("12:00:00", "12:14:00", "12:20:00"),
+   format="%H:%M:%S")) # convert to numeric
names(myvscan) <- c("Latitude", "DateTime")
myvscan$tn <- as.numeric(myvscan$DateTime) # numeric for findInterval
mygarmin <- data.frame(c(20, 30, 40), as.POSIXct(c("12:00:00", "12:10:00", "12:15:00"),
+   format="%H:%M:%S"))
names(mygarmin) <- c("Latitude", "DateTime")
mygarmin$tn <- as.numeric(mygarmin$DateTime)
# use 'findInterval'
na.indx <- which(is.na(myvscan$Latitude)) # find NAs
# create matrix of values to test the range
indices <- findInterval(myvscan$tn[na.indx], mygarmin$tn)
x <- cbind(indices,
+   abs(myvscan$tn[na.indx] - mygarmin$tn[indices]),     # lower
+   abs(myvscan$tn[na.indx] - mygarmin$tn[indices + 1])) # higher
# now determine which index is closer
closest <- x[,1] + (x[,2] > x[,3]) # determine the proper index
# replace with garmin latitude
myvscan$Latitude[na.indx] <- mygarmin$Latitude[closest]
myvscan
  Latitude            DateTime         tn
1      1.0 2009-05-23 12:00:00 124308
2     40.0 2009-05-23 12:14:00 1243080840
3      1.5 2009-05-23 12:20:00 1243081200
On Fri, May 22, 2009 at 7:39 PM, Tim Clark mudiver1...@yahoo.com wrote: Jim, Thanks! I like the way you use indexing instead of the loops. However, the findInterval function does not give the right result. I have been playing with it, and it seems to give the closest number that is less than the one of interest. In this case, the correct replacement should have been 40, not 30, since 12:15 from mygarmin is closer to 12:14 in myvscan than 12:10. Is there a way to get the function to find the closest in value instead of the next smaller value? I was trying to use which.min to get the closest date but can't seem to get it to work right either.
Aloha, Tim

Tim Clark
Department of Zoology
University of Hawaii

--- On Fri, 5/22/09, jim holtman jholt...@gmail.com wrote:

From: jim holtman jholt...@gmail.com
Subject: Re: [R] Need a faster function to replace missing data
To: Tim Clark mudiver1...@yahoo.com
Cc: r-help@r-project.org
Date: Friday, May 22, 2009, 7:24 AM

I think this does what you want. It uses 'findInterval' to determine where a possible match is:

myvscan <- data.frame(c(1, NA, 1.5),
    as.POSIXct(c("12:00:00", "12:14:00", "12:20:00"), format = "%H:%M:%S"))
names(myvscan) <- c("Latitude", "DateTime")
myvscan$tn <- as.numeric(myvscan$DateTime)  # numeric for findInterval
mygarmin <- data.frame(c(20, 30, 40),
    as.POSIXct(c("12:00:00", "12:10:00", "12:15:00"), format = "%H:%M:%S"))
names(mygarmin) <- c("Latitude", "DateTime")
mygarmin$tn <- as.numeric(mygarmin$DateTime)
# use 'findInterval'
na.indx <- which(is.na(myvscan$Latitude))  # find NAs
# replace with garmin latitude
myvscan$Latitude[na.indx] <- mygarmin$Latitude[findInterval(myvscan$tn[na.indx], mygarmin$tn)]
myvscan
  Latitude            DateTime         tn
1      1.0 2009-05-22 12:00:00 1243008000
2     30.0 2009-05-22 12:14:00 1243008840
3      1.5 2009-05-22 12:20:00 1243009200

On Fri, May 22, 2009 at 12:45 AM, Tim Clark mudiver1...@yahoo.com wrote:

Dear List,

I need some help in coming up with a function that will take two data sets, determine whether a value is missing in one, find a value in the second that was taken at about the same time, and substitute the second value where the first should have been. My problem is from a fish tracking study. We put acoustic tags in fish and track them for several days. Location data is supposed to be recorded automatically every time we detect a ping from a fish. Unfortunately, the GPS had some problems, and sometimes the fish's depth was recorded but not its location. Fortunately, I had a back-up GPS that was taking location data every five minutes.
I would like to merge the two files, replacing the missing values in the vscan (automatic) file with the locations from the garmin file. Since we were getting vscan records every 1-2 seconds and garmin records every 5 minutes, I need to find the right place in the vscan file to put each garmin record - i.e. the closest in time, but not more than 5 minutes away. I have written a function that does this. However, while it works with my test data, it locks up my computer with my real data. I have several million vscan records and several thousand garmin records. Is there a better way to do this?

My function and test data:

library(chron)  # for times()
myvscan <- data.frame(c(1, NA, 1.5), times(c("12:00:00", "12:14:00", "12:20:00")))
names(myvscan) <- c("Latitude", "DateTime")
mygarmin <- data.frame(c(20, 30, 40), times(c("12:00:00", "12:10:00", "12:15:00")))
names(mygarmin) <- c("Latitude", "DateTime")
minute.diff <- 1/24/12  # time diff is in days, so this is 5 minutes
for (k in 1:nrow(myvscan)) {
  if (is.na(myvscan$Latitude[k])) {
    if
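[For the "closest in time, but not more than 5 minutes away" requirement, the two ideas in this thread - findInterval for speed and a nearest-neighbour check - can be combined into one vectorised helper. The function name nearest.match and the numeric-seconds representation are my own illustration, not code from the thread.]

```r
# Sketch: for each value in x, find the index of the nearest value in
# (sorted) table, returning NA when the nearest value is more than 'tol'
# away. 'nearest.match' is a hypothetical helper name.
nearest.match <- function(x, table, tol = 5 * 60) {
  idx <- findInterval(x, table)          # largest table value <= x
  lo  <- pmax(idx, 1)                    # guard against idx == 0
  hi  <- pmin(idx + 1, length(table))    # guard against running off the end
  nearer <- ifelse(abs(x - table[hi]) < abs(x - table[lo]), hi, lo)
  nearer[abs(x - table[nearer]) > tol] <- NA  # too far away: no match
  nearer
}

vscan  <- c(0, 840, 1200)   # 12:00:00, 12:14:00, 12:20:00 as seconds
garmin <- c(0, 600, 900)    # 12:00:00, 12:10:00, 12:15:00 as seconds
nearest.match(vscan, garmin)  # 12:14 matches 12:15, not 12:10
```

Applied to the thread's data, this picks the 12:15 garmin record for the 12:14 vscan gap, and no loop over the millions of vscan rows is needed.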
Re: [R] stripchart and ordering of plot
Thanks, that works great.

Andrew

On Fri, May 22, 2009 at 5:42 PM, William Dunlap wdun...@tibco.com wrote:

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Andrew Yee
Sent: Friday, May 22, 2009 2:16 PM
To: r-h...@stat.math.ethz.ch
Subject: [R] stripchart and ordering of plot

Take for example the following stripchart that's created:

b <- 1:5
a <- 11:15
e <- 21:25
f <- -11:-15
foo <- rbind(b, a, e, f)
stripchart(foo ~ rownames(foo))

In this case, I would like the bottom part of the plot to be the f vector, followed by the e vector, etc. However, R reorders the matrix and plots according to the alphabetical order of the rownames of foo. Is there a way to plot the matrix in the current order of the rownames?

If you supply a factor, it will use the order of the factor levels. If you supply a character vector, it is converted into a factor, and the converter puts the levels in alphabetical order by default. Hence one solution is to change that character vector into a factor with the desired order of levels:

stripchart(foo ~ factor(rownames(foo), levels = c("f", "e", "a", "b")))

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com

Thanks,
Andrew
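[The fix above hinges on one general fact: whenever R groups by a character vector, it first converts it to a factor, and the factor's level order - alphabetical unless you say otherwise - decides the display order. A minimal sketch of that behaviour, independent of stripchart:]

```r
# Factor level order controls grouping/plotting order in R.
x <- c("b", "a", "e", "f")

f.default  <- factor(x)                                   # alphabetical levels
f.explicit <- factor(x, levels = c("f", "e", "a", "b"))   # caller-chosen order

levels(f.default)    # "a" "b" "e" "f"
levels(f.explicit)   # "f" "e" "a" "b"
```

The same trick works for boxplot, barplot, ggplot2 axes, and anything else that groups by a factor.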