[R] Cannot set correct miktex path for pdflatex
Hi all,

I am having a problem in R where R finds an old, nonexistent version of MiKTeX rather than the new version. This occurs despite my having set the path to the correct location.

For example, in bash, if I look for the location of pdflatex:

$ which pdflatex
/c/Program Files/MiKTeX 2.9/miktex/bin/x64/pdflatex

it points to the correct MiKTeX installation. However, in R:

Sys.which("pdflatex")
pdflatex
"C:\\PROGRA~1\\MIKTEX~1.9\\miktex\\bin\\x64\\pdflatex.exe"

This points to the old (1.9) version of MiKTeX.

This is my session info:

R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] compiler_3.5.1

Any thoughts? I have unsuccessfully tried:

1. Adding the correct MiKTeX path to my Renviron.site. This adds MiKTeX to my path and I can see the addition, but it doesn't fix the problem.
2. Adding the MiKTeX path to my $PATH variable. This lets bash find the right version of MiKTeX but doesn't help in R.
3. Making sure I only have one version of MiKTeX. I don't have TinyTeX installed (nor can I, because I need the full MiKTeX for other work).

Best,
Claire
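One thing worth ruling out before changing more settings: on Windows, Sys.which() reports 8.3 short path names, and "MIKTEX~1.9" is consistent with the short form of a folder called "MiKTeX 2.9" (just as "PROGRA~1" stands for "Program Files"). The sketch below expands the result to its long form so the two locations can be compared directly. The helper name resolve_tool is made up for illustration, and whether this explains the situation on the poster's machine is an assumption.

```r
# Hypothetical helper (not part of base R): look a tool up on the PATH
# that R sees and expand the result to its long path form.
resolve_tool <- function(tool) {
  found <- unname(Sys.which(tool))
  if (is.na(found) || !nzchar(found)) {
    return(NA_character_)          # tool not found on the PATH R sees
  }
  # On Windows, normalizePath() converts 8.3 short names to long names.
  normalizePath(found, winslash = "/", mustWork = FALSE)
}

resolve_tool("pdflatex")

# The PATH as R sees it, one entry per element, for comparison with bash:
strsplit(Sys.getenv("PATH"), .Platform$path.sep, fixed = TRUE)[[1]]
```

If the expanded path is the MiKTeX 2.9 folder, R and bash are already pointing at the same installation.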
[R] Can't seem to install packages
Hello,

I can't seem to install R packages. Since there seemed to be some permission problems, I chmoded /usr/share/R/ and /usr/lib/R/. However, there are still errors in the process.

Here's my config:

sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] ggplot2_1.0.1 BiocInstaller_1.16.5

loaded via a namespace (and not attached):
[1] colorspace_1.2-6 digest_0.6.8 grid_3.1.1 gtable_0.1.2
[5] magrittr_1.5 MASS_7.3-40 munsell_0.4.2 plyr_1.8.2
[9] proto_0.3-10 Rcpp_0.11.6 reshape2_1.4.1 scales_0.2.4
[13] stringi_0.4-1 stringr_1.0.0 tcltk_3.1.1 tools_3.1.1

And here are some packages I tried to install:

install.packages("XML")
Installing package into '/packages/rsat/R-scripts/Rpackages'
(as 'lib' is unspecified)
trying URL 'http://ftp.igh.cnrs.fr/pub/CRAN/src/contrib/XML_3.98-1.1.tar.gz'
Content type 'text/html' length 1582216 bytes (1.5 Mb)
opened URL
==
downloaded 1.5 Mb
* installing *source* package 'XML' ...
** package 'XML' successfully unpacked and MD5 sums checked
checking for gcc... gcc
checking for C compiler default output file name... rm: cannot remove 'a.out.dSYM': Is a directory
a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking how to run the C preprocessor... gcc -E
checking for sed... /bin/sed
checking for pkg-config... /usr/bin/pkg-config
checking for xml2-config... no
Cannot find xml2-config
ERROR: configuration failed for package 'XML'
* removing '/packages/rsat/R-scripts/Rpackages/XML'
The downloaded source packages are in '/tmp/RtmphODjkn/downloaded_packages'
Warning message:
In install.packages("XML") :
installation of package 'XML' had non-zero exit status

install.packages("Biostrings")
Installing package into '/packages/rsat/R-scripts/Rpackages'
(as 'lib' is unspecified)
Warning message:
package 'Biostrings' is not available (for R version 3.1.1)

biocLite("Biostrings")
[...]
io_utils.c:16:18: fatal error: zlib.h: No such file or directory
#include <zlib.h>
^
compilation terminated.
/usr/lib/R/etc/Makeconf:128: recipe for target 'io_utils.o' failed
make: *** [io_utils.o] Error 1
ERROR: compilation failed for package 'Biostrings'
* removing '/packages/rsat/R-scripts/Rpackages/Biostrings'
The downloaded source packages are in '/tmp/RtmphODjkn/downloaded_packages'
Warning message:
In install.packages(pkgs = pkgs, lib = lib, repos = repos, ...) :
installation of package 'Biostrings' had non-zero exit status

I've used R on several machines before and never had such problems. Thanks for any clue!

--
Claire Rioualen
--
Lab. Technological Advances for Genomics and Clinics (TAGC)
INSERM Unit U1090, Aix-Marseille Université (AMU).
163, Avenue de Luminy, 13288 MARSEILLE cedex 09. France
Re: [R] Can't seem to install packages
Hello,

Indeed I've had a lot of dependency issues, but I'm solving them one after the other. Thanks for your time!

CR

On Thu, May 28, 2015 at 5:33 PM, Martin Morgan mtmor...@fredhutch.org wrote:

On 05/28/2015 08:21 AM, Duncan Murdoch wrote:

On 28/05/2015 6:10 AM, Claire Rioualen wrote:

[...]
checking for xml2-config... no
Cannot find xml2-config
ERROR: configuration failed for package 'XML'

This is a missing system dependency, requiring the libxml2 'dev' headers. On my Linux this is

sudo apt-get install libxml2-dev

Likely you'll also end up needing curl via libcurl4-openssl-dev or similar.

[...]
Warning message:
package 'Biostrings' is not available (for R version 3.1.1)
biocLite("Biostrings")

Yes, Bioconductor versions packages differently from CRAN (we have twice-yearly releases and stable 'release' and 'devel' branches). Following the instructions for package installation at http://bioconductor.org/packages/Biostrings but...

[...]
io_utils.c:16:18: fatal error: zlib.h: No such file or directory
#include <zlib.h>

This seems like a relatively basic header to be missing, installable from zlib1g-dev, but I wonder if you're taking a misstep earlier, e.g., trying to install on a cluster node that is configured for software use but not installation? Also, the instructions here to install R, http://cran.r-project.org/bin/linux/, would likely include these basic dependencies 'out of the box'.

Martin

[...]
I've used R on several machines before and never had such problems. Thanks for any clue!

It's hard to read your message (I think it was posted in HTML), but I think those are all valid errors in building those packages. You appear to be missing some of their dependencies. This is not likely related to permissions.

Duncan Murdoch
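The diagnosis in the replies above (missing system libraries rather than permissions) can be checked from inside R before retrying an install. This is only a rough sketch: the helper names has_tool and has_header are invented here, and the header search directories are an assumption that will not cover every distribution (some keep headers in architecture-specific folders).

```r
# Hypothetical pre-flight helpers; the names are made up for this sketch.
has_tool <- function(tool) {
  nzchar(unname(Sys.which(tool)))
}

# Assumed search locations; adjust for your distribution.
has_header <- function(header,
                       dirs = c("/usr/include", "/usr/local/include")) {
  any(file.exists(file.path(dirs, header)))
}

checks <- c(
  "xml2-config (libxml2-dev)" = has_tool("xml2-config"),
  "zlib.h (zlib1g-dev)"       = has_header("zlib.h")
)
print(checks)

if (!all(checks)) {
  message("Possibly missing system dependencies: ",
          paste(names(checks)[!checks], collapse = ", "))
}
```

Anything reported as missing would still have to be installed with the system package manager, as described in the reply.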
Re: [R] Stepwise rQTL-unknown warning message and odd QTL curve
Sorry, I'll try to provide more detail about what I have done so far, with code and any relevant output.

library(qtl)
sawfly.cross <- read.cross(format="csv", file="~/Desktop/Sawfly_data/QTL/Sawfly_QTL.csv", na.strings="NA", genotypes=c("A", "B"), alleles=c("A", "B"), estimate.map=F)

--Read the following data:
430 individuals
506 markers
19 phenotypes
--Cross type: bc

print(sawfly.cross)

--This is an object of class cross. It is too complex to print, so we provide just this summary.
Backcross
No. individuals: 430
No. phenotypes: 19
Percent phenotyped: 99.8 99.8 99.3 99.1 99.1 99.1 99.1 99.1 99.5 99.8 99.8 99.5 98.8 99.8 99.8 99.8 99.8 98.4 99.5
No. chromosomes: 7
Autosomes: 1 2 3 4 5 6 7
Total markers: 506
No. markers: 103 89 75 74 65 51 49
Percent genotyped: 96.2
Genotypes (%): AA:49.7 AB:50.3

sawfly.cross <- calc.genoprob(sawfly.cross, step=2.5, error.prob=0.1, map.function="kosambi", stepwidth="fixed")

**I am using head size as a covariate.**

head.covar <- pull.pheno(sawfly.cross, pheno.col=19)
sawfly.cross.stepwise.peryellow <- scantwo(sawfly.cross, pheno.col=2, model="normal", method="hk", addcovar=head.covar, use="all.obs", clean.output=F, verbose=T, n.perm=1000, batchsize=100); save.image("~/Desktop/Sawfly_data/QTL/SawflyQTL.RData")

--Warning messages:
1: In checkcovar(cross, pheno.col, addcovar, intcovar, perm.strata, : Dropping 1 individuals with missing phenotypes.
2: In checkcovar(cross, pheno.col, addcovar, intcovar, perm.strata, : Dropping 1 individuals with missing covariates.

sawfly.cross.stepwise.peryellow.pen <- calc.penalties(alpha=0.05, perms=sawfly.cross.stepwise.peryellow)
sawfly.cross.stepwise.peryellow.stepqtl <- stepwiseqtl(sawfly.cross, pheno.col=2, method="hk", max.qtl=10, penalties=sawfly.cross.stepwise.peryellow.pen, verbose=T, keeplodprofile=T, covar=head.covar, scan.pairs=F, keeptrace=T)

--Error in covar[!hasmissing, , drop = FALSE] : incorrect number of dimensions

**I corrected this with the next piece of code.**

sawfly.cross.stepwise.peryellow.stepqtl <- stepwiseqtl(sawfly.cross, pheno.col=2, method="hk", max.qtl=10, penalties=sawfly.cross.stepwise.peryellow.pen, verbose=T, keeplodprofile=T, covar=as.data.frame(sawfly.cross$pheno$Head.Area), scan.pairs=F, keeptrace=T)

The stepwise then ran, and I got to the point where I got the warning message I posted about:

Warning message:
In lastout[[i]] - (max(lastout[[i]]) - dropresult[rn == qn[i], 3]) :
longer object length is not a multiple of shorter object length

I proceeded to examine the output:

sawfly.cross.stepwise.peryellow.stepqtl
QTL object containing genotype probabilities.

      name chr    pos n.gen
Q1 1@106.1   1 106.11     2
Q2 2@180.0   2 179.97     2
Q3 3@181.9   3 181.91     2
Q4 3@181.9   3 181.91     2
Q5 5@142.5   5 142.50     2

Formula: y ~ sawfly.cross$pheno$Head.Area + Q1 + Q2 + Q3 + Q4 + Q5 + Q4:Q5
pLOD: 166.23

In my late night of googling, I did see that the warning can indicate that the dimensions of the arguments do not match, but I do not know how to translate that to my data or output.

Thank you.

On Sat, May 23, 2015 at 3:36 AM, Uwe Ligges lig...@statistik.tu-dortmund.de wrote:

On 23.05.2015 01:07, Claire O'Quin wrote:

Hi There,

I am running a stepwise QTL for a backcross and got the following warning message:

Warning message:
In lastout[[i]] - (max(lastout[[i]]) - dropresult[rn == qn[i], 3]) :
longer object length is not a multiple of shorter object length

So dimensions of the arguments may not match? I cannot discern what this means. When I created my plot, the QTL curve on chromosome 3 is very odd (tried attaching it), so I suspect that the warning is connected to that odd curve plot. I tried running fitqtl just to see what would happen and got an error (Error in solve.default(t(Z) %*% Z, t(Z) %*% X) : system is computationally singular: reciprocal condition number = 1.49755e-24). Any thoughts about what is going on?

No, not without knowing what the arguments and the actual code were.

Best,
Uwe Ligges

Thank you,
Claire

---
Claire O'Quin, PhD
Postdoctoral Research Scholar
University of Kentucky
http://www.linnenlab.com/home.html
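The "incorrect number of dimensions" error reported in this message can be reproduced in base R without the qtl package. The sketch below assumes that the covariate pulled with pull.pheno() for a single column came back as a plain vector; the values are made up. A vector cannot be subset with two indices, while a one-column data frame can, which is presumably why wrapping the covariate in as.data.frame() got past the error.

```r
# Made-up covariate values; one is missing, as in the post.
head_covar <- c(10.2, NA, 11.5, 9.8)
keep <- !is.na(head_covar)

# A bare vector has no dimensions, so two-index subsetting fails.
vec_result <- tryCatch(
  head_covar[keep, , drop = FALSE],
  error = function(e) e
)
inherits(vec_result, "error")

# The same subsetting works on a one-column data frame.
covar_df <- data.frame(Head.Area = head_covar)
df_result <- covar_df[keep, , drop = FALSE]
dim(df_result)
```

Giving the column an explicit name, as in data.frame(Head.Area = ...), also keeps the model formula in the output easier to read than a name built from sawfly.cross$pheno$Head.Area.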
Re: [R] Stepwise rQTL-unknown warning message and odd QTL curve
Thank you for that information. I have found an rQTL help group and will try to see if folks over there can help. I apologize for not doing a very good job of communicating my issues over here. I will try my best to produce a reproducible example and post it here if I don't make any progress on resolving my issues. Thank you everyone for your time.

On Sat, May 23, 2015 at 10:16 AM, John Kane jrkrid...@inbox.com wrote:

http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
and
http://adv-r.had.co.nz/Reproducibility.html

John Kane
Kingston ON Canada

-----Original Message-----
From: lig...@statistik.tu-dortmund.de
Sent: Sat, 23 May 2015 09:36:15 +0200
To: claire.oq...@uky.edu, r-help@r-project.org
Subject: Re: [R] Stepwise rQTL-unknown warning message and odd QTL curve

[...]

--
---
Claire O'Quin, PhD
Postdoctoral Research Scholar
University of Kentucky
http://www.linnenlab.com/home.html
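For anyone following the reproducible-example advice in this thread, a common base-R approach is to post a small stand-in data set as text with dput(), so that others can rebuild exactly the same object. The data frame below is invented purely for illustration; it is not the sawfly data.

```r
# Invented toy data standing in for a restricted or large data set.
toy <- data.frame(
  id        = 1:4,
  genotype  = c("A", "B", "A", "B"),
  phenotype = c(0.5, 1.5, NA, 2.25),
  stringsAsFactors = FALSE
)

# dput() writes R code that recreates the object; paste this into a post.
toy_text <- paste(capture.output(dput(toy)), collapse = "\n")
cat(toy_text, "\n")

# A reader can rebuild the object from that text.
rebuilt <- eval(parse(text = toy_text))
isTRUE(all.equal(toy, rebuilt))
```

Together with the exact calls that trigger the warning, a few rows shared this way are usually enough for others to run the code themselves.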
[R] Post-hoc tests on linear mixed model give mixed results.
Dear all,

I am quite new to R, so apologies if I fail to ask properly. I have done a test comparing bat species richness in five habitats as assessed by three methods. I used a linear mixed model in lme4 and got habitat, method and the interaction between the two as significant, with the random effects explaining little variation. I then ran Tukey's post hoc tests as pairwise comparisons in three ways.

Firstly in lsmeans:

lsmeans(LMM.richness, pairwise~Habitat*Method, adjust="tukey")

Then in agricolae:

tx <- with(diversity, interaction(Method, Habitat))
amod <- aov(Richness ~ tx, data=diversity)
library(agricolae)
interaction <- HSD.test(amod, "tx", group=TRUE)
interaction

Then in glht ('multcomp'):

summary(glht(LMM.richness, linfct=mcp(Habitat="Tukey")))
summary(glht(LMM.richness, linfct=mcp(Method="Tukey")))
tuk <- glht(amod, linfct = mcp(tx = "Tukey"))
summary(tuk) # standard display
tuk.cld <- cld(tuk) # letter-based display
opar <- par(mai=c(1,1,1.5,1))
par(mfrow=c(1,1))
plot(tuk.cld)
par(opar)

I got somewhat different levels of significance from each method, with glht giving me the greatest number of significant results and lsmeans the least. All the results from all packages make sense based on the graphs of the data.

Can anyone tell me if there are underlying reasons why these tests might be more or less conservative, whether in any case I have failed to specify anything correctly, or whether any of these post hoc tests are not suitable for linear mixed models?

Thank you for your time,
Claire
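One contributing factor that can be checked with plain arithmetic is the size of the comparison family: the calls in this post do not all test the same set of pairs. The sketch below is not an answer to the mixed-model question. It only counts comparisons for a 5 x 3 design and uses the built-in warpbreaks data with base-R TukeyHSD() as a stand-in (an assumption, since the bat data are not available). Note also that the aov() fit on tx contains no random effects, unlike the lme4 model.

```r
# Number of pairwise comparisons for 5 habitats x 3 methods.
choose(5 * 3, 2)  # all habitat-by-method cells against each other
choose(5, 2)      # habitats only
choose(3, 2)      # methods only

# Stand-in illustration with a built-in 2 x 3 data set.
fit <- aov(breaks ~ wool * tension, data = warpbreaks)
tk  <- TukeyHSD(fit)

# Rows per term = number of comparisons in that family.
sapply(tk, nrow)
```

A Tukey adjustment over all cell pairs has to cover a much larger family than one over the levels of a single factor, so the same raw difference can come out significant in one analysis and not in another.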
Re: [R] Post-hoc tests on linear mixed model give mixed results.
Thanks Bert,

Will post on the r-sig-mixed-models list. Can't help it being in HTML though, as I sent the query via e-mail.

Cheers,
Claire

Date: Thu, 22 May 2014 09:29:44 -0700
Subject: Re: [R] Post-hoc tests on linear mixed model give mixed results.
From: gunter.ber...@gene.com
To: c.word...@live.com
CC: r-help@r-project.org

Wrong list! This does not concern R programming. Post on the r-sig-mixed-models list instead, in **PLAIN TEXT** rather than HTML.

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge is certainly not wisdom.
H. Gilbert Welch

On Thu, May 22, 2014 at 6:52 AM, Claire c.word...@live.com wrote:

[...]
Re: [R] svyglm error message
Thanks for your reply, Thomas.

Yes, this is NCES data. There are no negative or missing weights.

I am not a programmer, and so I'm afraid I don't understand what you mean by not being able to have blank cells in a data.frame object. What I mean specifically is that in the csv file which I imported into R to create the dataset (using read.csv) there were blank cells for any missing data. This has never given me problems with R in the past using glm or related functions.

Traceback gives the following:

3: glm.fit(XX, YY, weights = wi/sum(wi), start = beta0, offset = offs, family = fam, control = contrl, intercept = incpt)
2: svyglm.svyrep.design(model, design = surveydatastructure, family = quasibinomial())
1: svyglm(model, design = surveydatastructure, family = quasibinomial())

As for the model: I need to run this code on a number of different models. By playing around with this a lot, I have found that I get the error if I include one particular variable (HSCRDANY) in the model. I have checked all the values for that variable, and there are only three: yes, no and empty cells for missing data (or however one correctly phrases that for a dataframe in R). Another variable, HSGPA, which has empty cells for all the same individuals, and which is also a categorical variable, does not have this problem.

So, for example, this model works fine:

DISTEDUC~1+RACE+GENDER+RISKINDX+GPA+REMEVER+HSGPA+PELLAMT+FEDBEND+CAGI+PAREDUC+PRIMLANG+CITIZEN2

But this model returns the error message listed above:

DISTEDUC~1+RACE+GENDER+RISKINDX+GPA+REMEVER+HSGPA+HSCRDANY+PELLAMT+FEDBEND+CAGI+PAREDUC+PRIMLANG+CITIZEN2

I don't understand what it is about the specific variable HSCRDANY that would prompt this error message. I'm not sure what else to look for to figure out what the issue with this particular variable may be.

Thanks again for your time!

On Tue, Feb 11, 2014 at 1:05 PM, Thomas Lumley tlum...@uw.edu wrote:

This is some sort of NCES data, right?

I can't see any way to get that particular error (which happens inside glm.fit()) for a logistic model. Are there any negative or missing weights? What do you mean 'represented by blank cells' -- you can't have blank cells in a data.frame object? What does traceback() give after the error? What is the model?

-thomas

On Mon, Feb 10, 2014 at 4:10 PM, Claire Wladis cwla...@gmail.com wrote:

[...]

--
Thomas Lumley
Professor of Biostatistics
University of Auckland
Re: [R] svyglm error message
Yes, when I say that the cells are blank in the data frames I do mean that the contents of the cells are blank characters. I have put in a lot of time trying to understand R, but I have no formal programming background, so I do not necessarily always know the correct terminology for something, and this can be hard to look up in reverse (i.e. if someone uses a term I don't know, I can look it up, but I find it hard to figure out what something is called). Thank you for helping me to understand how to describe this particular concept using the correct terminology.

On Tue, Feb 11, 2014 at 5:11 PM, Bert Gunter gunter.ber...@gene.com wrote:

Disclaimer: I have not followed this thread and claim no statistical expertise. I just wanted to point out a couple of misconceptions that may be relevant. Inline below.

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge is certainly not wisdom.
H. Gilbert Welch

On Tue, Feb 11, 2014 at 1:56 PM, Claire Wladis cwla...@gmail.com wrote:

[...] I am not a programmer and so I'm afraid I don't understand what you mean by not being able to have blank cells in a data.frame object

(In my opinion) This claim does not absolve you of the responsibility of learning how to properly use R. If you do not wish to put in the requisite effort, then you should not use R. Find something else.

[...] I have checked all the values for that variable, and there are only three: yes, no and empty cells for missing data (or however one correctly phrases that for a dataframe in R).

There is no such thing as empty cells -- R is **not** Excel (thank heaven!). **Blank** values are **not** missing in character vectors: they are blank characters (if that is, in fact, what your data input did -- I'm never sure with .csv files). An Introduction to R or, if you prefer, various good R web tutorials explain this. If you do not care to put in the effort to learn about it, as I said above, you probably shouldn't be using R.

[...]
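The distinction drawn in this thread between blank strings and missing values can be seen directly with read.csv(). The sketch below uses a tiny made-up CSV (the column names only echo the ones mentioned in the thread) and shows that a blank cell becomes NA in a numeric column but stays an empty string in a character column unless na.strings says otherwise. Whether this is what triggers the svyglm error is an assumption, not something the thread confirms.

```r
# Made-up CSV text with one blank cell in each data column.
csv_text <- "id,HSCRDANY,GPA\n1,yes,3.1\n2,,2.7\n3,no,\n"

# Default reading: blanks in character columns stay as "".
d1 <- read.csv(text = csv_text, stringsAsFactors = FALSE)
table(d1$HSCRDANY, useNA = "ifany")
is.na(d1$GPA)

# Treat empty strings as missing when reading.
d2 <- read.csv(text = csv_text, na.strings = c("", "NA"),
               stringsAsFactors = FALSE)
table(d2$HSCRDANY, useNA = "ifany")
```

If such a column is read as a factor, the blank becomes a level of its own rather than a missing value, so tabulating each categorical predictor with useNA = "ifany" is a quick way to see what the model is actually being given.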
[R] svyglm error message
Hello,

I am using the survey package for the first time to analyze a dataset that has both weights and 200 BRR replication weights. When I try to run svyglm on the output from svrepdesign, I get an error message that I do not know how to interpret, and an extended period of time searching for this error on the web hasn't returned any results that seem relevant to my situation. I have no idea how to proceed with my analysis at this point, so I am hoping that someone with more experience with this package and with R in general would be willing to help me figure out what the problem is.

Here is my code:

surveydatastructure <- svrepdesign(repweights=dataset[, 29:228], data=dataset, weights=dataset$WTA000)
modeloutput <- svyglm(model, design=surveydatastructure, family=quasibinomial())

The model is defined in an earlier line of code, but for the sake of readability here, I have not included it. The dataset has a binary dependent variable and a combination of categorical and continuous variables as independent variables. There is missing data in the dataset, represented by blank cells in the data frame. The data itself is restricted, but I can describe any part of it as necessary.

Here is the error message that R returns when I enter the svyglm line of code:

Error in if (!(validmu(mu) && valideta(eta))) stop("cannot find valid starting values: please specify some", :
missing value where TRUE/FALSE needed

Thanks for reading my post, and thanks in advance for any help!

Sincerely,
Claire
[R] R and Windows 8
Hello: I'd like to know if R will run under Windows 8? Thank you, CJO
[R] R function data variable name argument
Hello fellow R-ers, I have spent some time on this and it is driving me NUTS! I am sure there is a solution, so please help. I am trying to create a function that will plot different lines for subsets of a dataset. For example, I am trying to look at different drug groups (drug2), let's say 1, 2, 3, 4, and 5. The data has 2 different rates, college students and high school students (var names cs and hs). Here is my code:
  stuff <- function(druglist, rate, yaxislabel) {
    drugs <- read.csv("drugs.csv", header=T)
    druglist.data <- drugs[which(drugs$drug2 %in% druglist),]
    par(xpd=NA, oma=c(3,0,3,16), usr=c(1,40,0,0.5))
    #print(rate)
    plot(druglist.data$counter, druglist.data$rate, type="n", xlab="Quarter", ylab=yaxislabel, xaxt='n', yaxt='n', main="", ylim=c(0,.5))
    for (i in druglist) {
      line <- druglist.data[which(druglist.data$drug2==i),]
      lines(line$counter, line$rate, type='l')
      print(line$drug2)
      print(line$counter)
      print(line$rate)
    }
  }
  stuff(c(1, 2, 3, 4, 5), 'cs')
  stuff(c(1, 2, 3, 4, 5), 'hs')
I have played with attach and detach so I don't have to use the $, but that did not work. I also read several posts suggesting substitute(), but I don't know where to use it. The code works until the for loop, but at this point the line$rate outputs as 'NULL'. Please help. Thanks! Claire, MS Statistical Research Specialist
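The NULL described in the post is what the $ operator returns when asked for a column literally named "rate"; it does not evaluate the rate argument. Indexing with [[ ]] (or with [rows, name]) does use the string held in the variable. A minimal sketch, with a toy data frame standing in for drugs.csv (the values are made up) and a hypothetical helper rate_for_drug:

```r
# Toy data standing in for drugs.csv (values are made up).
drugs <- data.frame(
  drug2   = c(1, 1, 2, 2),
  counter = c(1, 2, 1, 2),
  cs      = c(0.10, 0.12, 0.20, 0.22),
  hs      = c(0.05, 0.06, 0.15, 0.16)
)

rate <- "cs"

# `$` looks for a column literally named "rate", so this is NULL.
by_dollar <- drugs$rate

# `[[` evaluates `rate` and returns the column named "cs".
by_bracket <- drugs[[rate]]

# Hypothetical helper: the chosen rate column for one drug group.
rate_for_drug <- function(data, drug, rate) {
  data[data$drug2 == drug, rate]
}
rate_for_drug(drugs, 2, "hs")
```

Inside the posted function, the same idea would mean writing druglist.data[[rate]] and line[[rate]] wherever druglist.data$rate and line$rate appear.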
[R] Generalized Hyperbolic distribution
How can I use the GeneralizedHyperbolic package to estimate the four parameters of the NIG distribution? I have a data set of stock returns that I want to fit the parameters to.
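As far as I know, the GeneralizedHyperbolic package has its own fitting routine for this (nigFit); its help page is the place to check the exact arguments. To show what such a fit does, here is a base-R maximum likelihood sketch for the four NIG parameters (mu, delta, alpha, beta). It is not the package's implementation: the function names dnig_log and nig_nll are made up, and the "returns" are simulated placeholders, not real stock data.

```r
# Log-density of NIG(mu, delta, alpha, beta), with delta > 0 and alpha > |beta|.
dnig_log <- function(x, mu, delta, alpha, beta) {
  q   <- sqrt(delta^2 + (x - mu)^2)
  gam <- sqrt(alpha^2 - beta^2)
  # besselK(z, 1, expon.scaled = TRUE) is exp(z) * K_1(z); undo the scaling in logs.
  log(alpha * delta / pi) - log(q) +
    log(besselK(alpha * q, nu = 1, expon.scaled = TRUE)) - alpha * q +
    delta * gam + beta * (x - mu)
}

# Negative log-likelihood on an unconstrained scale:
# p = (mu, log(delta), log(alpha - |beta|), beta)
nig_nll <- function(p, x) {
  beta  <- p[4]
  alpha <- abs(beta) + exp(p[3])
  -sum(dnig_log(x, mu = p[1], delta = exp(p[2]), alpha = alpha, beta = beta))
}

set.seed(1)
returns <- 0.01 * rt(500, df = 5)   # placeholder for real stock returns

# Start from a symmetric NIG whose variance (delta / alpha) matches the sample.
start <- c(mean(returns), log(sd(returns)), log(1 / sd(returns)), 0)
fit   <- optim(start, nig_nll, x = returns, control = list(maxit = 5000))

est <- c(mu    = fit$par[1],
         delta = exp(fit$par[2]),
         alpha = abs(fit$par[4]) + exp(fit$par[3]),
         beta  = fit$par[4])
print(est)
```

The reparameterisation keeps the optimiser inside the valid region without needing box constraints; fit$convergence should still be inspected before trusting the estimates.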
[R] Key combination that removes all R objects
Dear readers, There is a combination of keys that I have (on several occasions now) typed by accident into R (2.10.0) which removes all the objects in the environment, and clears the console, as though I had typed rm(list=ls()). Unfortunately I don't know what the combination of keys is, so I am struggling to find out more about this behaviour on my own and I was hoping that someone has come across it before. Does anyone i) know what combination of keys this is (so that I am more cautious around them in future) or better yet, ii) know how to disable this shortcut? Thanks for your time, Claire
[R] Tukey HSD
Hello, I am a little out of date and am still using S-Plus instead of R. I haven't been able to find the right place to ask this question, so I thought I would ask it here and hope that someone can help. I am unable to locate the TukeyHSD() function in S-Plus. I have been able to use the GUI to run Tukey's test, but the results don't provide a p-value. Does anyone know how I can go about obtaining or calculating a p-value for Tukey's test using S-Plus? Thank you!
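For reference, the p-value that Tukey's test reports is the upper tail of the studentized range distribution, evaluated at the scaled difference between two group means. The sketch below does the calculation by hand in R on the built-in PlantGrowth data, which is only an illustrative example. Whether S-Plus offers an equivalent of ptukey() is an assumption that would need checking in its documentation.

```r
# One-way layout from a built-in data set (three groups of ten).
fit <- aov(weight ~ group, data = PlantGrowth)

means <- tapply(PlantGrowth$weight, PlantGrowth$group, mean)
n     <- tapply(PlantGrowth$weight, PlantGrowth$group, length)
k     <- length(means)          # number of groups
df_e  <- df.residual(fit)       # error degrees of freedom
mse   <- deviance(fit) / df_e   # error mean square

# Studentized range statistic for the pair (trt2, trt1).
diff_21 <- means[["trt2"]] - means[["trt1"]]
se      <- sqrt(mse / 2 * (1 / n[["trt2"]] + 1 / n[["trt1"]]))
q_stat  <- abs(diff_21) / se

# Upper-tail probability of the studentized range distribution.
p_manual <- ptukey(q_stat, nmeans = k, df = df_e, lower.tail = FALSE)
p_manual
```

The same three ingredients (group means, group sizes, and the error mean square with its degrees of freedom) are usually available from any ANOVA table, so the statistic can be assembled even when only GUI output is at hand.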
Re: [R] glmpath in R
Steve Lianoglou mailinglist.honeypot at gmail.com writes: Hi Claire, I'm replying and CC-ing to the R-help list to get more eyes on your question since others will likely have more/better advice, and perhaps someone else in the future will have a similar question, and might find this thread handy. I've removed your specific research aim since that might be private information, but you can include that later if others find it necessary to know in order to help.
On Apr 5, 2010, at 5:44 PM, Claire Wooton wrote: Dear Steve, I came across your posting on the R-help mailing list concerning finding the best lambda in a LASSO-model, and I was wondering whether you would be able to offer any advice based on your experience. I'm attempting to build a logistic regression model to explore [REDACTED] and recently decided to build a LASSO-model, having learned of the problems with stepwise variable selection. While I've done a fair amount of reading on the topic, I'm still a bit uncertain when it comes to selecting an appropriate value for lambda when using the glmpath package. Any advice you could offer would be much appreciated.
In general, what I've done is to use cross validation to find this "best" value for lambda, which I'm defining as the value of lambda that gives me the model with the lowest "objective score" on my testing data. The "objective score" is in quotes, because it can change given the problem. For instance, for normal regression, the best objective score could be the lowest mean squared error (or highest Spearman rank) on my held out examples. In your case, for logistic regression, this could just be accuracy of the class labels. So, I do the CV and get 1 value of lambda for each fold in the CV that returns the model that has the best generalization properties on held out data. After doing the 10 fold CV (once, or many times), you could take the avg.
value for lambda and use that for your "downstream analysis" by building a model on all of your data with that value of lambda. I'd also do some smoke tests to see how sensitive your model is w.r.t. the data it is given to train on. Do your best lambdas over each fold vary a lot? How different is the model between folds -- are the same predictor vars non-zero? What's their variance? Etc. Also, what's your objective in building the model? Do you just want something with high predictive accuracy? Are you trying to draw conclusions on the model that you build -- like infer meaning from its coefs? This should probably go in the beginning of the email, but it's better late than never: I should add the disclaimer that I'm not a real statistician, and I'm calling uncle in advance to the card carrying statisticians on this list that might argue that (i) this approach isn't principled enough, (ii) you shouldn't really take any statistical advice on a mailing list; and (iii) you'd be best off consulting a local statistician. Does that answer your question? If not, could you elaborate more about what you're after? Please don't forget to CC the R-help list on any further communication. Thanks, -steve -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact
Hi Steve, Thanks very much for your reply. My main objective in building the model is to determine the relative strength of the variables in predicting my presence/absence data. It's really an exploratory method, I'm interested in whether the associations that have been observed out in the field come out in the model. I'm also using rpart to build a classification tree to get a sense of any interactions. I was planning to use cross-validation to identify a value of lambda that gives minimum mean cv error and the largest value of lambda such that error is within 1 SE of the minimum.
I'm not entirely sure how to proceed in building the full model using this value of lambda. At this point do I simply use predict.glmpath (or predict.glmnet) setting the value of s to lambda and return the coefficients? I plan to validate the chosen coefficient estimates through a bootstrap analysis. Beyond conducting this smoke test, I'm wondering how I should assess the resulting model. Can I assess the fit and predictive accuracy of a glmnet object? Thanks again for your help. I am also planning on discussing my work with a professor in statistics. I appreciate the insight though as I attempt to wrap my head around these methods. Cheers, Claire
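The workflow discussed in this thread (cross-validate over lambda, pick either the minimum-error lambda or the largest lambda within 1 SE of it, then read off coefficients from the full-data fit) can be sketched with glmnet, which is one of the two packages mentioned above. This assumes glmnet is installed; the data are simulated placeholders, not the presence/absence data from the thread, and glmpath's own interface is not shown.

```r
library(glmnet)   # assumes the glmnet package is installed

set.seed(42)
n <- 200; p <- 10
x <- matrix(rnorm(n * p), n, p)               # simulated predictors
y <- rbinom(n, 1, plogis(x[, 1] - x[, 2]))    # simulated presence/absence

# 10-fold CV over the lambda path, scored by binomial deviance.
cvfit <- cv.glmnet(x, y, family = "binomial", nfolds = 10,
                   type.measure = "deviance")

cvfit$lambda.min   # lambda with the minimum mean CV error
cvfit$lambda.1se   # largest lambda with error within 1 SE of that minimum

# Coefficients of the full-data fit at the chosen lambda.
coefs_1se <- coef(cvfit, s = "lambda.1se")

# Predicted probabilities at the same lambda.
probs <- predict(cvfit, newx = x, s = "lambda.1se", type = "response")
```

cv.glmnet keeps the fit on all the data in cvfit$glmnet.fit, so no separate refit is needed to extract coefficients at the selected lambda. Predictions on the training rows, as here, will flatter the model; held-out data or the CV error curve gives a fairer view of predictive accuracy.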
[R] Changed results in analyses run in sem and nlme ??
I'm uncertain how helpful it will be to give example code, but last week, this model gave an error message to the tune of "failed to converge" after about 5 minutes of run-time:
  library(nlme)
  model.A <- lme(fixed = avbranch ~ wk*trt*pop, random = ~wk|ID/fam/pop, data=branch)
It seemed that failure to converge made sense, since there were many weeks (wk) and values for ID/fam/pop. I settled for this model:
  model.A2 <- lme(fixed = avbranch ~ wk*trt*pop, random = ~1|ID/fam/pop, data=branch)
However, when I tried model.A on a different dependent variable this week, it converged. Since the challenge to convergence (many levels of wk and ID/fam/pop) was the same as it had been before, I went back and tried model.A on the other analysis, and it also ran. I then started checking results for everything I'd done in the past three weeks in packages that use ML methods (FIML, REML) and got different outcomes. I've quadruple-checked to be sure I'm using the same code and the same data (I use .csv files for simplicity), and see no differences. However, results from the nlme and sem packages are both different. I had not saved detailed output, but had recorded parameters, model-fit statistics, and convergence failures. Could some new package I installed have changed the way that MLE methods are functioning in the work environment? Everything in the search path looks as it did before, but could something like Rcmdr (just installed, but not now in the search path) change other parts of the environment? Is that possible? If so, how do I check? If not, could anything (besides user error: obviously the simplest explanations are that the code or the data really is different or I recorded the results incorrectly before) account for different results using the same data and code? My main concern is about which results are correct: convergence seems to be happening much faster now, and some models are showing better fits, but the difference makes me nervous. Any ideas would be helpful.
Thanks, Claire Lay
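On the "how do I check?" part of the question, one option is to store a snapshot of the session alongside each set of results, so that two runs can be compared later. The sketch below uses base R only; the helper name snapshot_session and the choice of what to record are assumptions, and it makes no claim about what any particular package changes. Global options such as contrasts are included because they affect how factors are coded in model formulas.

```r
# Hypothetical helper: capture settings that could differ between sessions.
snapshot_session <- function() {
  ns <- sort(loadedNamespaces())
  list(
    r_version  = R.version.string,
    search     = search(),
    namespaces = vapply(ns, function(p) as.character(packageVersion(p)),
                        character(1)),
    contrasts  = getOption("contrasts"),
    digits     = getOption("digits")
  )
}

snap <- snapshot_session()
snap_file <- file.path(tempdir(), "session_snapshot.rds")
saveRDS(snap, snap_file)

# In a later session: compare the stored snapshot with the current one.
old <- readRDS(snap_file)
all.equal(old, snapshot_session())
```

Saving the fitted model objects themselves (again with saveRDS) next to the snapshot makes it possible to compare old and new estimates directly instead of relying on hand-recorded numbers.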
[R] arcsine transformation
I have been trying to perform both a Bartlett's test and an arcsine transformation on some average percentage data. I've tried inputting it different ways and I keep getting the same error message:
  head(workingdata)
     DYAD   BEFORE     AFTER
  1 BG-FL 4.606772  5.787520
  2 BG-LL 5.467503  7.847395
  3 AD-MV 5.333735 11.107380
  4 MM-FL 5.578708 12.063500
  5 MM-MV 2.037605  6.415303
  6 MM-RM 6.158885 11.911080
  bartlett.test(BEFORE ~ AFTER)
  Error in bartlett.test.default(c(4.606772, 5.467503, 5.333735, 5.578708, : there must be at least 2 observations in each group
  asin(BEFORE)
  [1] NaN NaN NaN NaN NaN NaN NaN
  Warning message: In asin(BEFORE) : NaNs produced
I'm at a loss here and I would greatly appreciate any guidance that could be given me. Thank you! -- Claire Sheller Department of Anthropology Tulane University New Orleans, LA 70118 615-210-9129
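Both messages in the post have a mechanical explanation. asin() is only defined on [-1, 1], so values such as 4.6 give NaN. And in the formula BEFORE ~ AFTER the right-hand side is treated as a grouping variable, so every distinct AFTER value becomes a group with a single observation. A sketch of one way around both, assuming the values are percentages on a 0 to 100 scale (the post does not say so explicitly) and using only the six rows shown by head(); the helper arcsine_sqrt is illustrative:

```r
# The six rows shown by head(workingdata) in the post.
workingdata <- data.frame(
  DYAD   = c("BG-FL", "BG-LL", "AD-MV", "MM-FL", "MM-MV", "MM-RM"),
  BEFORE = c(4.606772, 5.467503, 5.333735, 5.578708, 2.037605, 6.158885),
  AFTER  = c(5.787520, 7.847395, 11.107380, 12.063500, 6.415303, 11.911080)
)

# Assumption: values are percentages (0-100). Convert to proportions, then
# apply the arcsine square-root transformation.
arcsine_sqrt <- function(pct) asin(sqrt(pct / 100))
workingdata$BEFORE_t <- arcsine_sqrt(workingdata$BEFORE)
workingdata$AFTER_t  <- arcsine_sqrt(workingdata$AFTER)

# Bartlett's test compares variances across groups: pass the two samples
# as a list instead of using one column as the grouping variable.
bt <- bartlett.test(list(workingdata$BEFORE_t, workingdata$AFTER_t))
bt$p.value
```

Because BEFORE and AFTER are measured on the same dyads, the two samples are paired; whether a test that assumes independent groups suits the design is a separate question worth raising with a statistician.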