Re: [R] difftime producing NA values in R 2.12.2
This is a daylight saving time issue. Use chron, or set your TZ environment variable to a standard-time-only timezone (or don't enter nonexistent time values for the timezone in which you wish to compute).

Jeff Newmiller

Adrienne Wootten amwoo...@ncsu.edu wrote:

> R-listers,
>
> I have noticed several posts on issues with difftime producing NAs, but they have been for older versions of R. Here's the issue with difftime that I am dealing with in R 2.12.2.
>
>     preciptime = strptime("01/10/2007 14:00", format="%m/%d/%Y %H:%M")
>     class(preciptime)
>     [1] "POSIXlt" "POSIXt"
>
>     # Now using difftime, this is what happens
>     difftime(strptime("03/11/2007 01:00", format="%m/%d/%Y %H:%M"), preciptime, units="hours")
>     Time difference of 1427 hours
>     difftime(strptime("03/11/2007 02:00", format="%m/%d/%Y %H:%M"), preciptime, units="hours")
>     Time difference of NA hours
>     difftime(strptime("03/11/2007 03:00", format="%m/%d/%Y %H:%M"), preciptime, units="hours")
>     Time difference of 1428 hours
>
> This doesn't make sense to me, since both times used in difftime are in the same format after using strptime, but the differences are coming out wrong. It should be 1427, 1428, and 1429, so I'm confused as to how to fix this. The idea of the program is to compute the time in hours since the last rainfall, so everything gets thrown off when this produces NAs.
>
> For reference, the operating system is Windows 7 Enterprise and R is version 2.12.2 (64-bit). Any guidance is appreciated. Thanks in advance!
>
> A
>
> --
> Adrienne Wootten
> Graduate Research Assistant
> State Climate Office of North Carolina
> Department of Marine, Earth and Atmospheric Sciences
> North Carolina State University

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
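The advice in this thread can be illustrated with a short sketch. It assumes the timestamps may be treated as standard time and therefore parses them in UTC, which has no daylight saving transitions; the variable names `fmt`, `later` and `hours_since` are illustrative and not from the original post.

```r
# Parse in UTC so that 02:00 on 2007-03-11, which does not exist in US
# timezones that observe daylight saving time, is a valid instant.
fmt <- "%m/%d/%Y %H:%M"
preciptime <- as.POSIXct("01/10/2007 14:00", format = fmt, tz = "UTC")
later <- as.POSIXct(c("03/11/2007 01:00",
                      "03/11/2007 02:00",
                      "03/11/2007 03:00"),
                    format = fmt, tz = "UTC")

# Hours elapsed since the last rainfall, as plain numbers
hours_since <- as.numeric(difftime(later, preciptime, units = "hours"))
print(hours_since)
```

Whether UTC is appropriate depends on how the observations were recorded; if they are genuine local wall-clock times, the 02:00 record itself may need checking.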
Re: [R] Example(chron) doesn't work
Hi there,

I have a similar problem. The chron example gives NA: dates() doesn't work but times() does. I would appreciate it if there's a fix for it.

Thanks,
Helena

    > example(chron)

    chron> dts <- dates(c("02/27/92", "02/27/92", "01/14/92",
    chron+               "02/28/92", "02/01/92"))

    chron> dts
    [1] NA NA NA NA NA

    chron> # [1] 02/27/92 02/27/92 01/14/92 02/28/92 02/01/92

    chron> tms <- times(c("23:03:20", "22:29:56", "01:03:30",
    chron+               "18:21:03", "16:56:26"))

    chron> tms
    [1] 23:03:20 22:29:56 01:03:30 18:21:03 16:56:26

    chron> # [1] 23:03:20 22:29:56 01:03:30 18:21:03 16:56:26

    chron> x <- chron(dates = dts, times = tms)

    chron> x
    [1] (NA NA) (NA NA) (NA NA) (NA NA) (NA NA)

    chron> # [1] (02/27/92 23:03:19) (02/27/92 22:29:56) (01/14/92 01:03:30)
    chron> # [4] (02/28/92 18:21:03) (02/01/92 16:56:26)

    chron> # We can add or subtract scalars (representing days) to dates or
    chron> # chron objects:
    chron> c(dts[1], dts[1] + 10)
    Error in y + ifelse(m > 2, 0, -1) :
      non-numeric argument to binary operator
    In addition: Warning message:
    In matrix(unlist(lapply(dots, origin)), nrow = 3) :
      data length [2] is not a sub-multiple or multiple of the number of rows [3]

    > packageDescription("chron")$Version
    [1] "2.3-42"
    > R.version.string
    [1] "R version 2.13.1 (2011-07-08)"
    > win.version()
    [1] "Windows 7 x64 (build 7600)"
Re: [R] Library chron
if else worked. Many thanks.
[R] Help with a scatter plot
Hi everyone,

I have some market research data which I want to arrange in one plot for easy viewing. The data looks something like:

    Product Color StoreA StoreB StoreC StoreD Price
    ProdA   R     NA     4.33   2      4.33   35
            G     NA     4.33   2      4.33   35
            B     NA     4.33   2      3.76   58
            Y     NA     3.72   3      5.33   23
    ProdB   B     5.44   NA     4.22   3.76   87
    ProdC   G     4.77   3.22   4.77   2.10   65
            B     ...    ...    ...    ...    ..

And so on...

I want to create a plot where the colors of the points represent the Product (A, B, C, ...), the plotting character represents the Color (X for yellow, box for green, etc.), the X axis is the price and the Y axis is the number (0-5) from the different Stores (A, B, C, D).

I've thought of either creating a matrix of 4 plots (one for each of the 4 stores) or combining them into one plot in some creative way.

Please help me or point me in the right direction as to which functions to look into. I've been playing around with ggplot for a few days, but can't seem to wrap my head around it yet...

Thanks
Re: [R] Building package/DESCRIPTION file not existing?
As a first step, try to get rid of the warning by doing what it says: "CYGWIN environment variable option "nodosfilewarning" turns off this warning." So set (at least):

    CYGWIN=nodosfilewarning

and go ahead.

Uwe Ligges

On 25.10.2011 02:10, Francois Rousseu wrote:

> Hello useRs
>
> I am trying to build a package for personal use and to make working with other people easier, but I keep getting the same error message about the DESCRIPTION file not existing.
>
> When trying to install from a source tar.gz file:
>
>     Error in .read_description(dfile) :
>       file 'C:/Users/Propriétaire/AppData/Local/Temp/RtmpHFMONb/R.INSTALL647a3535/mypkg/DESCRIPTION' does not exist
>
> When trying to build a binary version:
>
>     Error in .read_description(dfile) :
>       file 'C:/Users/Propriétaire/Documents/RETROBIRD/mypkg/DESCRIPTION' does not exist
>
> In this last case, the DESCRIPTION file is certainly there! Also, the help and DESCRIPTION files are edited, and my path variable seems to be set correctly, as I can access R and tex (from MiKTeX 2.9) from the console. I feel it might be related to language issues (Windows on my system is in French, see sessionInfo() at the bottom of the message) or something about temporary directories, but I really can't find the problem. I've looked into the cygwin warning, but it didn't seem to be the problem, though I may be wrong.
>
> Any hints? Below is the complete sequence with errors.
>
> Thanks,
> Francois Rousseu
>
>     > setwd("C:/Users/Propriétaire/Documents/RETROBIRD/")
>     > library(devtools)
>     > f <- function(x,y) x+y
>     > d <- data.frame(a=1, b=2)
>     > package.skeleton(list=c("f","d"), name="mypkg")  ## editing of help and description files
>     Creating directories ...
>     Creating DESCRIPTION ...
>     Creating Read-and-delete-me ...
>     Saving functions and data ...
>     Making help files ...
>     Done.
>     Further steps are described in './mypkg/Read-and-delete-me'.
>
>     > build("C:/Users/Propriétaire/Documents/RETROBIRD/mypkg")
>     * checking for file 'C:\Users\Propriétaire\Documents\RETROBIRD\mypkg/DESCRIPTION' ... OK
>     * preparing 'mypkg':
>     * checking DESCRIPTION meta-information ... OK
>     * checking for LF line-endings in source and make files
>     * checking for empty or unneeded directories
>     * looking to see if a 'data/datalist' file should be added
>     * building 'mypkg_1.0.tar.gz'
>     cygwin warning:
>       MS-DOS style path detected: C:/Users/Propri\xC3\xA9taire/Documents/RETROBIRD/mypkg_1.0.tar.gz
>       Preferred POSIX equivalent is: /cygdrive/c/Users/Propri\xC3\xA9taire/Documents/RETROBIRD/mypkg_1.0.tar.gz
>       CYGWIN environment variable option "nodosfilewarning" turns off this warning.
>       Consult the user's guide for more details about POSIX paths:
>         http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
>     [1] "C:/Users/Propriétaire/Documents/RETROBIRD/mypkg_1.0.tar.gz"
>
>     > install.packages(pkgs="mypkg_1.0.tar.gz", lib="C:/Users/Propriétaire/Documents/R/win-library/2.13", repos=NULL, type="source")
>     * installing *source* package 'mypkg' ...
>     Error in .read_description(dfile) :
>       file 'C:/Users/Propriétaire/AppData/Local/Temp/RtmpHFMONb/R.INSTALL647a3535/mypkg/DESCRIPTION' does not exist
>     ERROR: installing package DESCRIPTION failed for package 'mypkg'
>     * removing 'C:/Users/Propriétaire/Documents/R/win-library/2.13/mypkg'
>     Warning messages:
>     1: running command 'C:/PROGRA~1/R/R-213~1.0/bin/x64/R CMD INSTALL -l "C:/Users/Propriétaire/Documents/R/win-library/2.13" "mypkg_1.0.tar.gz"' had status 1
>     2: In install.packages(pkgs = "mypkg_1.0.tar.gz", lib = "C:/Users/Propriétaire/Documents/R/win-library/2.13", :
>       installation of package 'mypkg_1.0.tar.gz' had non-zero exit status
>
>     > build("C:/Users/Propriétaire/Documents/RETROBIRD/mypkg", binary=T)
>     * installing to library 'C:/Users/Propriétaire/Documents/R/win-library/2.13'
>     * installing *source* package 'mypkg' ...
>     Error in .read_description(dfile) :
>       file 'C:/Users/Propriétaire/Documents/RETROBIRD/mypkg/DESCRIPTION' does not exist
>     ERROR: installing package DESCRIPTION failed for package 'mypkg'
>     * removing 'C:/Users/Propriétaire/Documents/R/win-library/2.13/mypkg'
>     Error: Command failed (1)
>     In addition: Warning message:
>     running command '"C:/PROGRA~1/R/R-213~1.0/bin/x64/R" CMD INSTALL C:\Users\Propriétaire\Documents\RETROBIRD\mypkg --build' had status 1
>
>     > sessionInfo()
>     R version 2.13.0 (2011-04-13)
>     Platform: x86_64-pc-mingw32/x64 (64-bit)
>
>     locale:
>     [1] LC_COLLATE=French_Canada.1252  LC_CTYPE=French_Canada.1252
>     [3] LC_MONETARY=French_Canada.1252 LC_NUMERIC=C
>     [5] LC_TIME=French_Canada.1252
>
>     attached base packages:
>     [1] stats graphics grDevices utils datasets methods base
>
>     other attached packages:
>     [1] roxygen2_2.1 digest_0.5.1 devtools_0.4
>
>     loaded via a namespace (and not attached):
>     [1] brew_1.0-6 plyr_1.6 RCurl_1.6-10.1 stringr_0.5 tools_2.13.0
Re: [R] Reading in and modifying multiple datasets in a loop
On 24.10.2011 23:10, Debs Majumdar wrote:

> Thanks Uwe. This works perfectly.
>
>     ###
>     owd <- setwd(pth)
>     fls <- list.files(pattern="^chr")
>     ufls <- unique(sapply(strsplit(fls, "_"), "[", 1))
>     for(i in ufls){
>         of <- strsplit(i, "\\.")[[1]]
>         of <- paste(of[1], tail(of, 1), sep=".")
>         impute2databel(genofile = i,
>                        samplefile = paste(i, "info", sep="_"),
>                        outfile = of, makeprob=TRUE, old=FALSE)
>     }
>     setwd(owd)
>     ###
>
> I have a question regarding how strsplit works. When my files are the following:
>
>     chr1.one.phased.impute2.chunk1
>     chr1.one.phased.impute2.chunk1_info
>     chr1.one.phased.impute2.chunk1_info_by_sample
>     chr1.one.phased.impute2.chunk1_summary
>     chr1.one.phased.impute2.chunk1_warnings
>
>     ufls <- unique(sapply(strsplit(fls, "_"), "[", 1))
>
> this works like a charm.
>
> I have another dataset where the files are
>
>     study1_chr1.one.phased.impute2.chunk1
>     study1_chr1.one.phased.impute2.chunk1_info
>     study1_chr1.one.phased.impute2.chunk1_info_by_sample
>     study1_chr1.one.phased.impute2.chunk1_summary
>     study1_chr1.one.phased.impute2.chunk1_warnings
>
> ... and so on, and I wanted to run the same loop, but I was unable to change strsplit so that it works when the files are named as above. I tried
>
>     ufls <- unique(sapply(strsplit(fls, "_"), "[", 2))

    unique(gsub("(_.*)_.*", "\\1", x))

Should do if there is a first underscore.

Uwe Ligges

> but this knocks off "study1" (modified code below). What modification do I need to make this run?
>
>     fls <- list.files(pattern="study1_chr")
>     ufls <- unique(sapply(strsplit(fls, "_"), "[", 2))
>     library(GenABEL)
>     for(i in ufls){
>         of <- strsplit(i, "\\.")[[1]]
>         of <- paste(of[1], tail(of, 1), sep=".")
>         impute2databel(genofile = i,
>                        samplefile = paste(i, "info", sep="_"),
>                        outfile = of, makeprob=TRUE, old=FALSE)
>     }
>     #
>
> Thanks,
> Debs
>
> ----- Original Message -----
> From: Debs Majumdar debs_st...@yahoo.com
> To: r-help@r-project.org
> Sent: Friday, October 21, 2011 2:32 PM
> Subject: Reading in and modifying multiple datasets in a loop
>
> Hi,
>
> I have been given a set of around 300 files where there are 5 files corresponding to each chunk. E.g. chunk 1 for chr1 contains these 5 files:
>
>     chr1.one.phased.impute2.chunk1
>     chr1.one.phased.impute2.chunk1_info
>     chr1.one.phased.impute2.chunk1_info_by_sample
>     chr1.one.phased.impute2.chunk1_summary
>     chr1.one.phased.impute2.chunk1_warnings
>
> For chr1 there are 47 chunks, chr2 has 42 chunks, ... and it ends at chr22 with 23 chunks.
>
> I am using the DatABEL package to convert them to databel format using the following command:
>
>     impute2databel(genofile="chr1.one.phased.impute2.chunk1",
>                    samplefile="chr1.one.phased.impute2.chunk1_info",
>                    outfile="chr1.chunk1", makeprob=TRUE, old=FALSE)
>
> which uses two files per chunk.
>
> Is there a way I can automate this so that the code goes through each chunk of each chromosome and does the conversion to databel format?
>
> Thanks,
> -Debs
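As a hedged sketch of the file-name handling discussed in this thread: the regular expression below is one possible way to keep everything up to (but not including) the second underscore, so that the "study1_" prefix survives. It is not the pattern posted in the thread; the file names are written out as a vector instead of calling list.files(), and the names `ufls` and `of` simply mirror the loop above.

```r
# Example file names following the pattern described in the post
fls <- c("study1_chr1.one.phased.impute2.chunk1",
         "study1_chr1.one.phased.impute2.chunk1_info",
         "study1_chr1.one.phased.impute2.chunk1_info_by_sample",
         "study1_chr1.one.phased.impute2.chunk1_summary",
         "study1_chr1.one.phased.impute2.chunk1_warnings",
         "study1_chr2.one.phased.impute2.chunk3",
         "study1_chr2.one.phased.impute2.chunk3_info")

# Keep "<prefix>_<genotype file name>" and drop any later "_suffix"
ufls <- unique(sub("^([^_]+_[^_]+).*$", "\\1", fls))

# Output names: first and last dot-separated pieces of each genotype file name
of <- vapply(strsplit(ufls, ".", fixed = TRUE),
             function(p) paste(p[1], p[length(p)], sep = "."),
             character(1))

print(ufls)
print(of)
```

The sketch assumes every file name contains exactly one underscore before the genotype file name; names without a prefix would need the original first-field approach.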
Re: [R] Simulation from discrete uniform
If you wanted a discrete uniform from 1 to 10, use:

    ceiling(10*runif(1))

If you wanted one from 0 to 12, use:

    ceiling(13*runif(1))-1
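A small sketch comparing the ceiling() approach with sample(), which is another common way to draw from a discrete uniform distribution in base R. The sample size `n` and the seed are arbitrary choices made only so the sketch is reproducible.

```r
set.seed(1)   # arbitrary seed, for reproducibility only
n <- 1000     # arbitrary number of draws

# Discrete uniform on 1..10, two equivalent ways
u1 <- ceiling(10 * runif(n))
u2 <- sample(1:10, n, replace = TRUE)

# Discrete uniform on 0..12, two equivalent ways
v1 <- ceiling(13 * runif(n)) - 1
v2 <- sample(0:12, n, replace = TRUE)

# Frequencies should be roughly equal across the support
print(table(u2))
print(table(v2))
```

sample() states the support directly, which avoids off-by-one mistakes when the range does not start at 1.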
Re: [R] Help with a scatter plot
On 10/26/2011 05:48 PM, RanRL wrote:

> Hi everyone,
>
> I have some data about a market research which I want to arrange in one plot for easy viewing [...]
>
> I want to create a plot where the colors of the hits represent the Product (A,B,C..), the character represents the color (X for yellow, box for green, etc..), the X axis is the price and the Y axis is the number (0-5) from the different Stores (A,B,C,D).
>
> I've thought either to create a matrix of 4 plots (for the 4 stores) or in some creative way combine them into one plot?
>
> [...]

Hi RanRL,

I swapped the colors and product names, but this rather inelegant code might do what you want:

    ranrl <- read.table("ranrl.dat", header=TRUE)
    plot(ranrl$Price, ranrl$StoreC, ylim=range(ranrl[,3:5], na.rm=TRUE),
         type="n", xlab="Price", ylab="Number sold")
    text(ranrl$Price[1], ranrl[1,3:5],
         paste("ProdA", names(ranrl)[3:5], sep="\n"), col="red")
    text(ranrl$Price[2], ranrl[2,3:5],
         paste("ProdA", names(ranrl)[3:5], sep="\n"), col="green")
    text(ranrl$Price[3], ranrl[3,3:5],
         paste("ProdA", names(ranrl)[3:5], sep="\n"), col="blue")
    text(ranrl$Price[4], ranrl[4,3:5],
         paste("ProdA", names(ranrl)[3:5], sep="\n"), col="yellow")
    text(ranrl$Price[5], ranrl[5,3:5],
         paste("ProdB", names(ranrl)[3:5], sep="\n"), col="blue")
    text(ranrl$Price[6], ranrl[6,3:5],
         paste("ProdC", names(ranrl)[3:5], sep="\n"), col="green")

Jim
[R] Random Forest Classification
Hi All,

I want to do Random Forest classification. I installed R and the randomForest package for R, but I don't know how to use it.

Is there any open source remote sensing application which does RF classification on satellite images? Does anyone have a random forest classification example? An example in any language or package is fine.

Has anyone done this in R? If yes, how?

I googled RF classification, but most of the results are for medical and disease research, not for remote sensing.

--
Regards,
Mohammed Rashad K M
M.S. (By Research) student
Lab for Spatial Informatics
Department of CSE
International Institute of Information Technology
Hyderabad, India
[R] Want to exclude axis numbering in plot.ca
plot.ca gives numbers on each axis. How do I stipulate that these be excluded? I have read the R documentation for plot.ca but see no option to exclude axis numbers.

Any suggestions?

--
Mark Webb
Re: [R] lock a package to specific R version
On 25.10.2011 11:42, Mehmet Suzen wrote:

> Hi,
>
> I was wondering if it is possible to lock a package to a specific version of R. The dependency attribute in the package DESCRIPTION only accepts >=, AFAIU (http://cran.r-project.org/doc/manuals/R-exts.html#fn-3).
>
> Any workaround?

Intervals are possible, and you can restrict them to one version as follows:

    Depends: R (>= 2.13.2), R (<= 2.13.2)

Uwe Ligges

> Thanks,
> Mehmet
Re: [R] Building package/DESCRIPTION file not existing?
Francois Rousseu francoisrous...@hotmail.com on Mon, 24 Oct 2011 20:10:27 -0400 writes:

> Hello useRs
>
> I am trying to build a package for personal use and for making easier working with other people but I keep getting the same error message about the DESCRIPTION file not existing.
>
> when trying to install from a source tar.gz file:
>
>     Error in .read_description(dfile) :
>       file 'C:/Users/Propriétaire/AppData/Local/Temp/RtmpHFMONb/R.INSTALL647a3535/mypkg/DESCRIPTION' does not exist
>
> when trying to build a binary version:
>
>     Error in .read_description(dfile) :
>       file 'C:/Users/Propriétaire/Documents/RETROBIRD/mypkg/DESCRIPTION' does not exist
>
> In this last case, the DESCRIPTION file is certainly there! [...] I feel it might be related to language issues (windows on my system is in french, see sessionInfo() at bottom of message) or something about temporary directories, but I really can't find the problem. I've looked into the cygwin warning, but it didn't seem to be the problem, though I may be wrong.

Yes, I'm almost sure it's the language issue. I've recently taught a course on R package building, and on Windows one user had problems because of an 'ä' (a-Umlaut) in one of the directories in her 'path'.

So if you work from another place than 'C:/Users/Propriétaire/', this may solve the main problem.

Bonnes salutations,
Martin Maechler, ETH Zurich

> Any hints? Below is the complete sequence with errors.
>
> Thanks,
> Francois Rousseu
>
> [.]
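To check whether a working path contains non-ASCII characters like the ones suspected in this thread, a small helper along these lines could be used. `has_non_ascii` is a hypothetical name introduced here, and "C:/Rpkgs/mypkg" is only an example of an ASCII-only location, not a path from the thread.

```r
# TRUE if any character in the path lies outside 7-bit ASCII.
# enc2utf8() normalises the encoding before the code points are inspected.
has_non_ascii <- function(path) {
  any(utf8ToInt(enc2utf8(path)) > 127L)
}

# Path as in the error messages, with the accented e written as an escape
accented <- "C:/Users/Propri\u00e9taire/Documents/RETROBIRD/mypkg"
plain    <- "C:/Rpkgs/mypkg"   # hypothetical ASCII-only location

print(has_non_ascii(accented))
print(has_non_ascii(plain))
```

Running the helper on getwd() and tempdir() before building would show whether either location is affected; the temporary directory in the first error message also sits under the accented user profile.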
[R] gam predictions with negbin model
Hi,

I wonder if predict.gam is supposed to work with the family=negbin() definition? It seems to me that the values returned by type="response" are far off the observed values. Here is an example output from the negbin examples:

    > set.seed(3)
    > n <- 400
    > dat <- gamSim(1, n=n)
    > g <- exp(dat$f/5)
    > dat$y <- rnbinom(g, size=3, mu=g)
    > b <- gam(y~s(x0)+s(x1)+s(x2)+s(x3), family=negbin(3), data=dat)
    > summary(y)
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
     0.6061  1.6340  2.8120  2.7970  3.9250  4.9830
    > summary(predict(b, type="response"))
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
     0.8972  3.1610  4.8140  6.1170  8.1300 28.0100

I.e. the range and mean of observed values (y) are smaller than those of the predictions from the gam model. Should I somehow apply the estimated theta on these predictions?

regards,
Kari
Re: [R] lock a package to specific R version
On Wed, 26 Oct 2011, Uwe Ligges wrote:

> On 25.10.2011 11:42, Mehmet Suzen wrote:
>> Hi,
>>
>> I was wondering if it is possible to lock a package to a specific version of R. The dependency attribute in the package DESCRIPTION only accepts >=, AFAIU (http://cran.r-project.org/doc/manuals/R-exts.html#fn-3).
>>
>> Any workaround?
>
> Intervals are possible, and you can restrict them to one version as follows:
>
>     Depends: R (>= 2.13.2), R (<= 2.13.2)

Or even use ==

The point of the footnote is that install.packages() will download a package only checking any >= requirements (and I suspect it will then install a binary version of a package). R CMD INSTALL will not install it from the sources, and library() will not load it.

I don't see why you would want to do this: why would a package work with 2.13.1 and not 2.13.2, or 2.13.2 and not 2.14.0? Ranges may make sense.

> Uwe Ligges
>
>> Thanks,
>> Mehmet

--
Brian D. Ripley
Professor of Applied Statistics
University of Oxford
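For experimenting with the version comparisons discussed above from within an R session, getRversion() can be compared against version strings. This is only an interactive sketch; it does not replace the Depends field, and `locked_to` is a name introduced here for illustration.

```r
# Version the package would be locked to (value taken from the thread)
locked_to <- "2.13.2"

# Exact match, comparable to a single-version restriction
is_locked_version <- getRversion() == locked_to

# A range, which the thread suggests is usually the more sensible choice
in_series <- getRversion() >= "2.13.0" && getRversion() < "2.14.0"

print(getRversion())
print(is_locked_version)
print(in_series)
```

Comparisons on the objects returned by getRversion() are done component by component, so "2.13.10" sorts after "2.13.2", unlike a plain string comparison.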
Re: [R] gam predictions with negbin model
On Wed, 26 Oct 2011, Kari Ruohonen wrote:

> Hi,
>
> I wonder if predict.gam is supposed to work with family=negbin() definition? It seems to me that the values returned by type="response" are far off the observed values. Here is an example output from the negbin examples:
>
>     > set.seed(3)
>     > n <- 400
>     > dat <- gamSim(1, n=n)
>     > g <- exp(dat$f/5)
>     > dat$y <- rnbinom(g, size=3, mu=g)
>     > b <- gam(y~s(x0)+s(x1)+s(x2)+s(x3), family=negbin(3), data=dat)
>     > summary(y)
>        Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>      0.6061  1.6340  2.8120  2.7970  3.9250  4.9830
>     > summary(predict(b, type="response"))
>        Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>      0.8972  3.1610  4.8140  6.1170  8.1300 28.0100
>
> I.e. the range and mean of observed values (y)

What exactly is "y" in the code above? I guess you mean dat$y:

    R> summary(dat$y)
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      0.000   2.000   4.000   6.235   8.000  68.000

which looks rather reasonable...
Z

> are smaller than those of the predictions from the gam model. Should I somehow apply the estimated theta on these predictions?
>
> regards,
> Kari
[R] nls -- singular gradient problem
Hi list,

I see this question is quite a regular feature, but searching the past instances I could not find an answer to my specific problem. Simply, trying to fit this model gives a singular gradient problem. Though optim() seems to be able to solve it, I would like to do these things in nls().

    Treated <- Puromycin[Puromycin$state=="treated",]
    weighted.MM <- function(resp, conc, K){
        pred <- K[1]+(1-exp(K[2]))*conc
        (resp-pred)
    }
    Pur.wt <- nls(~weighted.MM(rate,conc,K), data=Treated, start=list(K=c(0,0.1)))

Please advise,
Best,
Re: [R] gam predictions with negbin model
On 26/10/11 12:10, Achim Zeileis wrote:

> On Wed, 26 Oct 2011, Kari Ruohonen wrote:
>> [...]
>>     > summary(y)
>>        Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>>      0.6061  1.6340  2.8120  2.7970  3.9250  4.9830
>> [...]
>> I.e. the range and mean of observed values (y)
>
> What exactly is "y" in the code above? I guess you mean dat$y:
>
>     R> summary(dat$y)
>        Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>       0.000   2.000   4.000   6.235   8.000  68.000
>
> which looks rather reasonable...
> Z

Thanks - what a stupid mistake: an old .RData was hanging around even though I had started a new R instance. Terribly sorry and many apologies.

Kari
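The mix-up resolved in this thread (a stale `y` in the workspace being summarised instead of the column `dat$y`) can be reproduced with simulated data in base R alone. The numbers below are arbitrary stand-ins, not the gamSim() data from the thread.

```r
set.seed(3)

# A stale object left over from an earlier session (stand-in values)
y <- runif(400, min = 0.5, max = 5)

# The data actually used for modelling; its column is also called y
dat <- data.frame(y = rpois(400, lambda = 6))

# summary(y) silently picks up the workspace object ...
print(summary(y))

# ... whereas these two refer to the column in the data frame
print(summary(dat$y))
print(with(dat, summary(y)))

# Listing the workspace shows what a restored .RData has brought in
print(ls())
```

Starting R with the --vanilla option skips restoring a saved workspace, which avoids this kind of surprise.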
Re: [R] Help with a scatter plot
Hi:

The sort of thing you appear to want is fairly straightforward to do in lattice and ggplot2, as both have ways to automate conditioning plots. Since you were looking at ggplot2, let's consider that problem.

You don't really show enough data to provide a useful demonstration, but let's see if we can capture the essential structure.

> I want to create a plot where the colors of the hits represent the Product (A,B,C..), the character represents the color (X for yellow, box for green, etc..), the X axis is the price and the Y axis is the number (0-5) from the different Stores (A,B,C,D).
>
> I've thought either to create a matrix of 4 plots (for the 4 stores) or in some creative way combine them into one plot?

The first step is to 'melt' the data so that Store becomes a factor variable and its corresponding values are assigned to another variable. To that end, one can invoke the very useful melt() function in the reshape2 package:

    library('reshape2')
    mdata <- melt(mydata, id = c('Product', 'Color', 'Price'))

This creates a new data frame with variables Product, Color, Price, variable and value. 'variable' contains StoreA, ..., StoreD as factor levels and 'value' is a numeric variable consisting of the corresponding values. For its structure, see

    str(mdata)

If you want to change StoreA - StoreD to A - D, say, then you could optionally do

    mdata$variable <- factor(mdata$variable, labels = LETTERS[1:4])

Assuming that you've done enough reading to understand what aesthetics are about, the problem is essentially this:

    x = Price
    y = value
    shape = Color
    faceting variable = variable (the stores)

So a template of a ggplot2 graph for the melted data might look something like

    library('ggplot2')
    ggplot(mdata, aes(x = Price, y = value, shape = Color)) +
        geom_point() +
        facet_wrap( ~ variable, ncol = 2) +
        scale_shape_manual('Color', breaks = levels(mdata$Color),
                           values = c(4, 0, 2, 8),
                           labels = c('Blue', 'Green', 'Red', 'Yellow'))

This assigns x, box, triangle and asterisk as shapes via their numeric codes (see Hadley Wickham's ggplot2 book, p. 197 for the reference). The labels = argument lets you change the letters B, G, R, Y (which would comprise the default labels) to something more evocative. If for some odd reason you wanted to add corresponding colors to the shapes, you could also do that, as follows:

    ggplot(mdata, aes(x = Price, y = value, shape = Color, colour = Color)) +
        geom_point() +
        facet_wrap( ~ variable, ncol = 2) +
        scale_shape_manual('Color', breaks = levels(mdata$Color),
                           values = c(4, 0, 2, 8),
                           labels = c('Blue', 'Green', 'Red', 'Yellow')) +
        scale_colour_manual('Color', breaks = levels(mdata$Color),
                            values = c('blue', 'green', 'red', 'yellow'),
                            labels = c('Blue', 'Green', 'Red', 'Yellow'))

This should color the shapes in the graph and provide one (merged) legend with colored shapes as symbols.
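To see what the long ("melted") layout looks like without installing anything, here is a base R sketch that uses reshape() in place of reshape2::melt(). The three-row `mydata` is a made-up excerpt in the shape of the poster's table, and the extra `id` column is something reshape() adds by default, not part of the melt() output.

```r
# Made-up excerpt in the shape of the posted table (illustrative values only)
mydata <- data.frame(
  Product = c("ProdA", "ProdA", "ProdB"),
  Color   = c("R", "G", "B"),
  StoreA  = c(NA, NA, 5.44),
  StoreB  = c(4.33, 4.33, NA),
  StoreC  = c(2, 2, 4.22),
  StoreD  = c(4.33, 4.33, 3.76),
  Price   = c(35, 35, 87)
)

stores <- c("StoreA", "StoreB", "StoreC", "StoreD")

# One row per (original row, store): store name in 'variable', number in 'value'
mdata <- reshape(mydata, direction = "long",
                 varying = stores, v.names = "value",
                 timevar = "variable", times = stores)
rownames(mdata) <- NULL

str(mdata)
print(head(mdata))
```

Each original row appears once per store, which is the structure the faceting by `variable` relies on.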
HTH,
Dennis
[R] set different font family for strings in mtext or text?
Hi there,

Is it possible to set different font families for strings in mtext or text?

For example, on the Windows platform with the windows() device:

    plot(1:10, type = "n")
    text(5, 5, "Chinese (English)")  # "Chinese" stands for Chinese characters

This gives the correct Chinese and English characters in two different font families, i.e., English characters in the default sans family and Chinese characters in the system default font family (it seems that the Chinese font family cannot be set or changed).

However, when using pdf() or postscript() with the font family set to "Times", an error message appears:

    conversion failure on '...' in 'mbcsToSbcs': dot substituted for...

When the family is set to "song" (a CJK font family), the English characters are displayed in that CJK font family.

I would like to know whether there is a mechanism for setting different font families within one string, e.g., if a character cannot be found in the default font family, then search another font family?

Any suggestions or comments will be really appreciated.

Regards,
Jinsong
Re: [R] strucchange Nyblom-Hansen Test?
Thank you, things seem to be clearer :-)

> Hansen extended this to the linear regression model and proposed to either compute one test statistic per parameter (which you can do with the parm argument of gefp) or a joint statistic for all parameters. Hansen included in all parameters also the variance,

The parm argument of gefp is a nice feature, but what about the significance level in the test statistic computation (sctest)? Is a multiple testing correction applied, or should I rather use the double max statistic in this case, as recommended below?

An excerpt from page 5 of the paper "A Unified Approach to Structural Change Tests Based on F Statistics, OLS Residuals, and ML Scores" (Achim Zeileis):

"Hansen (1992) suggests to compute this statistic for the full process efp(t) to test all coefficients simultaneously and also for each component of the process (efp(t))j (denoting the j-th component of the process efp(t), j = 1, . . . , k) individually to assess which parameter causes the instability. *Note, that this approach leads to a violation of the significance level of the procedure if no multiple testing correction is applied.* This can be avoided if a functional is applied to the empirical fluctuation process which aggregates over time first yielding k independent test statistics (see Zeileis and Hornik 2003, for more details)."

--
View this message in context: http://r.789695.n4.nabble.com/strucchange-Nyblom-Hansen-Test-tp3887208p3940055.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] set different font family for strings in mtext or text?
See ?par: check the 'family' parameter. You can select 'family' for each call to mtext or text.

However, mixing families is rather ugly, and there are font families that cover both English and Chinese.

Note that the main problem with postscript() and pdf() is the limited support in those languages for non-8-bit character encodings: R cannot magically remove restrictions of languages designed in the 1970s. See also http://cran.r-project.org/doc/Rnews/Rnews_2006-2.pdf (referenced from ?pdf)

Users of other OSes have the option of using cairographics-based devices (e.g. cairo_pdf), and so will Windows users as from 2.14.0 (which is in RC): however, the font flexibility is far less on Windows.

On Wed, 26 Oct 2011, Jinsong Zhao wrote:

> Hi there,
>
> Is it possible to set different font family for strings in mtext or text?
>
> For example, on windows platform with windows() device:
>
>     plot(1:10, type = "n")
>     text(5, 5, "Chinese (English)")  # "Chinese" stands for Chinese characters
>
> it will give the correct Chinese and English characters with two different font family, i.e., English character in default sans family, and Chinese character in the system default font family (it seems that the Chinese font family can not be set or changed).

It certainly can, and the rw-FAQ describes how to do so.

> However, when using pdf() or postscript(), if setting the font family to "Times", then error message will appear:
>
>     conversion failure on '...' in 'mbcsToSbcs': dot substituted for...
>
> When set the family "song" (a CJK font family), the English character will be displayed in that CJK font family.
>
> I hope to know, is there a mechanism that can be used to set different font family for one string, e.g., if one character can not be find in the default font family, then search for another font family?

You have to specify the family: R will not guess what you wanted.

> Any suggestions or comments will be really appreciated?
>
> Regards,
> Jinsong

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
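The advice to select 'family' per call can be sketched as follows. This is only a minimal illustration, assuming the generic family names "serif", "sans" and "mono" are available on the pdf() device; it does not involve any CJK font, and the output file name is a temporary file chosen for the example.

```r
# Minimal sketch: one font family per call to text()/mtext().
# "serif", "sans" and "mono" are generic family names; a CJK family
# such as the poster's "song" would have to be defined separately.
out <- tempfile(fileext = ".pdf")   # hypothetical output file
pdf(out)
plot(1:10, type = "n")
text(5, 6, "Set in a serif family", family = "serif")
text(5, 4, "Set in a sans family",  family = "sans")
mtext("Margin text in a mono family", side = 3, family = "mono")
dev.off()
```

Each string is drawn by its own call, so a mixed-family line still needs explicit positioning of the pieces.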
Re: [R] difficulties with MuMIn model generation with coxph
Dear Sophie,

The answer is 'typo'. 'dredge' does not have an argument named 'marge.ex'.

k

On 2011-10-25 12:00, r-help-requ...@r-project.org wrote:

> Message: 131
> Date: Mon, 24 Oct 2011 17:08:41 -0700 (PDT)
> From: sgilbertsophielgilb...@gmail.com
> To: r-help@r-project.org
> Subject: [R] difficulties with MuMIn model generation with coxph
> Message-ID: 1319501321733-3935078.p...@n4.nabble.com
> Content-Type: text/plain; charset=us-ascii
>
> Hi All,
>
> I'm having trouble with the automated model generation (dredge) function in the MuMIn package. I'm trying to use it to automatically generate subsets of models from a global Cox proportional hazards model, and rank them based on AICc. This seems to be possible, and the MuMIn documentation says that coxph is supported. However, when I run the code (see below), it gives me the following error message:
>
>     Error in UseMethod("logLik") :
>       no applicable method for 'logLik' applied to an object of class "logical"
>
>     ## R code
>     # read in the data
>     data1 <- read.table('MaleData500.csv', sep=',', header=T)
>     survival <- Surv(data1$Wks.at.dth, data1$Died)
>     # create the full (global) model, a coxph object
>     globemodel <- coxph(survival ~ edgeden + pctroad + pctcc90 + pctcc80 + pctcrsog + ravine + canfrag + pctoldc, data=data1)
>     # evaluate all subsets of models using dredge
>     exhausting <- dredge(globemodel, eval=TRUE, fixed=c("pctroad"), m.max=3, marge.ex=TRUE, rank="AICc")
>
>     Error in UseMethod("logLik") :
>       no applicable method for 'logLik' applied to an object of class "logical"
>
> Any suggestions would be greatly appreciated. The globemodel works on its own, and prints out a summary just fine. The only thing I can think of is that in the names of globemodel, there is an attribute called loglik, not logLik?
>
> Thank you,
> Sophie
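Why a misspelled argument can surface as a logLik error is easier to see with a toy example. The sketch below uses hypothetical stand-in functions (toy_dredge and toy_rank are made up here and are not MuMIn's code), and it assumes the intended argument was spelled marg.ex; check ?dredge for your MuMIn version. The general R behaviour it shows is that a name matching no formal argument is silently absorbed by '...' and passed on.

```r
# Hypothetical stand-ins, NOT MuMIn's implementation.
toy_rank <- function(...) {
  # Treats every argument it receives as a fitted model
  vapply(list(...), function(m) as.numeric(logLik(m)), numeric(1))
}

toy_dredge <- function(model, marg.ex = NULL, ...) {
  # Anything that matches no formal argument lands in '...'
  toy_rank(model, ...)
}

fit <- lm(dist ~ speed, data = cars)

# Correct spelling: matched to the formal argument, nothing leaks into '...'
ok <- toy_dredge(fit, marg.ex = TRUE)

# Misspelled: TRUE travels through '...' and is handed to logLik()
bad <- tryCatch(toy_dredge(fit, marge.ex = TRUE),
                error = function(e) conditionMessage(e))
print(bad)
```

If dredge forwards its extra arguments to the ranking function in a similar way, that would be consistent with the reported message about class "logical", but that is a guess, not something confirmed in the thread.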
Re: [R] strucchange Nyblom-Hansen Test?
On Wed, 26 Oct 2011, buehlerman wrote:

> Thank you, things seem to be clearer :-)

Great.

>> Hansen extended this to the linear regression model and proposed to either compute one test statistic per parameter (which you can do with the parm argument of gefp) or a joint statistic for all parameters. Hansen included in all parameters also the variance,
>
> The parm argument of gefp is a nice feature, but what about the significance level in the test statistic computation (sctest)? Is a multiple testing correction applied, or should I rather use the double max statistic in this case, as recommended below?

By applying the functional in sctest(), you implicitly correct for the number of parameters tested. Thus, you don't need to apply another correction for multiple testing. (The only caveat with the p-values from sctest() is that these are always asymptotic p-values and may not be exact in finite samples. And for many functionals these have been determined by simulation.)

This is discussed in a little bit more detail in

Zeileis A. (2006), Implementing a Class of Structural Change Tests: An Econometric Computing Approach. _Computational Statistics & Data Analysis_, *50*, 2987-3008. doi:10.1016/j.csda.2005.07.001.

The comment quoted below pertains to the fact that Hansen (1992) suggested to compute one p-value for each individual parameter as well as another p-value for all parameters jointly. In such a situation, you would have to apply some multiple testing procedure. The meanL2BB functional in strucchange only computes the joint p-value.

hth,
Z

> An excerpt from page 5 of the paper "A Unified Approach to Structural Change Tests Based on F Statistics, OLS Residuals, and ML Scores" (Achim Zeileis):
>
> "Hansen (1992) suggests to compute this statistic for the full process efp(t) to test all coefficients simultaneously and also for each component of the process (efp(t))j (denoting the j-th component of the process efp(t), j = 1, . . . , k) individually to assess which parameter causes the instability. *Note, that this approach leads to a violation of the significance level of the procedure if no multiple testing correction is applied.* This can be avoided if a functional is applied to the empirical fluctuation process which aggregates over time first yielding k independent test statistics (see Zeileis and Hornik 2003, for more details)."
>
> --
> View this message in context: http://r.789695.n4.nabble.com/strucchange-Nyblom-Hansen-Test-tp3887208p3940055.html
> Sent from the R help mailing list archive at Nabble.com.
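For the situation described in the reply, where one p-value per parameter is computed in addition to a joint one, a generic multiple-testing adjustment can be sketched with base R's p.adjust(). The p-values below are invented purely for illustration (they are not output of gefp or sctest), and the choice of Bonferroni and Holm is an assumption, since the thread does not name a particular procedure.

```r
# Made-up per-parameter p-values, for illustration only
p_raw <- c(intercept = 0.010, slope = 0.040, variance = 0.200)

# Two standard adjustments available in base R
p_bonf <- p.adjust(p_raw, method = "bonferroni")
p_holm <- p.adjust(p_raw, method = "holm")

print(rbind(raw = p_raw, bonferroni = p_bonf, holm = p_holm))
```

With a single joint test from one functional, as in the reply, no such adjustment is needed.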
Re: [R] set different font family for strings in mtext or text?
Thank you very much for the quick reply.

On 2011-10-26 18:24, Prof Brian Ripley wrote:
> See ?par: check the 'family' parameter. You can select 'family' for each call to mtext or text.

Yes, I can select 'family' for each call to mtext or text. However, when it is necessary to put both Chinese and English on one line, I have to call text or mtext several times with explicit positions, which is really tedious. I have used the following code for this purpose, but I don't like this design:

    put.text <- function(x, y, text, family, font, ...) {
        str.n <- length(text)
        sw.n <- numeric(length = str.n + 1)
        sw.n[1] <- 0
        if (missing(family)) family <- rep("", str.n)
        if (missing(font)) font <- rep(1, str.n)
        for (i in 1:str.n)
            sw.n[i+1] <- strwidth(text[i], family = family[i], font = font[i])
        sw <- sum(sw.n)
        for (i in 1:str.n)
            text(x + sum(sw.n[1:i]), y, text[i], family = family[i],
                 font = font[i], adj = c(0, 0.5), ...)
    }

    ## usage
    ## plot 中文(English) with different font family
    ## 'song' is a user defined font family for CJK.
    pdf()
    plot(1:10, type = "n")
    put.text(5, 5, c("中文", "(English)"), c("song", "Times"))
    dev.off()

> However, mixing families is rather ugly, and there are font families that cover both English and Chinese.

Yes, there are some font families that cover both English and Chinese; however, in those font families the English characters are ugly...

> Note that the main problem with postscript() and pdf() is the limited support in those languages for non-8-bit character encodings: R cannot magically remove restrictions of languages designed in the 1970s. See also http://cran.r-project.org/doc/Rnews/Rnews_2006-2.pdf (referenced from ?pdf)

Well, I have read this paper very carefully, so I can draw CJK on the plot in postscript() and pdf().

> Users of other OSes have the option of using cairographics-based devices (e.g. cairo_pdf), and so will Windows users as from 2.14.0 (which is in RC): however, the font flexibility is far less on Windows.

I will try this device. Thanks for the information.

Regards,
Jinsong
Re: [R] Example(chron) doesn't work
On Wed, Oct 26, 2011 at 12:21 AM, hchui helena.c...@flinders.edu.au wrote:

> Hi, there,
>
> I have a similar problem. The chron example gives NA. dates doesn't work but times does. I would appreciate it if there's a fix for it.
>
> Thanks,
> Helena
>
>     > example(chron)
>
>     chron> dts <- dates(c("02/27/92", "02/27/92", "01/14/92",
>     chron+                "02/28/92", "02/01/92"))
>
>     chron> dts
>     [1] NA NA NA NA NA
>
>     chron> # [1] 02/27/92 02/27/92 01/14/92 02/28/92 02/01/92
>     chron> tms <- times(c("23:03:20", "22:29:56", "01:03:30",
>     chron+                "18:21:03", "16:56:26"))
>
>     chron> tms
>     [1] 23:03:20 22:29:56 01:03:30 18:21:03 16:56:26
>
>     chron> # [1] 23:03:20 22:29:56 01:03:30 18:21:03 16:56:26
>     chron> x <- chron(dates = dts, times = tms)
>
>     chron> x
>     [1] (NA NA) (NA NA) (NA NA) (NA NA) (NA NA)
>
>     chron> # [1] (02/27/92 23:03:19) (02/27/92 22:29:56) (01/14/92 01:03:30)
>     chron> # [4] (02/28/92 18:21:03) (02/01/92 16:56:26)
>     chron>
>     chron> # We can add or subtract scalars (representing days) to dates or
>     chron> # chron objects:
>     chron> c(dts[1], dts[1] + 10)
>     Error in y + ifelse(m > 2, 0, -1) :
>       non-numeric argument to binary operator
>     In addition: Warning message:
>     In matrix(unlist(lapply(dots, origin)), nrow = 3) :
>       data length [2] is not a sub-multiple or multiple of the number of rows [3]
>
>     > packageDescription("chron")$Version
>     [1] "2.3-42"
>     > R.version.string
>     [1] "R version 2.13.1 (2011-07-08)"
>     > win.version()
>     [1] "Windows 7 x64 (build 7600)"

Does it still occur if you start R in vanilla mode? From the Windows console:

    Rgui --vanilla

(If you don't have Rgui.exe's directory on your path, then cd to the directory where Rgui.exe is located first.)

Also, does it still occur in the most recent version of R?

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
[R] dotPlot with diagonal
Hi,

I want to draw a dotPlot. All works fine:

    (Seq <- matrix(c(1, 1, 6, 1, 2, 2, 5, 4, 3, 3, 4, 3, 4, 4, 3, 2, 5, 5, 2, 5, 6, 6, 1, 6), ncol = 6))
    dotPlot(Seq[1,], Seq[2,], main = "Sequenz 1 und Sequenz 2", asp = 1)

Is there a way to draw a thin diagonal line from (0, 0) to (6, 6) (perhaps in red?), or must I use GIMP? I have many dotPlots, so it would be good if R could do this.

Thanks,
Joerg
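A possible approach, sketched with a plain base-graphics scatter plot standing in for dotPlot(): lines can be added to an existing base plot with segments() or abline(). Whether this works unchanged after dotPlot() depends on that function using base graphics and leaving the user coordinates in sequence-position units, which is an assumption here and was not confirmed in the thread.

```r
# Stand-in for the dot plot: an ordinary base-graphics plot.
out <- tempfile(fileext = ".pdf")   # hypothetical output file
pdf(out)
plot(1:6, 1:6, pch = 16, asp = 1,
     xlim = c(0, 6), ylim = c(0, 6),
     xlab = "Sequence 1", ylab = "Sequence 2",
     main = "Stand-in for a dot plot")

# Thin red diagonal from (0, 0) to (6, 6), added to the current plot
segments(0, 0, 6, 6, col = "red", lwd = 1)
dev.off()
```

abline(0, 1, col = "red") would draw the same diagonal across the whole plot region instead of stopping at (6, 6).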
Re: [R] Simulation from discrete uniform
Why don't you use sample?

    sample(1:10, 10, replace = TRUE)

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of BSanders
Sent: 26 October 2011 08:49
To: r-help@r-project.org
Subject: Re: [R] Simulation from discrete uniform

If you wanted a discrete uniform from 1-10 use:

    ceiling(10*runif(1))

if you wanted from 0-12, use:

    ceiling(13*runif(1))-1

--
View this message in context: http://r.789695.n4.nabble.com/Simulation-from-discrete-uniform-tp3434980p3939694.html
Sent from the R help mailing list archive at Nabble.com.
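The two suggestions in this thread can be put side by side. This is a small sketch; the sample size of 1000 and the seed are arbitrary choices for the example.

```r
set.seed(1)   # arbitrary seed, only for reproducibility
n <- 1000     # arbitrary number of draws

# Discrete uniform on 1..10, via sample() and via ceiling(runif())
x_sample  <- sample(1:10, n, replace = TRUE)
x_ceiling <- ceiling(10 * runif(n))

# Discrete uniform on 0..12, both ways
y_sample  <- sample(0:12, n, replace = TRUE)
y_ceiling <- ceiling(13 * runif(n)) - 1

# Rough check that the values look evenly spread
print(table(x_sample))
print(table(y_ceiling))
```

sample() states the support directly, which makes off-by-one mistakes in the range less likely than with the ceiling() arithmetic.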
Re: [R] merging two dataframes
Hello.

I have now tried to do what you told me. I used the str() function, and data$date1 and data3$date1 were both listed as character. I changed name to character, but it did not work either. I also changed all variables to character, with no positive result.

    > str(data)
    'data.frame': 14446 obs. of 15 variables:
     $ id     : chr "1" "1" "1" "1" ...
     $ compid : chr "2514" "2514" "2514" "2514" ...
     $ secid  : chr "15856" "15856" "15856" "15856" ...
     $ name   : chr "A-pressen" "A-pressen" "A-pressen" "A-pressen" ...
     $ period : chr "1" "2" "3" "4" ...
     $ date   : chr "17.05.1980" "17.05.1981" "17.05.1982" "17.05.1983" ...
     $ enddate: chr "17.05.1981" "17.05.1982" "17.05.1983" "17.05.1984" ...
     $ div    : chr NA NA NA NA ...
     $ ndivs  : chr NA NA NA NA ...
     $ posdiv : chr NA NA NA NA ...
     $ ddiv2  : chr NA NA NA NA ...
     $ ddiv3  : chr NA NA NA NA ...
     $ ddiv4  : chr NA NA NA NA ...
     $ ddiv5  : chr NA NA NA NA ...
     $ ddiv6  : chr NA NA NA NA ...

    > str(data3)
    'data.frame': 812354 obs. of 9 variables:
     $ date                  : chr "02.01.1996" "03.01.1996" "04.01.1996" "05.01.1996" ...
     $ Securityid            : chr "6001" "6001" "6001" "6001" ...
     $ Symbol                : chr "AAV" "AAV" "AAV" "AAV" ...
     $ name                  : chr "Adresseavisen" "Adresseavisen" "Adresseavisen" "Adresseavisen" ...
     $ Securitytype          : chr "Ordinary Shares" "Ordinary Shares" "Ordinary Shares" "Ordinary Shares" ...
     $ Unadjusted            : chr "200" "200" "200" "200" ...
     $ Event.adjusted        : chr "200" "200" "200" "200" ...
     $ Div.and.Event.adjusted: chr "109,7595375" "109,7595375" "109,7595375" "109,7595375" ...
     $ Sharesissued          : chr "1901646" "1901646" "1901646" "1901646" ...
Here is some suitable data for data dput(data[1:20,]) structure(list(id = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), compid = c(2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514), secid = c(15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856), name = c(A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen), period = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20), date = c(17.05.1980, 17.05.1981, 17.05.1982, 17.05.1983, 17.05.1984, 17.05.1985, 17.05.1986, 17.05.1987, 17.05.1988, 17.05.1989, 17.05.1990, 17.05.1991, 17.05.1992, 17.05.1993, 17.05.1994, 17.05.1995, 17.05.1996, 17.05.1997, 17.05.1998, 17.05.1999), enddate = c(17.05.1981, 17.05.1982, 17.05.1983, 17.05.1984, 17.05.1985, 17.05.1986, 17.05.1987, 17.05.1988, 17.05.1989, 17.05.1990, 17.05.1991, 17.05.1992, 17.05.1993, 17.05.1994, 17.05.1995, 17.05.1996, 17.05.1997, 17.05.1998, 17.05.1999, 17.05.2000), div = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 0, 0, 5, 0, 1.1, 1.2, 1, 0 ), ndivs = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0), posdiv = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, 1, 1, 1, NA), ddiv2 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 0, 0, NA, 0, 1, NA, NA), ddiv3 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 0, 0, 0, 0, 0, -1), ddiv4 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 0, 0, 0, 0, 0), ddiv5 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 0, 0, 0, 0), ddiv6 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 0, 0, 0)), .Names = c(id, compid, secid, name, period, date, enddate, div, ndivs, posdiv, ddiv2, ddiv3, 
ddiv4, ddiv5, ddiv6 ), row.names = c(NA, 20L), class = data.frame) Here is some suitable data for data3: dput(data3[1:20,]) structure(list(date = c(02.01.1996, 03.01.1996, 04.01.1996, 05.01.1996, 08.01.1996, 09.01.1996, 10.01.1996, 11.01.1996, 12.01.1996, 15.01.1996, 16.01.1996, 17.01.1996, 18.01.1996, 19.01.1996, 22.01.1996, 23.01.1996, 24.01.1996, 25.01.1996, 26.01.1996, 29.01.1996), Securityid = c(6001, 6001, 6001, 6001, 6001, 6001, 6001, 6001, 6001, 6001, 6001, 6001, 6001, 6001, 6001, 6001, 6001, 6001, 6001, 6001), Symbol = c(AAV, AAV, AAV, AAV, AAV, AAV, AAV, AAV, AAV, AAV, AAV, AAV, AAV, AAV, AAV, AAV, AAV, AAV, AAV, AAV), name = c(Adresseavisen, Adresseavisen, Adresseavisen, Adresseavisen, Adresseavisen, Adresseavisen, Adresseavisen, Adresseavisen, Adresseavisen, Adresseavisen, Adresseavisen, Adresseavisen, Adresseavisen, Adresseavisen, Adresseavisen, Adresseavisen, Adresseavisen, Adresseavisen, Adresseavisen, Adresseavisen), Securitytype = c(Ordinary Shares, Ordinary Shares, Ordinary Shares, Ordinary Shares, Ordinary Shares, Ordinary Shares, Ordinary Shares, Ordinary Shares, Ordinary Shares, Ordinary Shares, Ordinary Shares, Ordinary Shares, Ordinary Shares, Ordinary
[R] New column of data filled with the larger value from 2 columns
Hi,

I'm sure there is a pretty simple answer to this, but I have had my head buried in the R book and on help pages for a while now and I've not made any progress.

In simple terms: I have 2 columns of data, column A and column B. I want to create a new column (C) and fill it with the larger value of A or B on each row.

So I want C to contain A if B < A, and C to contain B if A <= B.

Like I said, I have tried to look for an answer and I'm sure there is one (or many) out there, but I am looking in the wrong places or for the wrong terms, so I would really appreciate this help!

Thanks,
Rob.

(I promise that once I have mastered R, hopefully in the near future, I will make up for my sins of asking a basic question by answering many on here!)

--
View this message in context: http://r.789695.n4.nabble.com/New-column-of-data-filled-with-the-larger-value-from-2-columns-tp3940020p3940020.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Correlation Matrix in R
Thank you for your quick reply and helpful advice. Using this argument allows me to do what I needed to do.

Now the only other thing I want to accomplish is to obtain a matrix with the p-values in the top half and the correlations in the bottom half, to see which correlations are significant. I have been able to get this information with a few functions such as rcorr and cor.matrix, but it isn't output in a format that I can save with write.table or write.clipboard.

The pair function, on the other hand, gives a graphical display of the data (with correlation graphics in the bottom half), and I have added an argument which lets me view the significant p-values. But I wanted to know if I could also do the above easily.

--
View this message in context: http://r.789695.n4.nabble.com/Correlation-Matrix-in-R-tp3938274p3940170.html
Sent from the R help mailing list archive at Nabble.com.
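One way to build the matrix asked for here, sketched with base R only: correlations from cor(), pairwise p-values from cor.test(), combined with upper.tri() and written out with write.table(). The mtcars columns and the temporary output file are stand-ins chosen for the example, and Pearson correlation is assumed because the thread does not say which coefficient is wanted.

```r
# Stand-in data: four numeric columns from a built-in data set
vars <- mtcars[, c("mpg", "hp", "wt", "qsec")]
k <- ncol(vars)

r <- cor(vars)                                   # Pearson correlations
p <- matrix(NA_real_, k, k, dimnames = dimnames(r))
for (i in seq_len(k - 1)) {
  for (j in (i + 1):k) {
    # p-value of the test that the correlation of columns i and j is zero
    p[i, j] <- p[j, i] <- cor.test(vars[[i]], vars[[j]])$p.value
  }
}

# Lower triangle: correlations; upper triangle: p-values; diagonal blank
combined <- r
combined[upper.tri(combined)] <- p[upper.tri(p)]
diag(combined) <- NA

# Plain matrix, so write.table() can save it directly
out <- tempfile(fileext = ".txt")                # hypothetical output file
write.table(round(combined, 4), file = out, sep = "\t",
            quote = FALSE, col.names = NA)
print(round(combined, 4))
```

The p-values here are unadjusted; with many variables a multiple-testing adjustment such as p.adjust() may be worth considering.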
[R] Calculate the difference using ave
Dear R users,

This may be very simple, but it is proving difficult for me. I'd like to calculate the difference in percent between two measures. My data looks like this:

    set.seed(123)
    df1 <- data.frame(measure = rep(c("A1", "A2", "A3"), each=3),
                      water = sample(c(100:200), 9),
                      tide = sample(c(-10:+10), 9))
    df1

    # What I want to calculate is:
    # tide_[A2] / water_[A1],
    # tide_[A3] / water_[A2]

    # This 'works' for the example, but I am
    # looking for a more general solution.
    df1$tide_diff <- ave(df1$tide, FUN=function(L) L / c(NA, NA, NA, df1$water)) * 100
    df1

Thanks for any help!
Patrick
Re: [R] set different font family for strings in mtext or text?
On Wed, 26 Oct 2011, Jinsong Zhao wrote:

> Thank you very much for the quick reply.
>
> On 2011-10-26 18:24, Prof Brian Ripley wrote:
>> See ?par: check the 'family' parameter. You can select 'family' for each call to mtext or text.
>
> Yes, I can select 'family' for each call to mtext or text. However, when it is necessary to put both Chinese and English on one line, I have to call text or mtext several times with explicit positions, which is really tedious. I have used the following code for this purpose, but I don't like this design:
>
>     put.text <- function(x, y, text, family, font, ...) {
>         str.n <- length(text)
>         sw.n <- numeric(length = str.n + 1)
>         sw.n[1] <- 0
>         if (missing(family)) family <- rep("", str.n)
>         if (missing(font)) font <- rep(1, str.n)
>         for (i in 1:str.n)
>             sw.n[i+1] <- strwidth(text[i], family = family[i], font = font[i])
>         sw <- sum(sw.n)
>         for (i in 1:str.n)
>             text(x + sum(sw.n[1:i]), y, text[i], family = family[i],
>                  font = font[i], adj = c(0, 0.5), ...)
>     }
>
>     ## usage
>     ## plot 中文(English) with different font family
>     ## 'song' is a user defined font family for CJK.
>     pdf()
>     plot(1:10, type = "n")
>     put.text(5, 5, c("中文", "(English)"), c("song", "Times"))
>     dev.off()
>
>> However, mixing families is rather ugly, and there are font families that cover both English and Chinese.
>
> Yes, there are some font families that cover both English and Chinese; however, in those font families the English characters are ugly...

Not to my eyes in Arial Unicode MS (nor to millions of writers of Word documents). Not elegant, but not ugly. And that is one of the recommended choices in several places in the R documentation.

>> Note that the main problem with postscript() and pdf() is the limited support in those languages for non-8-bit character encodings: R cannot magically remove restrictions of languages designed in the 1970s. See also http://cran.r-project.org/doc/Rnews/Rnews_2006-2.pdf (referenced from ?pdf)
>
> Well, I have read this paper very carefully, so I can draw CJK on the plot in postscript() and pdf().
>
>> Users of other OSes have the option of using cairographics-based devices (e.g. cairo_pdf), and so will Windows users as from 2.14.0 (which is in RC): however, the font flexibility is far less on Windows.
>
> I will try this device. Thanks for the information.
>
> Regards,
> Jinsong

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] New column of data filled with the larger value from 2 columns
robgriffin247 robgriffin247 at hotmail.com writes:

> Hi,
> I'm sure there is a pretty simple answer to this but I have had my head buried in the R book and on help pages for a while now and I've not made any progress.
>
> In simple terms: I have 2 columns of data, column A and column B. I want to create a new column (C) and fill it with the larger value of A or B on each row.

Sounds like you want

    data$C <- pmax(data$A, data$B)

(or

    data <- transform(data, C = pmax(A, B))

)
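A short worked example of the pmax() suggestion, using a made-up data frame (the name dat and its values are hypothetical). It also shows how missing values behave, which the thread does not discuss: by default a row with an NA in either column gives NA, while na.rm = TRUE returns the non-missing value.

```r
# Made-up data for illustration
dat <- data.frame(A = c(1, 5, 3, NA),
                  B = c(2, 4, 3, 7))

# Row-wise ("parallel") maximum of the two columns
dat$C <- pmax(dat$A, dat$B)

# Same idea with transform(), ignoring missing values
dat <- transform(dat, D = pmax(A, B, na.rm = TRUE))

print(dat)
```

max(dat$A, dat$B) would not work here, because max() collapses everything to a single number instead of working row by row.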
Re: [R] Calculate the difference using ave
Maybe one approach could be:

    set.seed(123)
    df1 <- data.frame(measure = rep(c("A1", "A2", "A3"), each=3),
                      water = sample(c(100:200), 9),
                      tide = sample(c(-10:+10), 9))

    100 * tail(df1$tide, -3) / head(df1$water, -3)

I hope it helps.

Best,
Dimitris

On 10/26/2011 12:02 PM, Patrick Hausmann wrote:
> Dear R users,
>
> This may be very simple, but it is proving difficult for me. I'd like to calculate the difference in percent between two measures. My data looks like this:
>
>     set.seed(123)
>     df1 <- data.frame(measure = rep(c("A1", "A2", "A3"), each=3),
>                       water = sample(c(100:200), 9),
>                       tide = sample(c(-10:+10), 9))
>     df1
>
>     # What I want to calculate is:
>     # tide_[A2] / water_[A1],
>     # tide_[A3] / water_[A2]
>
>     # This 'works' for the example, but I am
>     # looking for a more general solution.
>     df1$tide_diff <- ave(df1$tide, FUN=function(L) L / c(NA, NA, NA, df1$water)) * 100
>     df1
>
> Thanks for any help!
> Patrick

--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center
Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/
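The tail()/head() idea above can be extended so that the result fits back into the data frame as a column. This is a sketch under an assumption not stated in the thread: the rows are ordered by measure and every measure has the same number of rows (lag_n, a name introduced here).

```r
set.seed(123)
df1 <- data.frame(measure = rep(c("A1", "A2", "A3"), each = 3),
                  water = sample(100:200, 9),
                  tide = sample(-10:10, 9))

# Rows per measure; assumes equal-sized groups in order (hypothetical name)
lag_n <- 3

# water from the previous measure, padded with NA for the first measure
prev_water <- c(rep(NA, lag_n), head(df1$water, -lag_n))

# tide of each measure as a percentage of the previous measure's water
df1$tide_diff <- 100 * df1$tide / prev_water

print(df1)
```

Padding with NA keeps the vector the same length as the data frame, so the first measure, which has no predecessor, simply gets NA.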
Re: [R] merging two dataframes
I think you want something like this (I like to be explicit about what you are merging):

    df3 = merge(df1, df2, by = "date", all = T)

You can also be explicit about what you are merging on in each data frame:

    df3 = merge(df1, df2, by.x = "date", by.y = "date", all = T)

You were trying to merge on "date1", but it looks to me like your data frames actually contain columns called "date", not "date1".

As Petr says, in the vanilla situation where there is no overlap of data and the ID column has the same name in both frames, merge(frame1, frame2) works by itself.

Tip: don't use words like "data" as variable names, as that is also a function.

On 26 Oct 2011, at 11:59 AM, dividend wrote:

> Hello.
>
> I have now tried to do what you told me. I used the str() function, and data$date1 and data3$date1 were both listed as character. I changed name to character, but it did not work either. I also changed all variables to character, with no positive result.
>
>     > str(data)
>     'data.frame': 14446 obs. of 15 variables:
>      $ id     : chr "1" "1" "1" "1" ...
>      $ compid : chr "2514" "2514" "2514" "2514" ...
>      $ secid  : chr "15856" "15856" "15856" "15856" ...
>      $ name   : chr "A-pressen" "A-pressen" "A-pressen" "A-pressen" ...
>      $ period : chr "1" "2" "3" "4" ...
>      $ date   : chr "17.05.1980" "17.05.1981" "17.05.1982" "17.05.1983" ...
>      $ enddate: chr "17.05.1981" "17.05.1982" "17.05.1983" "17.05.1984" ...
>      $ div    : chr NA NA NA NA ...
>      $ ndivs  : chr NA NA NA NA ...
>      $ posdiv : chr NA NA NA NA ...
>      $ ddiv2  : chr NA NA NA NA ...
>      $ ddiv3  : chr NA NA NA NA ...
>      $ ddiv4  : chr NA NA NA NA ...
>      $ ddiv5  : chr NA NA NA NA ...
>      $ ddiv6  : chr NA NA NA NA ...
>
>     > str(data3)
>     'data.frame': 812354 obs. of 9 variables:
>      $ date                  : chr "02.01.1996" "03.01.1996" "04.01.1996" "05.01.1996" ...
>      $ Securityid            : chr "6001" "6001" "6001" "6001" ...
>      $ Symbol                : chr "AAV" "AAV" "AAV" "AAV" ...
>      $ name                  : chr "Adresseavisen" "Adresseavisen" "Adresseavisen" "Adresseavisen" ...
>      $ Securitytype          : chr "Ordinary Shares" "Ordinary Shares" "Ordinary Shares" "Ordinary Shares" ...
>      $ Unadjusted            : chr "200" "200" "200" "200" ...
>      $ Event.adjusted        : chr "200" "200" "200" "200" ...
$ Div.and.Event.adjusted: chr 109,7595375 109,7595375 109,7595375 109,7595375 ... $ Sharesissued : chr 1901646 1901646 1901646 1901646 ... Here is some suitable data for data dput(data[1:20,]) structure(list(id = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), compid = c(2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514), secid = c(15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856), name = c(A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen), period = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20), date = c(17.05.1980, 17.05.1981, 17.05.1982, 17.05.1983, 17.05.1984, 17.05.1985, 17.05.1986, 17.05.1987, 17.05.1988, 17.05.1989, 17.05.1990, 17.05.1991, 17.05.1992, 17.05.1993, 17.05.1994, 17.05.1995, 17.05.1996, 17.05.1997, 17.05.1998, 17.05.1999), enddate = c(17.05.1981, 17.05.1982, 17.05.1983, 17.05.1984, 17.05.1985, 17.05.1986, 17.05.1987, 17.05.1988, 17.05.1989, 17.05.1990, 17.05.1991, 17.05.1992, 17.05.1993, 17.05.1994, 17.05.1995, 17.05.1996, 17.05.1997, 17.05.1998, 17.05.1999, 17.05.2000), div = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 0, 0, 5, 0, 1.1, 1.2, 1, 0 ), ndivs = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0), posdiv = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, 1, 1, 1, NA), ddiv2 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 0, 0, NA, 0, 1, NA, NA), ddiv3 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 0, 0, 0, 0, 0, -1), ddiv4 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 0, 0, 0, 0, 0), ddiv5 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 0, 0, 0, 0), ddiv6 = c(NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 0, 0, 0)), .Names = c(id, compid, secid, name, period, date, enddate, div, ndivs, posdiv, ddiv2, ddiv3, ddiv4, ddiv5, ddiv6 ), row.names = c(NA, 20L), class = data.frame) Here is some suitable data for data3: dput(data3[1:20,]) structure(list(date = c(02.01.1996, 03.01.1996, 04.01.1996, 05.01.1996, 08.01.1996, 09.01.1996, 10.01.1996, 11.01.1996, 12.01.1996, 15.01.1996, 16.01.1996, 17.01.1996, 18.01.1996, 19.01.1996, 22.01.1996, 23.01.1996, 24.01.1996, 25.01.1996, 26.01.1996, 29.01.1996), Securityid = c(6001, 6001, 6001, 6001, 6001, 6001, 6001, 6001, 6001, 6001, 6001, 6001, 6001, 6001,
Re: [R] Random Forest Classification
Explore the ModelMap package. It might offer some useful tools for your application.

Steve Friedman Ph. D.
Ecologist / Spatial Statistical Analyst
Everglades and Dry Tortugas National Park
950 N Krome Ave (3rd Floor)
Homestead, Florida 33034
steve_fried...@nps.gov
Office (305) 224 - 4282
Fax (305) 224 - 4147

From: Mohammed Rashad mohammedrashadkm@gmail.com
Sent by: r-help-bounces@r-project.org
To: r-help@r-project.org
Date: 10/26/2011 02:50 AM
Subject: [R] Random Forest Classification

Hi All,

I want to do Random Forest classification. I installed R and the randomForest package for R, but I don't know how to use it. Is there any open-source remote sensing application that does RF classification on satellite images? Does anyone have a random forest classification example? An example in any language or package is fine. Has anyone done it in R? If yes, how? I googled RF classification, but most of the results are for medical and disease research, not for remote sensing.

--
Regards,
Mohammed Rashad K M
M.S. (By Research) student
Lab for Spatial Informatics
Department of CSE
International Institute of Information Technology
Hyderabad, India

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] [BioC] comparing two tables
Hi David,

your function works just fine if I take only the region into account. Unfortunately, it does not consider the first column, the chromosome. There can be an overlap between the two tables only if the regions are on the same chromosome, which is why the first column of both tables is a prerequisite for the analysis. I tried to add a second argument to take this into account, but so far without success. If you have any ideas I will be grateful.

Thanks
Assa

(I sent this only to r-help, as it is basically an R question and not specific to Bioconductor, although I still think it has something to do with BioC, as it deals with chromosome regions. But anyway, I think you were right about it.)

On Tue, Oct 25, 2011 at 18:01, David Winsemius dwinsem...@comcast.net wrote:

On Oct 25, 2011, at 10:40 AM, Assa Yeroslaviz wrote:

Hi all,
@Martin - thanks for the help, it works very well.
@David - sorry for the misunderstanding. I will see to it that it won't happen again. BTW, unfortunately your function is not working. It is partially my error, as I gave no regions with overlaps, but even after changing them it just doesn't fit.

Here is the new data with an overlap in the third gene:

genetable <- rd.txt("name chr start end str accession Length
gen1 4 646752 646838 + MI0005806 86
gen12 2L 243035 243141 - MI0005821 106
gen3 2L 159838 159928 + MI0005813 90
gen7 2L 1831685 1831799 - MI0011290 114
gen4 2L 2737568 2737661 + MI0017696 93")

loctable <- rd.txt("Chr Start End length
4 136532 138654 2122
3 139870 141970 2100
2L 157838 160440 2602
X 160834 162966 2132
4 204040 208536 4496")

But I still get:

apply(genetable, 1, function(x) inregion(x, loctable[, c("Start", "End")]))
[1] FALSE FALSE FALSE FALSE FALSE

You just want to pass the start and end columns of genetable:

# Helper function
inregion <- function(vec, locs) {
  any(apply(locs, 1, function(x) vec["start"] > x[1] & vec["end"] <= x[2]))
}

# Test the function
inregion(genetable[2, ], loctable[, c("Start", "End")])
[1] FALSE

apply(genetable[, 3:4], 1, function(x) inregion(x, loctable[, c("Start", "End")]))
[1] FALSE FALSE TRUE FALSE FALSE

(I really wish that you would stop crossposting. I am only following your bad practice because you posted my code on BioC.)

--
David

For the single queries I get TRUE:

inregion(genetable[3, ], loctable[, c("Start", "End")])
[1] TRUE

Do you have any idea how I can fix this problem? Thanks, and again sorry for the trouble.
Assa

On Tue, Oct 25, 2011 at 15:48, Martin Morgan mtmor...@fhcrc.org wrote:

On 10/25/2011 03:42 AM, Assa Yeroslaviz wrote:

Hi everybody, I would like to know whether it is possible to compare two tables for certain parameters. I have these two tables:

gene table
name   chr  start    end      str  accession  Length
gen1   4    646752   646838   +    MI0005806  86
gen12  2L   243035   243141   -    MI0005821  106
gen3   2L   159838   159928   +    MI0005813  90
gen7   2L   1831685  1831799  -    MI0011290  114
gen4   2L   2737568  2737661  +    MI0017696  93
...

localization table:
Chr  Start   End     length
4    136532  138654  2122
3    139870  141970  2100
2L   157838  158440  602
X    160834  162966  2132
4    204040  208536  4496
...

I would like to check whether a specific gene lies within a certain region. For example, I want to see whether gene 3 on chromosome 2L lies within the region given in the second table.

Hi Assa -- In Bioconductor, use the GenomicRanges package. Create two GRanges objects

genes = with(genetable, GRanges(chr, IRanges(start, end), str, accession=accession, Length=Length))
locations = with(locationtable, GRanges(Chr, IRanges(Start, End)))

then

olaps = findOverlaps(genes, locations)

queryHits(olaps) and subjectHits(olaps) index each gene with all locations it overlaps. The definition of 'overlap' is flexible, see ?findOverlaps.

Martin

What I would like to do is:
1. check if the gene lies on a specific chromosome
   1.a if no - go to the next line
   1.b if yes - go to 2
2. check if the start position of the gene is bigger than the start position in the localization table AND smaller than the end position (i.e. it lies between the start and end positions in the localization table)
   2.a if no - go to the next gene
   2.b if yes - give it to me.

I was having difficulty doing this without running into three nested conditional (if) loops. I would appreciate any help.

Thanks
Assa
Re: [R] RGtk2 problems
The gain from updating will be that RGtk2 now looks in a specific (internal) place for the libraries, so you should no longer need to worry about library conflicts and PATH settings. In theory.

Michael

On Tue, Oct 25, 2011 at 5:46 PM, Aref arefnamm...@gmail.com wrote:

Thank you for the response, and I am sorry about the HTML; I will remember next time. The version of RGtk2 installed is 2.20.8. I installed it through R from the CRAN repository. I believe the problem is that during the installation the environment variable GTK_BASEPATH was set to a location other than where GTK+ was installed (it was overwritten by the R installation process). I found this after I had fixed the issue by copying the libraries into R\bin. This is probably not the best solution, but it works. I will be updating R to 2.14 soon, when it comes out, and hopefully things will work better now that the environment variable points to the right place for the GTK+ libraries.

On Oct 24, 12:12 am, Prof Brian Ripley rip...@stats.ox.ac.uk wrote:

Please update your R (and probably your RGtk2: you did not tell us its version), as the posting guide asked you to do before posting.

On Sun, 23 Oct 2011, Aref Nammari wrote:

Hello, I hope this is the right place to ask for help with a problem I am having with the RGtk2 installation with R on Windows XP. I am running R 2.11.1 and have installed the package RGtk2 from CRAN.

As a binary package, I guess, but please tell us (it matters).

I also have GTK 2.10.11 installed as well as GTK2-runtime 2.22.0. I have added the environment variable GTK_PATH and set its value to the root location where GTK is installed.

But you need the Gtk+ bin directory in your PATH. The environment variable GTK_PATH is only needed when RGtk2 is installed from the sources. Which Gtk+ you need in your path depends on the version of RGtk2 you have and how you installed it. For current binary versions, see http://cran.r-project.org/bin/windows/contrib/2.13/@ReadMe

When I try to run RGtk2 in R by typing library(RGtk2), a popup dialog appears with the following error message:

The procedure entry point gdk_app_launch_context_get_type could not be located in the dynamic link library libgdk-win32-2.0-0.dll

In the R window I get:

Error in inDL(x, as.logical(local), as.logical(now), ...) :
  unable to load shared library 'C:/PROGRA~1/R/R-211~1.1/library/RGtk2/libs/RGtk2.dll':
  LoadLibrary failure: The specified procedure could not be found.
Failed to load RGtk2 dynamic library, attempting to install it.
Error : .onLoad failed in loadNamespace() for 'RGtk2', details:
  call: install_all()
  error: This platform is not yet supported by the automatic installer. Please install GTK+ manually, if necessary. See: http://www.gtk.org
Error: package/namespace load failed for 'RGtk2'

Any help in figuring out what could be the problem is greatly appreciated.

Cheers,

Please do as the posting guide asked of you and not send HTML.

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] dotPlot with diagonal
Let's see: there is a dotPlot() function in each of the following packages: BHH2, caret, mosaic, qualityTools. Would you be kind enough to share which of these packages (if any) you are using?

Dennis

On Wed, Oct 26, 2011 at 4:25 AM, Jörg Reuter jo...@reuter.at wrote:

Hi, I want to draw a dotPlot. All works fine:

(Seq <- matrix(c(1, 1, 6, 1, 2, 2, 5, 4, 3, 3, 4, 3, 4, 4, 3, 2, 5, 5, 2, 5, 6, 6, 1, 6), ncol = 6))
dotPlot(Seq[1,], Seq[2,], main = "Sequenz 1 und Sequenz 2", asp = 1)

Is there a way to draw a small diagonal from (0/0) to (6/6) (perhaps in red?), or must I use GIMP? I have many dotPlots, so it would be good if R could do this.

Thanks
Joerg
Re: [R] lock a package to specific R version
-----Original Message-----
From: Prof Brian Ripley [mailto:rip...@stats.ox.ac.uk]
Sent: 26 October 2011 10:12
To: Uwe Ligges
Cc: Mehmet Suzen; r-help@r-project.org
Subject: Re: [R] lock a package to specific R version

On Wed, 26 Oct 2011, Uwe Ligges wrote:

On 25.10.2011 11:42, Mehmet Suzen wrote:

Hi, I was wondering if it is possible to lock a package to a specific version of R. The dependency attribute in the package DESCRIPTION only accepts >=, AFAIU.

Depends: R (>= 2.13.2), R (<= 2.13.2)

Or even use ==

Dear Professor Ripley,

Thank you for the reply. We maintain internal R packages and build binaries for different versions of base R, ranging from 2.8.x to 2.13.x. We need to prevent users from using any versions other than the ones we have tested (we distribute binaries only, and the package source base is evolving as well). I am not sure how to address this. Initially I was thinking of putting the R version in the package version, but the package version in the DESCRIPTION file only allows the x.x.x format, which doesn't leave room for it.

I don't see why you would want to do this: why would a package work with 2.13.1 and not 2.13.2, or 2.13.2 and not 2.14.0? Ranges may make sense.

Ranges would be much more sensible than ==.

Best Regards,
Mehmet
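A version range like the one discussed in this thread can also be checked at load time, in addition to declaring it in Depends. The following is only a sketch: the helper name check_r_range() and the bounds 2.8.0 and 2.13.2 are assumptions chosen for illustration, not something proposed by any of the posters.

```r
# Hypothetical helper (not from the thread): TRUE if `current` lies
# within the closed range [min_version, max_version].
check_r_range <- function(min_version, max_version,
                          current = getRversion()) {
  current <- as.numeric_version(current)
  current >= as.numeric_version(min_version) &&
    current <= as.numeric_version(max_version)
}

# Illustrative use in a package's .onLoad(); the bounds are assumptions.
.onLoad <- function(libname, pkgname) {
  if (!check_r_range("2.8.0", "2.13.2")) {
    stop("this build was only tested with R 2.8.0 to 2.13.2; ",
         "you are running R ", as.character(getRversion()))
  }
}

check_r_range("2.8.0", "2.13.2", current = "2.13.1")
check_r_range("2.8.0", "2.13.2", current = "2.14.0")
```

Comparing numeric_version objects rather than strings matters here, since a plain string comparison would order "2.9.0" after "2.13.0".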
Re: [R] dotPlot with diagonal
Oh, sorry.

library(lattice)
(Seq <- matrix(c(1, 1, 6, 1, 2, 2, 5, 4, 3, 3, 4, 3, 4, 4, 3, 2, 5, 5, 2, 5, 6, 6, 1, 6), ncol = 6))
dotPlot(Seq[1,], Seq[2,], main = "Sequenz 1 und Sequenz 2", asp = 1)

Is there a way to draw a small diagonal from (0/0) to (6/6) (perhaps in red?), or must I use GIMP? I have many dotPlots, so it would be good if R could do this.

2011/10/26 Dennis Murphy djmu...@gmail.com:

Let's see: there is a dotPlot() function in each of the following packages: BHH2, caret, mosaic, qualityTools. Would you be kind enough to share which of these packages (if any) you are using?

Dennis
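The thread never establishes which package this dotPlot() comes from, so the following is only a sketch under an assumption: if the function draws with base graphics, a call to abline(0, 1, col = "red") right after it should add the diagonal. The stand-in below uses only base R (outer() and image() in place of dotPlot()) so that it runs without any extra package.

```r
# Same matrix as in the question; rows 1 and 2 are the two sequences.
Seq <- matrix(c(1, 1, 6, 1, 2, 2, 5, 4, 3, 3, 4, 3,
                4, 4, 3, 2, 5, 5, 2, 5, 6, 6, 1, 6), ncol = 6)
s1 <- Seq[1, ]
s2 <- Seq[2, ]

# Stand-in for dotPlot(): mark every position pair where the sequences match.
hits <- outer(s1, s2, "==")

pdf(NULL)  # null device so the sketch runs non-interactively; omit to see the plot
image(x = seq_along(s1), y = seq_along(s2), z = hits * 1,
      col = c("white", "black"), asp = 1,
      xlab = "Sequence 1", ylab = "Sequence 2",
      main = "Sequenz 1 und Sequenz 2")
abline(a = 0, b = 1, col = "red")  # line y = x, through (0, 0) and (6, 6)
dev.off()
```

If the plotting function is grid-based instead (as lattice functions are), abline() will not work and a panel function would be needed.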
Re: [R] [BioC] comparing two tables
Hi,

On Wed, Oct 26, 2011 at 8:17 AM, Assa Yeroslaviz fry...@gmail.com wrote:

Hi David, your function works just fine if I take only the region into account. Unfortunately, it does not consider the first column, the chromosome. There can be an overlap between the two tables only if the regions are on the same chromosome, which is why the first column of both tables is a prerequisite for the analysis. I tried to add a second argument to take this into account, but so far without success.

Well, Bioconductor has packages to deal with this type of data, and this type of query (overlaps), very efficiently. Martin Morgan sent you an email earlier explaining how you can use the GenomicRanges package to get what you're after ... I (highly) suggest you go that route.

HTH,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
Re: [R] merging two dataframes
Hello. Now I tried to do what you told me. I used the str() function, and data$date1 and data3$date1 were both listed

You have no date1, only date. Therefore

result <- merge(data, data3, by = c("date", "name"), all = TRUE)

takes all values from both data frames.

dim(data)
[1] 20 15
dim(data3)
[1] 20 9

Altogether that is 24 columns, of which 4 are the date and name columns, so 20 columns contain data.

dim(result)
[1] 40 22

So the result has all 20 data columns from both data frames plus one name and one date column, and all rows from both data frames = 40. The two sets are disjoint. If you had some common date and name values in both data frames, those rows would be merged into the same row of the result. Let us try this.

data3$name[1:5] <- data$name[1:5]
data3$date[3:5] <- data$date[3:5]
result <- merge(data, data3, by = c("date", "name"), all = TRUE)
dim(result)
[1] 37 22

Regards
Petr

character. I changed name to character, but it did not work either. I also changed all variables to character, with no positive result.

str(data)
'data.frame': 14446 obs. of 15 variables:
 $ id : chr 1 1 1 1 ...
 $ compid : chr 2514 2514 2514 2514 ...
 $ secid : chr 15856 15856 15856 15856 ...
 $ name : chr A-pressen A-pressen A-pressen A-pressen ...
 $ period : chr 1 2 3 4 ...
 $ date : chr 17.05.1980 17.05.1981 17.05.1982 17.05.1983 ...
 $ enddate: chr 17.05.1981 17.05.1982 17.05.1983 17.05.1984 ...
 $ div: chr NA NA NA NA ...
 $ ndivs : chr NA NA NA NA ...
 $ posdiv : chr NA NA NA NA ...
 $ ddiv2 : chr NA NA NA NA ...
 $ ddiv3 : chr NA NA NA NA ...
 $ ddiv4 : chr NA NA NA NA ...
 $ ddiv5 : chr NA NA NA NA ...
 $ ddiv6 : chr NA NA NA NA ...

str(data3)
'data.frame': 812354 obs. of 9 variables:
 $ date : chr 02.01.1996 03.01.1996 04.01.1996 05.01.1996 ...
 $ Securityid: chr 6001 6001 6001 6001 ...
 $ Symbol: chr AAV AAV AAV AAV ...
 $ name : chr Adresseavisen Adresseavisen Adresseavisen Adresseavisen ...
 $ Securitytype : chr Ordinary Shares Ordinary Shares Ordinary Shares Ordinary Shares ...
 $ Unadjusted: chr 200 200 200 200 ...
$ Event.adjusted: chr 200 200 200 200 ... $ Div.and.Event.adjusted: chr 109,7595375 109,7595375 109,7595375 109,7595375 ... $ Sharesissued : chr 1901646 1901646 1901646 1901646 ... Here is some suitable data for data dput(data[1:20,]) structure(list(id = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), compid = c(2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514), secid = c(15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856, 15856), name = c(A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, A-pressen), period = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20), date = c(17.05.1980, 17.05.1981, 17.05.1982, 17.05.1983, 17.05.1984, 17.05.1985, 17.05.1986, 17.05.1987, 17.05.1988, 17.05.1989, 17.05.1990, 17.05.1991, 17.05.1992, 17.05.1993, 17.05.1994, 17.05.1995, 17.05.1996, 17.05.1997, 17.05.1998, 17.05.1999), enddate = c(17.05.1981, 17.05.1982, 17.05.1983, 17.05.1984, 17.05.1985, 17.05.1986, 17.05.1987, 17.05.1988, 17.05.1989, 17.05.1990, 17.05.1991, 17.05.1992, 17.05.1993, 17.05.1994, 17.05.1995, 17.05.1996, 17.05.1997, 17.05.1998, 17.05.1999, 17.05.2000), div = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 0, 0, 5, 0, 1.1, 1.2, 1, 0 ), ndivs = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0), posdiv = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, 1, 1, 1, NA), ddiv2 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 0, 0, NA, 0, 1, NA, NA), ddiv3 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 0, 0, 0, 0, 0, -1), ddiv4 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 0, 0, 0, 0, 0), ddiv5 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 0, 0, 0, 
0), ddiv6 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 0, 0, 0)), .Names = c(id, compid, secid, name, period, date, enddate, div, ndivs, posdiv, ddiv2, ddiv3, ddiv4, ddiv5, ddiv6 ), row.names = c(NA, 20L), class = data.frame) Here is some suitable data for data3: dput(data3[1:20,]) structure(list(date = c(02.01.1996, 03.01.1996, 04.01.1996, 05.01.1996, 08.01.1996, 09.01.1996, 10.01.1996, 11.01.1996, 12.01.1996, 15.01.1996, 16.01.1996, 17.01.1996, 18.01.1996, 19.01.1996, 22.01.1996, 23.01.1996,
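The row and column arithmetic Petr describes can be reproduced on a much smaller case. This is a sketch with made-up frames a and b (not the poster's data), merged on the same two key columns, date and name.

```r
# Two small made-up frames (illustrative data, not the poster's).
a <- data.frame(date = c("01.01.1996", "02.01.1996", "03.01.1996"),
                name = c("X", "X", "X"),
                div  = c(0, 1, 2),
                stringsAsFactors = FALSE)
b <- data.frame(date  = c("03.01.1996", "04.01.1996"),
                name  = c("X", "X"),
                price = c(200, 201),
                stringsAsFactors = FALSE)

# all = TRUE keeps unmatched rows from both frames, filling the gaps with NA.
full  <- merge(a, b, by = c("date", "name"), all = TRUE)
# The default keeps only rows whose key combination occurs in both frames.
inner <- merge(a, b, by = c("date", "name"))

dim(full)   # 4 rows (3 + 2 minus 1 shared key), 4 columns (3 + 3 minus 2 keys)
dim(inner)  # 1 row, 4 columns
```

Comparing nrow() of the inner and the full merge is a quick way to see how many key combinations the two frames actually share.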
Re: [R] [BioC] comparing two tables
Thanks Steve,

I already did it and it went perfectly well. I was just trying to understand the functions David wrote, so that I can maybe use them for other queries. Unfortunately, I wasn't able to add a condition for the third parameter that needs to be compared. I would still love to know whether there is a way of adding such a parameter. I tried to do it with a third condition in this line:

any(apply(locs, 1, function(x) {vec["start"] > x[2] & vec["start"] <= x[3] & as.character(vec["chr"]) == as.character(x["chr"])}))

but it doesn't seem to work at all.

Thanks for the help anyway
Assa

On Wed, Oct 26, 2011 at 15:33, Steve Lianoglou mailinglist.honey...@gmail.com wrote:

Hi,

On Wed, Oct 26, 2011 at 8:17 AM, Assa Yeroslaviz fry...@gmail.com wrote:

Hi David, your function works just fine if I take only the region into account. Unfortunately, it does not consider the first column, the chromosome. There can be an overlap between the two tables only if the regions are on the same chromosome, which is why the first column of both tables is a prerequisite for the analysis. I tried to add a second argument to take this into account, but so far without success.

Well, Bioconductor has packages to deal with this type of data, and this type of query (overlaps), very efficiently. Martin Morgan sent you an email earlier explaining how you can use the GenomicRanges package to get what you're after ... I (highly) suggest you go that route.

HTH,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
[R] Error in summary.mlm: formula not subsettable
When I fit a multivariate linear model and the formula is defined outside the call to lm(), the method summary.mlm() fails. This works well:

y <- matrix(rnorm(20), nrow=10)
x <- matrix(rnorm(10))
mod1 <- lm(y~x)
summary(mod1)
...

But this does not:

f <- y~x
mod2 <- lm(f)
summary(mod2)
Error in object$call$formula[[2L]] <- object$terms[[2L]] <- as.name(ynames[i]) :
  object of type 'symbol' is not subsettable

I would say that the problem lies in the following difference:

class(mod1$call$formula)
[1] "call"
class(mod2$call$formula)
[1] "name"

As far as I understand, summary.mlm() creates a list of .lm objects from the individual columns of the matrices in the .mlm object, and then tries to change the second element of object$call$formula so that the name of the corresponding column is shown as the response variable. But if the formula has been defined outside the call to lm(), object$call$formula is just a symbol and cannot be modified that way. A bug, perhaps?

sessionInfo()
R version 2.13.0 (2011-04-13)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=Spanish_Spain.1252 LC_CTYPE=Spanish_Spain.1252
[3] LC_MONETARY=Spanish_Spain.1252 LC_NUMERIC=C
[5] LC_TIME=Spanish_Spain.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

--
Helios de Rosario Martínez
Researcher
[R] Power mixed model ordinal logistic regression
Is there a package that will perform power calculations for mixed model ordinal logistic regression? I searched and came up with nothing.
Re: [R] Error in summary.mlm: formula not subsettable
On 26/10/2011 9:48 AM, Helios de Rosario wrote:

When I fit a multivariate linear model and the formula is defined outside the call to lm(), the method summary.mlm() fails. This works well:

y <- matrix(rnorm(20), nrow=10)
x <- matrix(rnorm(10))
mod1 <- lm(y~x)
summary(mod1)
...

But this does not:

f <- y~x
mod2 <- lm(f)
summary(mod2)
Error in object$call$formula[[2L]] <- object$terms[[2L]] <- as.name(ynames[i]) :
  object of type 'symbol' is not subsettable

I would say that the problem lies in the following difference:

class(mod1$call$formula)
[1] "call"
class(mod2$call$formula)
[1] "name"

As far as I understand, summary.mlm() creates a list of .lm objects from the individual columns of the matrices in the .mlm object, and then tries to change the second element of object$call$formula so that the name of the corresponding column is shown as the response variable. But if the formula has been defined outside the call to lm(), object$call$formula is just a symbol and cannot be modified that way. A bug, perhaps?

Yes, I'd say it's a bug. summary.lm handles this situation fine, but summary.mlm does not. I'll take a look...

Duncan Murdoch
Re: [R] Error in summary.mlm: formula not subsettable
On 26/10/2011 9:48 AM, Helios de Rosario wrote:

When I fit a multivariate linear model and the formula is defined outside the call to lm(), the method summary.mlm() fails. This works well:

y <- matrix(rnorm(20), nrow=10)
x <- matrix(rnorm(10))
mod1 <- lm(y~x)
summary(mod1)
...

But this does not:

f <- y~x
mod2 <- lm(f)
summary(mod2)
Error in object$call$formula[[2L]] <- object$terms[[2L]] <- as.name(ynames[i]) :
  object of type 'symbol' is not subsettable

I would say that the problem lies in the following difference:

class(mod1$call$formula)
[1] "call"
class(mod2$call$formula)
[1] "name"

As far as I understand, summary.mlm() creates a list of .lm objects from the individual columns of the matrices in the .mlm object, and then tries to change the second element of object$call$formula so that the name of the corresponding column is shown as the response variable. But if the formula has been defined outside the call to lm(), object$call$formula is just a symbol and cannot be modified that way. A bug, perhaps?

Yes, it was a bug. A simple workaround is the following:

mod2$call$formula <- formula(mod2)

I'll add that to summary.mlm, but in the meantime, you can just do it yourself.

Duncan Murdoch
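Put together, the workaround from this thread looks like the sketch below. The seed is arbitrary and only there for reproducibility; on R versions where summary.mlm() already contains the fix, the extra assignment should be harmless rather than necessary.

```r
set.seed(1)                          # arbitrary seed, only for reproducibility
y <- matrix(rnorm(20), nrow = 10)    # two response columns
x <- matrix(rnorm(10))

f <- y ~ x
mod2 <- lm(f)
class(mod2$call$formula)             # "name": the call stores the symbol f

# Workaround from the thread: store the formula itself in the call,
# so that its left-hand side can be replaced per response column.
mod2$call$formula <- formula(mod2)

s <- summary(mod2)                   # one summary per response column
length(s)
```

An alternative that avoids the issue altogether is to write the formula inside the call, as in lm(y ~ x), which is the case the original post reports as working.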
Re: [R] merging two dataframes
I pasted the wrong function; I have changed date1 to date (ignore that). I think there must be something wrong with my data format. I can't understand why it doesn't work. I know I can use by.x= and by.y=, but since both datasets have the same variable name, it should be unnecessary to do that.
Re: [R] Adding rows to a table with a loop
Thanks for the response and the advice; glmulti looks like it could be quite a good alternative. As for the problem of adding to the results table from within the loop, this webpage answered a number of my questions:

http://ryouready.wordpress.com/2009/01/23/r-combining-vectors-or-data-frames-of-unequal-length-into-one-data-frame/
Re: [R] New column of data filled with the larger value from 2 columns
data$C <- pmax(data$A, data$B)

worked perfectly, thank you very much.
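For readers who have not seen the original question, here is a small sketch of what pmax() does in this situation. The data frame is made up, and it is called dat rather than data only to avoid masking the function of that name.

```r
# Made-up data; the column names A, B and C follow the message above.
dat <- data.frame(A = c(1, 5, NA, 4),
                  B = c(3, 2, 7, NA))

# Element-wise ("parallel") maximum; NA wherever either input is NA.
dat$C <- pmax(dat$A, dat$B)

# With na.rm = TRUE a missing value is ignored if the other value exists.
dat$C_narm <- pmax(dat$A, dat$B, na.rm = TRUE)

dat
```

Note that max(dat$A, dat$B) would instead return a single number, the overall maximum of both columns.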
Re: [R] merging two dataframes
I pasted the wrong function; I have changed date1 to date (ignore that). I think there must be something wrong with my data format. I can't understand why it doesn't work. I know I can use by.x= and by.y=, but since both datasets have the same variable name, it should be unnecessary to do that.

Again, check the dimensions of all three data frames. With all=TRUE, the number of rows in the final data frame should be at least the number of rows in the bigger data frame and no more than the sum of the rows of both merged data frames (provided the key combinations are unique within each frame).

Regards
Petr
Re: [R] Building package/DESCRIPTION file not existing?
Thanks to both of you. Indeed, it was a language issue. I eventually noticed a check warning stating that the DESCRIPTION file had non-ASCII characters and an unknown encoding, even though no special characters were in the file. After reading various messages on mailing lists, I added "Encoding: latin1" and it worked.

Then, when installing the package tarball with install.packages, the é of "Propriétaire" in the library directory path was changed to ii . So I used its DOS equivalent, Program~1, and that also worked.

I haven't noticed any warning about using non-English Windows in the Writing R Extensions manual, but I may have missed it. Anyway, I realize now that using non-English Windows is probably a really bad idea in general.

Cheers,
Francois Rousseu

From: maech...@stat.math.ethz.ch
Date: Wed, 26 Oct 2011 10:37:30 +0200
To: francoisrous...@hotmail.com
CC: r-help@r-project.org
Subject: Re: [R] Building package/DESCRIPTION file not existing?

Francois Rousseu francoisrous...@hotmail.com on Mon, 24 Oct 2011 20:10:27 -0400 writes:

> Hello useRs
> I am trying to build a package for personal use and to make working with other people easier, but I keep getting the same error message about the DESCRIPTION file not existing.
>
> When trying to install from a source tar.gz file:
>   Error in .read_description(dfile) : file 'C:/Users/Propriétaire/AppData/Local/Temp/RtmpHFMONb/R.INSTALL647a3535/mypkg/DESCRIPTION' does not exist
>
> When trying to build a binary version:
>   Error in .read_description(dfile) : file 'C:/Users/Propriétaire/Documents/RETROBIRD/mypkg/DESCRIPTION' does not exist
>
> In this last case, the DESCRIPTION file is certainly there! Also, the help and DESCRIPTION files are edited, and my path variable seems to be set correctly, as I can access R and tex (from MiKTeX 2.9) from the console. I feel it might be related to language issues (Windows on my system is in French, see sessionInfo() at bottom of message) or something about temporary directories, but I really can't find the problem. I've looked into the cygwin warning, but it didn't seem to be the problem, though I may be wrong.

Yes, I'm almost sure it's the language issue. I recently taught a course on R package building, and on Windows a user had problems because of an 'ä' (a-Umlaut) in one of the directories in her 'path'. So if you work from another place than 'C:/Users/Propriétaire/', this may solve the main problem.

Bonnes salutations,
Martin Maechler, ETH Zurich

> Any hints? Below is the complete sequence with errors.
> Thanks,
> Francois Rousseu
> [...]
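For anyone hitting the same check warning, here is a minimal sketch of how one might look for non-ASCII characters in the lines of a DESCRIPTION file. The field values below are hypothetical (they are not from Francois's package), and the idiom simply tests whether each line survives a conversion to ASCII.

```r
# Hypothetical DESCRIPTION content; only the Author line contains a non-ASCII character.
desc_lines <- c(
  "Package: mypkg",
  "Version: 0.1",
  "Title: Demo package",
  "Author: Fran\u00e7ois Example",
  "Encoding: UTF-8"
)

# A line counts as non-ASCII if conversion to ASCII fails (NA) or alters it.
asc <- iconv(desc_lines, from = "UTF-8", to = "ASCII")
non_ascii <- is.na(asc) | asc != desc_lines

# Does the file declare an encoding at all?
has_encoding <- any(grepl("^Encoding:", desc_lines))

print(which(non_ascii))
print(has_encoding)
```

For a file on disk, tools::showNonASCIIfile() reports the offending lines directly. Note that this only covers the contents of DESCRIPTION; a non-ASCII character in the directory path, as Martin describes, is a separate problem.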
Re: [R] merging two dataframes
So when I do the merge on your example frames, I get the expected result. But the example component data frames you sent are already full of NAs, and there are no rows that are present in both data sets. So I think, perhaps, that merge is just highlighting a problem that has its roots in your component data.

t

On 26 Oct 2011, at 1:16 PM, dividend wrote:
> I pasted the wrong function; I have changed from date1 to date (ignore that). I think there must be something wrong with my data format. I can't understand why it doesn't work. I know I can use by.x= and by.y=, but since both datasets have the same variable name, it should be unnecessary to do that.
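To illustrate the point about rows not being present in both data sets, here is a small sketch with two hypothetical frames (the names a, b, x and y are made up; only the shared "date" column mirrors the thread). merge() picks the common column name on its own, so by.x/by.y are indeed unnecessary, but an inner merge is empty when no key values match.

```r
# Two hypothetical frames sharing a "date" column but no common dates.
a <- data.frame(date = as.Date("2011-10-01") + 0:2, x = c(1, NA, 3))
b <- data.frame(date = as.Date("2011-11-01") + 0:2, y = c(NA, 5, 6))

# How many keys of a also occur in b? Worth checking before merging.
n_shared <- sum(a$date %in% b$date)

inner <- merge(a, b)              # by defaults to the common column, "date"
outer <- merge(a, b, all = TRUE)  # keeps unmatched rows, padding with NA

print(n_shared)
print(nrow(inner))
print(outer)
```

If n_shared is zero although the dates look the same on screen, the two key columns probably differ in type or format (for example, character in one frame and Date in the other).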
Re: [R] [BioC] comparing two tables
Hi Assa,

On Wed, Oct 26, 2011 at 9:44 AM, Assa Yeroslaviz fry...@gmail.com wrote:
> Thanks Steve,
> I already did it and it went perfectly well. I was just trying to understand the functions David wrote, so that I can maybe use them for other queries. Unfortunately I wasn't able to add a condition for the fact that there is a third parameter to be compared. I would still love to know whether there is a way of adding such a parameter.

Sorry, I didn't realize you were after some personal R study.

> I tried to do it with a third argument in this line:
>   any( apply(locs, 1, function(x){vec["start"] > x[2] & vec["start"] <= x[3] & as.character(vec["chr"]) == as.character(x["chr"])
> but it doesn't seem to work at all.

You have to change the table you are sending to the second parameter of your inregion function. Currently you are sending into the `locs` parameter a two-column table that just has c("Start", "End"), e.g.:

  R> inregion(genetable[2, ], loctable[, c("Start", "End")])

Think about that call, and look at what `loctable[, c("Start", "End")]` gives you.

It looks like your change to inregion should work once you pass in the Chr column from your loctable (barring case-sensitivity issues: you have 'chr' and 'Chr' in your separate tables), e.g. use your modified inregion function and call it like so:

  R> inregion(genetable[2, ], loctable[, c("Chr", "Start", "End")])

modulo this or that.

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
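Since David's original inregion() is not shown in this message, the following is only a sketch of what a three-condition version could look like. The table contents and column names (chr/start in the gene table, Chr/Start/End in the location table) are assumptions. It avoids apply() on purpose: apply() converts a mixed data frame to a character matrix, so numeric comparisons inside it would silently become string comparisons.

```r
# Hypothetical tables; column names are assumptions based on the thread.
genetable <- data.frame(chr = c("chr1", "chr2"), start = c(150, 150),
                        stringsAsFactors = FALSE)
loctable <- data.frame(Chr = c("chr1", "chr1"), Start = c(100, 300),
                       End = c(200, 400), stringsAsFactors = FALSE)

# TRUE if the gene start lies in any region on the same chromosome.
inregion <- function(vec, locs) {
  any(as.character(vec[["chr"]]) == as.character(locs[["Chr"]]) &
        vec[["start"]] > locs[["Start"]] &
        vec[["start"]] <= locs[["End"]])
}

hits <- sapply(seq_len(nrow(genetable)), function(i) {
  inregion(genetable[i, ], loctable[, c("Chr", "Start", "End")])
})
print(hits)
```

The comparisons are vectorised over the rows of locs, so no explicit loop over regions is needed.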
[R] sometimes removing NAs from code
Sometimes I have NA values within specific columns of a dataframe (in this example, the first two columns can have NAs). If there are NA values, I would like them to be removed. I have been using the code:

  y <- c(NA,5,4,2,5,6,NA)
  z <- c(NA,3,4,NA,1,3,7)
  x <- 1:7
  adata <- data.frame(y,z,x)
  adata <- adata[-which(apply(adata[,1:2],1,function(x)any(is.na(x)))),]

This works well if there are NA values, but when a dataset doesn't have NA values, this code messes up the dataframe. I was trying to pick apart this code and could not understand why it didn't work when there were no NA values. If there are no NA values and I run just the part:

  apply(adata[,1:2],1,function(x)any(is.na(x)))

it results in:

      2     3     5     6
  FALSE FALSE FALSE FALSE

I was thinking that I can put in an if statement, but I think there has to be a better way. Any ideas/help? Thank you.

-
In theory, practice and theory are the same. In practice, they are not - Albert Einstein
--
View this message in context: http://r.789695.n4.nabble.com/sometimes-removing-NAs-from-code-tp3941009p3941009.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Using abline in lattice
Oops, sorry, I just realized the first code is wrong; it's the one that already has a panel function. The right code would be:

  Tuvalu <- c(9,3,4,0,3,0,0)
  Singapor <- c(38,0,0,0,12,19,0)
  Samoa <- c(26,16,2,0,5,2,0)
  PNG <- c(56,4,0,5,2,0,56)
  Micronesia <- c(6,0,0,0,0,0,0)
  graph4 <- data.frame(rbind(Tuvalu,Singapor,Samoa,PNG,Micronesia))
  graph4$country <- c("Tuvalu","Singapore","Samoa","Papua New Guinea","Micronesia")
  barchart(country ~ X1 + X2 + X3 + X4 + X5 + X6 + X7, data=graph4, stack=T, xlim=c(0,130),
           scales = list(alternating = 1, cex=1.2), xlab="",
           #col=c("grey1","grey17","grey33","grey50","grey67","grey83","grey100")
           col=c("grey20","grey100","grey50","grey83","grey33","grey67","grey0"))

Apologies!
[R] Using abline in lattice
Dear all,

being a relative beginner in R, I apologize for posting my second question within two days. I want a stacked barchart, which should look like the one produced by this code:

  Tuvalu <- c(9,3,4,0,3,0,0)
  Singapor <- c(38,0,0,0,12,19,0)
  Samoa <- c(26,16,2,0,5,2,0)
  PNG <- c(56,4,0,5,2,0,56)
  Micronesia <- c(6,0,0,0,0,0,0)
  graph4 <- data.frame(rbind(Tuvalu,Singapor,Samoa,PNG,Micronesia))
  graph4$country <- c("Tuvalu","Singapore","Samoa","Papua New Guinea","Micronesia")
  graph4$country <- factor(graph4$country)
  xyplot(country ~ X1 + X2 + X3 + X4 + X5 + X6 + X7, data=graph4, xlim=c(0,130),
         #scales = list(alternating = 1, cex=1.2),
         xlab="",
         panel = function (x,y) {
           stack=F
           groups=country
           panel.barchart(x,y, col=c("grey20","grey100","grey50","grey83","grey33","grey67","grey0"))
           panel.abline(v = 20, lty = 2, col = "blue")
         }
  )

Now I would like to add vertical lines at certain values (20, 40, etc.), but because I couldn't make the abline command work with the above code, I wrote a panel function. The vertical lines then work quite well, but now the bars are plotted on top of each other. See for yourself; here is the code (the first four lines are recoding I did to make sure that it's not a problem of the formula with the pluses, but it turns out just the same):

  test <- data.frame(rep(graph4$country,7),
                     c(graph4$X1,graph4$X2,graph4$X3,graph4$X4,graph4$X5,graph4$X6,graph4$X7))
  names(test) <- c("country","X1")
  names(test)
  xyplot(country ~ X1, data=test, xlim=c(0,130),
         #scales = list(alternating = 1, cex=1.2),
         xlab="",
         panel = function (x,y) {
           #groups=country
           panel.abline(v = 20, lty = 2, col = "grey70")
           panel.abline(v = 40, lty = 2, col = "grey70")
           panel.barchart(x,y, col=c("grey20","grey100","grey50","grey83","grey33","grey67","grey0"))
         },
  )

I played around a lot with the stack argument in the second code, but nothing worked. My question now is: how can I either make the vertical lines work with the first code, or make the bars look like those in the first example using the second code?

Thanks a lot for your help!
Florian
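One possible way to get both the stacked bars and the reference lines is sketched below. It is not from the thread's replies; it assumes the usual lattice idiom of a panel function that takes `...` and forwards everything (including `groups` and `stack`) to panel.barchart(), then draws the lines on top.

```r
library(lattice)

graph4 <- data.frame(rbind(
  Tuvalu     = c(9, 3, 4, 0, 3, 0, 0),
  Singapor   = c(38, 0, 0, 0, 12, 19, 0),
  Samoa      = c(26, 16, 2, 0, 5, 2, 0),
  PNG        = c(56, 4, 0, 5, 2, 0, 56),
  Micronesia = c(6, 0, 0, 0, 0, 0, 0)
))
graph4$country <- factor(c("Tuvalu", "Singapore", "Samoa",
                           "Papua New Guinea", "Micronesia"))

p <- barchart(country ~ X1 + X2 + X3 + X4 + X5 + X6 + X7, data = graph4,
              stack = TRUE, xlim = c(0, 130), xlab = "",
              col = c("grey20", "grey100", "grey50", "grey83",
                      "grey33", "grey67", "grey0"),
              panel = function(...) {
                panel.barchart(...)  # receives groups, stack, col unchanged
                panel.abline(v = c(20, 40), lty = 2, col = "blue")
              })
print(p)
```

The key difference from the posted attempts is that the panel function does not name x and y explicitly, so the grouping and stacking information set up by barchart() is not lost. Swapping the two panel calls would draw the lines behind the bars instead.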
Re: [R] dotPlot with diagonal
Try again. There is no dotPlot() function in lattice, and dotplot() does not take two separate rows, so the example you gave us generates an error message if dotPlot is changed to dotplot.

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jörg Reuter
Sent: Wednesday, October 26, 2011 8:13 AM
To: Dennis Murphy
Cc: r-help@r-project.org
Subject: Re: [R] dotPlot with diagonal

Oh, sorry.

  library(lattice)
  (Seq <- matrix(c(1, 1, 6, 1, 2, 2, 5, 4, 3, 3, 4, 3, 4, 4, 3, 2, 5, 5, 2, 5, 6, 6, 1, 6), ncol = 6))
  dotPlot(Seq[1,], Seq[2,], main = "Sequenz 1 und Sequenz 2", asp = 1)

Is there a way to draw a small diagonal from (0/0) to (6/6) (perhaps in red?), or must I use GIMP? I have many dotPlots, so it would be good if R can do this.

2011/10/26 Dennis Murphy djmu...@gmail.com:
> Let's see: there is a dotPlot() function in each of the following packages: BHH2, caret, mosaic, qualityTools
> Would you be kind enough to share which of these packages (if any) you are using?
>
> Dennis
>
> On Wed, Oct 26, 2011 at 4:25 AM, Jörg Reuter jo...@reuter.at wrote:
>> Hi,
>> I want to draw a dotPlot. All works fine:
>>   (Seq <- matrix(c(1, 1, 6, 1, 2, 2, 5, 4, 3, 3, 4, 3, 4, 4, 3, 2, 5, 5, 2, 5, 6, 6, 1, 6), ncol = 6))
>>   dotPlot(Seq[1,], Seq[2,], main = "Sequenz 1 und Sequenz 2", asp = 1)
>> Is there a way to draw a small diagonal from (0/0) to (6/6) (perhaps in red?), or must I use GIMP? I have many dotPlots, so it would be good if R can do this.
>> Thanks
>> Joerg
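Because it is unclear which package's dotPlot() is meant, here is a sketch that sidesteps the question and uses only base graphics: a plain scatter plot of the two rows with a red diagonal added afterwards. Whether this matches the look of the intended dotPlot() is an assumption; the helper name plot_with_diagonal is made up for illustration.

```r
Seq <- matrix(c(1, 1, 6, 1, 2, 2, 5, 4, 3, 3, 4, 3,
                4, 4, 3, 2, 5, 5, 2, 5, 6, 6, 1, 6), ncol = 6)

# Hypothetical helper: scatter plot of y against x plus a diagonal from (0,0) to (6,6).
plot_with_diagonal <- function(x, y, lim = c(0, 6), ...) {
  plot(x, y, xlim = lim, ylim = lim, asp = 1, pch = 19, ...)
  segments(lim[1], lim[1], lim[2], lim[2], col = "red")
  invisible(NULL)
}

plot_with_diagonal(Seq[1, ], Seq[2, ], main = "Sequenz 1 und Sequenz 2",
                   xlab = "Seq[1, ]", ylab = "Seq[2, ]")
```

abline(0, 1, col = "red") would draw the same line across the whole plot region rather than stopping at (6, 6). Since the helper is an ordinary function, it can be called in a loop over many pairs of rows, which avoids any manual editing in GIMP.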
Re: [R] Power mixed model ordinal logistic regression
On Oct 26, 2011, at 8:57 AM, Scott Raynaud wrote:
> Is there a package that will perform power calculations for mixed model ordinal logistic regression? I searched and came up with nothing.

I am not sure that there is a canned package or function that will do that. More than likely, you will need to use simulation.

I would suggest that you subscribe to and post your query to the r-sig-mixed-models list:

  https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

That will provide you with a focused audience in this domain and somebody might know of alternatives.

HTH,
Marc Schwartz
Re: [R] sometimes removing NAs from code
Hi,

Why don't you give subset a try:

  adata <- subset(adata, is.na(z)==FALSE & is.na(y)==FALSE)

I'm not sure if you want to use AND or OR for this statement.

Best wishes,
Natalie

On 26/10/2011 16:25, Schatzi wrote:
> Sometimes I have NA values within specific columns of a dataframe (in this example, the first two columns can have NAs). If there are NA values, I would like them to be removed.
> [...]
Re: [R] sometimes removing NAs from code
?complete.cases

  > y <- c(NA,5,4,2,5,6,NA)
  > z <- c(NA,3,4,NA,1,3,7)
  > x <- 1:7
  > adata <- data.frame(y,z,x)
  > adata
     y  z x
  1 NA NA 1
  2  5  3 2
  3  4  4 3
  4  2 NA 4
  5  5  1 5
  6  6  3 6
  7 NA  7 7
  > adata[complete.cases(adata),]
    y z x
  2 5 3 2
  3 4 4 3
  5 5 1 5
  6 6 3 6

On Wed, Oct 26, 2011 at 11:25 AM, Schatzi adele_thomp...@cargill.com wrote:
> Sometimes I have NA values within specific columns of a dataframe (in this example, the first two columns can have NAs). If there are NA values, I would like them to be removed.
> [...]

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Re: [R] sometimes removing NAs from code
On Oct 26, 2011, at 10:25 AM, Schatzi wrote:
> Sometimes I have NA values within specific columns of a dataframe (in this example, the first two columns can have NAs). If there are NA values, I would like them to be removed.
> [...]
> I was thinking that I can put in an if statement, but I think there has to be a better way. Any ideas/help? Thank you.

Presuming that you want to remove an entire row if any of the elements in that row are NAs, see ?na.omit

  > na.omit(adata)
    y z x
  2 5 3 2
  3 4 4 3
  5 5 1 5
  6 6 3 6

HTH,
Marc Schwartz
Re: [R] sometimes removing NAs from code
Hi,

On Wed, Oct 26, 2011 at 11:25 AM, Schatzi adele_thomp...@cargill.com wrote:
> Sometimes I have NA values within specific columns of a dataframe (in this example, the first two columns can have NAs). If there are NA values, I would like them to be removed. I have been using the code:
>   y <- c(NA,5,4,2,5,6,NA)
>   z <- c(NA,3,4,NA,1,3,7)
>   x <- 1:7
>   adata <- data.frame(y,z,x)
>   adata <- adata[-which(apply(adata[,1:2],1,function(x)any(is.na(x)))),]
> This works well if there are NA values, but when a dataset doesn't have NA values, this code messes up the dataframe. I was trying to pick apart this code and could not understand why it didn't work when there were no NA values.

Thanks for the example. Your problem is because of the which() statement. If there are NA values, which() returns the row numbers where the NAs are:

  > which(apply(adata[,1:2],1,function(x)any(is.na(x))))
  [1] 1 4 7
  > bdata <- data.frame(1:7, 1:7, 1:7)
  > which(apply(bdata[,1:2],1,function(x)any(is.na(x))))
  integer(0)

But if there aren't any, which() returns integer(0), a zero-length index. How does R subset on an empty row index? Unhelpfully: it selects no rows at all.

Fortunately you don't need the which() at all: the logical vector returned by your apply statement is entirely sufficient (with added negation):

  > adata[apply(adata[,1:2],1,function(x)!any(is.na(x))), ]
    y z x
  2 5 3 2
  3 4 4 3
  5 5 1 5
  6 6 3 6
  > bdata[apply(bdata[,1:2],1,function(x)!any(is.na(x))), ]
    X1.7 X1.7.1 X1.7.2
  1    1      1      1
  2    2      2      2
  3    3      3      3
  4    4      4      4
  5    5      5      5
  6    6      6      6
  7    7      7      7

Sarah

> If there are no NA values and I run just the part:
>   apply(adata[,1:2],1,function(x)any(is.na(x)))
> it results in:
>       2     3     5     6
>   FALSE FALSE FALSE FALSE
> I was thinking that I can put in an if statement, but I think there has to be a better way. Any ideas/help? Thank you.

--
Sarah Goslee
http://www.functionaldiversity.org
Re: [R] sometimes removing NAs from code
Instead of

  d[-which(condition)]

use

  d[!condition]

where 'condition' is a logical vector.

which(condition) returns integer(0) (an integer vector of length 0) if there are no TRUEs in 'condition'. -integer(0) is identical to integer(0), and d[integer(0)] means to select zero elements from d.

!condition means to flip the senses of all the TRUEs and FALSEs (and to leave NAs alone), so d[!condition] returns the elements of d for which condition is not TRUE (along with NAs for NAs in condition, but you won't have any of them in your example).

By the way, your use of apply() slows things down and might lead to errors. Try replacing

  apply(adata[,1:2],1,function(x)any(is.na(x)))

by

  is.na(adata$y) | is.na(adata$z)

or

  rowSums(is.na(adata[,1:2])) > 0

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Schatzi
Sent: Wednesday, October 26, 2011 8:25 AM
To: r-help@r-project.org
Subject: [R] sometimes removing NAs from code

> Sometimes I have NA values within specific columns of a dataframe (in this example, the first two columns can have NAs). If there are NA values, I would like them to be removed.
> [...]
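The behaviour described above can be reproduced in a few lines. This is just a sketch using the data from the thread plus a hypothetical NA-free frame called bdata; the variable names has_na, wrong, right and kept are made up.

```r
adata <- data.frame(y = c(NA, 5, 4, 2, 5, 6, NA),
                    z = c(NA, 3, 4, NA, 1, 3, 7),
                    x = 1:7)
bdata <- data.frame(y = 1:7, z = 1:7, x = 1:7)   # no NAs anywhere

# Logical condition: does a row have an NA in the first two columns?
has_na <- rowSums(is.na(bdata[, 1:2])) > 0

idx   <- which(has_na)      # integer(0): there is nothing to drop
wrong <- bdata[-idx, ]      # -integer(0) is integer(0), so zero rows come back
right <- bdata[!has_na, ]   # logical negation keeps all seven rows

# The same logical indexing works when NAs are present.
kept <- adata[!(rowSums(is.na(adata[, 1:2])) > 0), ]

print(nrow(wrong))
print(nrow(right))
print(kept)
```

The logical form needs no special case for "no NAs found", which is why it is usually preferred over -which().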
Re: [R] Creating data frame with residuals of a data frame
try this:

  > age <- c(5,6,10,14,16,NA,18)
  > value1 <- c(30,70,40,50,NA,NA,NA)
  > value2 <- c(2,4,1,4,4,4,4)
  > df <- data.frame(age, value1, value2)
  >
  > # Run linear regression to adjust for age and get residuals:
  > lm_f <- function(x) {
  +     x <- residuals(lm(data=df, formula= x ~ age))
  + }
  > resid <- apply(df,2,lm_f)
  > resid <- resid[-1]
  > for (i in names(resid)){
  +     newCol <- paste(i, 'res', sep = '')
  +     df[[newCol]] <- NA  # initialize
  +     df[[newCol]][as.integer(names(resid[[i]]))] <- resid[[i]]
  + }
  > df
    age value1 value2  value1res   value2res
  1   5     30      2 -16.945813 -0.37398374
  2   6     70      4  22.906404  1.50406504
  3  10     40      1  -7.684729 -1.98373984
  4  14     50      4   1.724138  0.52845528
  5  16     NA      4         NA  0.28455285
  6  NA     NA      4         NA          NA
  7  18     NA      4         NA  0.04065041

On Mon, Oct 24, 2011 at 10:23 AM, francesca casalino francy.casal...@gmail.com wrote:
> Dear experts,
> I am trying to create a data frame from the residuals I get after having applied a linear regression to each column of a data frame, but I don't know how to create this data frame from the resulting list, since the list has differing numbers of rows. So for example:
>
>   age <- c(5,6,10,14,16,NA,18)
>   value1 <- c(30,70,40,50,NA,NA,NA)
>   value2 <- c(2,4,1,4,4,4,4)
>   df <- data.frame(age, value1, value2)
>   # Run linear regression to adjust for age and get residuals:
>   lm_f <- function(x) {
>     x <- residuals(lm(data=df, formula= x ~ age))
>   }
>   resid <- apply(df,2,lm_f)
>   resid <- resid[-1]
>
> Then resid is a list with different row numbers:
>
>   $value1
>            1          2          3          4
>   -16.945813  22.906404  -7.684729   1.724138
>   $value2
>             1           2           3           4           5           7
>   -0.37398374  1.50406504 -1.98373984  0.52845528  0.28455285  0.04065041
>
> I am trying to get both the original variables and their residuals in the same data frame, like this:
>   age, value1, value2, resid_value1, resid_value2
> But when I try cbind or other operations I get an error message because they do not have the same number of rows. Can you please help me figure out how to solve this?
> Thank you.

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
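As an alternative to filling the columns by row name, one could let lm() do the padding itself. This is a sketch, not Jim's method: with na.action = na.exclude, residuals() returns one value per original row, with NA where the row was dropped from the fit. The names resid_cols and out are made up.

```r
df <- data.frame(age    = c(5, 6, 10, 14, 16, NA, 18),
                 value1 = c(30, 70, 40, 50, NA, NA, NA),
                 value2 = c(2, 4, 1, 4, 4, 4, 4))

# One regression per response; na.exclude makes residuals() pad with NA.
resid_cols <- sapply(c("value1", "value2"), function(v) {
  fit <- lm(reformulate("age", response = v), data = df,
            na.action = na.exclude)
  residuals(fit)
})
colnames(resid_cols) <- paste0(colnames(resid_cols), "res")

out <- cbind(df, resid_cols)
print(out)
```

Because every residual vector has length nrow(df), cbind() works without any index bookkeeping.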
Re: [R] Want to exclude axis numbering in plot.ca
I don't know what plot.ca is (it's not in base and you gave no package citation), but the usual way is to add xaxt = "n" to a plot call. Assuming plot.ca is an appropriately defined generic, this should work. E.g.,

  layout(1:2)
  plot(1:5)
  plot(1:5, xaxt = "n")

Michael

On Wed, Oct 26, 2011 at 3:59 AM, Mark Webb targetlinkm...@gmail.com wrote:
> plot.ca gives numbers on each axis. How do I stipulate to exclude these? I have read the R documentation for plot.ca but see no option to exclude axis numbers.
> Any suggestions?
> --
> Mark Webb
[R] help with means using tail()
Hi all,

I have 5 series (5 ts objects: rp, igpm, ereal, jurosreal, crescpib), and want to create a vector with the means of the last values of each variable. What I did was this:

  mrp1 <- mean(tail(rp,9))
  migpm1 <- mean(tail(igpm,9))
  mereal1 <- mean(tail(ereal,9))
  mjr1 <- mean(tail(jurosreal,9))
  mcp1 <- mean(tail(crescpib,9))
  means=rbind(mrp1,migpm1,mereal1,mjr1,mcp1)

They are monthly series, from 1995.1 to 2011.6. So what I did was generate the mean of each variable for [2010.10 to 2011.6] (9 months, as I wanted). But now I want to create a vector with the means of the last 9 values [2010.10 to 2011.6] AND the means of 9 months shifted back by one month, that is, [2010.9 to 2011.5]. I tried to find examples of this but without success. Can anyone give me a hand?

Thanks in advance.

Regards,
Iara
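One way to get the shifted window is to index from the end of the series instead of calling tail() alone. The sketch below uses two made-up series in place of rp, igpm and the others; the helper name mean_last9 and its lag argument are hypothetical.

```r
# Mean of a 9-value window ending `lag` observations before the end of x.
mean_last9 <- function(x, lag = 0) {
  v <- as.numeric(x)
  n <- length(v)
  mean(v[(n - lag - 8):(n - lag)])
}

# Hypothetical monthly series, 1995.1 to 2011.6 (198 observations).
series <- list(
  rp   = ts(1:198, start = c(1995, 1), frequency = 12),
  igpm = ts(seq(2, 396, by = 2), start = c(1995, 1), frequency = 12)
)

# Rows: window ending 2011.6 and window ending 2011.5; one column per series.
means <- sapply(series, function(s) c(last9 = mean_last9(s, 0),
                                      shifted1 = mean_last9(s, 1)))
print(means)
```

An equivalent one-liner for the shifted window is mean(head(tail(x, 10), 9)); window() with explicit start and end dates would be another option for ts objects.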
Re: [R] lock a package to specific R version
On 26.10.2011 15:12, Mehmet Suzen wrote:

-----Original Message-----
From: Prof Brian Ripley [mailto:rip...@stats.ox.ac.uk]
Sent: 26 October 2011 10:12
To: Uwe Ligges
Cc: Mehmet Suzen; r-help@r-project.org
Subject: Re: [R] lock a package to specific R version

On Wed, 26 Oct 2011, Uwe Ligges wrote:

On 25.10.2011 11:42, Mehmet Suzen wrote:

Hi, I was wondering if it is possible to lock a package to a specific version of R. The dependency attribute in the package DESCRIPTION only accepts >=

AFAIU

  Depends: R (>= 2.13.2), R (<= 2.13.2)

Or even use ==

Dear Professor Ripley,

Thank you for the reply. We are maintaining internal R packages and build binaries for different versions of R base, ranging from 2.8.x to 2.13.x. We need to prevent users from using any versions other than the ones we tested (we distribute binaries only, and the package source base is evolving as well). I am not sure how to address this; initially I was thinking of putting the R version in the package version, but the package version in DESCRIPTION files only allows the x.x.x format, which doesn't leave room.

No, you can have more, if you really want to.

I don't see why you would want to do this: why would a package work with 2.13.1 and not 2.13.2, or 2.13.2 and not 2.14.0? Ranges may make sense.

Ranges would be much more sensible than ==. Do go ahead with my suggestion.

Best wishes,
Uwe

Best Regards,
Mehmet
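Besides the Depends field, a range check can also be done at load time. The following is only a sketch: the function name r_version_in_range and the version bounds are made up for illustration, and it relies on the fact that package_version objects can be compared with the usual operators.

```r
# TRUE if `current` lies in [min_version, max_version]; defaults to the running R.
r_version_in_range <- function(min_version, max_version,
                               current = getRversion()) {
  current <- as.package_version(current)
  current >= as.package_version(min_version) &&
    current <= as.package_version(max_version)
}

# Hypothetical tested range, checked against a few example versions.
print(r_version_in_range("2.8.0", "2.13.2", current = "2.13.1"))
print(r_version_in_range("2.8.0", "2.13.2", current = "2.14.0"))
```

In a package, such a check could be called from .onLoad() and followed by stop() with an informative message when it returns FALSE. The Depends line suggested in the thread has the advantage that it is enforced at install time, before the package is ever loaded.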
[R] SpatialLines
I'm hoping to use R for spatial analysis. In working through the examples in Chapter 4 of "Applied Spatial Data Analysis with R", I've come across the following error when trying to plot lines with the meuse data set. The text is verbatim from the book.

  > m.sl <- SpatialLines(list(Lines(list(Line(cc)))))
  Error in Lines(list(Line(cc))) : Single ID required

What does "Single ID required" mean?

Thanks.
Mark
Re: [R] Logistic Regression - Variable Selection Methods With Prediction
Can I at least get help with what package to use for logistic regression with all possible models, and how to do prediction? I know I can use regsubsets, but I am not sure if it has any prediction functions to go with it.

Thanks

On Oct 25, 6:54 pm, RAJ dheerajathr...@gmail.com wrote:
> Hello,
> I am pretty new to R; I have always used SAS and SAS products. My target variable is binary ('Y' and 'N') and I have about 14 predictor variables. My goal is to compare different variable selection methods such as forward, backward, and all possible subsets. I am using the misclassification rate to pick the winning method.
>
> This is what I have as of now:
>
>   Reg <- glm(Graduation ~ ., DFtrain, family=binomial(link="logit"))
>   step <- extractAIC(Reg, direction="forward")
>   pred <- predict(Reg, DFtest, type="response")
>   mis <- mean({pred > 0.5} != {DFtest[,"Graduation"] == "Y"})
>
> This program actually works, but I needed to check to make sure I am doing this right. Also, I am getting the same misclassification rates for all the different methods.
>
> I also tried to use
>
>   Reg <- leaps(Graduation ~ ., DFtrain)
>   pred <- predict(Reg, DFtest, type="response")
>   mis <- mean({pred > 0.5} != {DFtest[,"Graduation"] == "Y"})
>   #print(summary(mis))
>
> which doesn't work, and
>
>   Reg <- regsubsets(Graduation ~ ., DFtrain)
>   pred <- predict(Reg, DFtest, type="response")
>   mis <- mean({pred > 0.5} != {DFtest[,"Graduation"] == "Y"})
>   #print(summary(mis))
>
> regsubsets will work, but the predict function does not work with it. Is there any other way to do predictions when using regsubsets?
>
> Any help is appreciated.
> Thanks,
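A likely reason for the identical misclassification rates is that extractAIC() only reports the AIC of the model it is given; the posted code then predicts from the full model Reg each time. Below is a sketch of forward and backward selection with step() from the stats package, on simulated data. The data, the predictor names x1 to x3 and the helper misclass are all made up; only the Graduation response with levels 'Y' and 'N' mirrors the thread.

```r
set.seed(1)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
p_y <- plogis(1.5 * dat$x1 - dat$x2)          # x3 is pure noise
dat$Graduation <- factor(ifelse(runif(n) < p_y, "Y", "N"))
DFtrain <- dat[1:150, ]
DFtest  <- dat[151:200, ]

null_fit <- glm(Graduation ~ 1, data = DFtrain, family = binomial(link = "logit"))
full_fit <- glm(Graduation ~ x1 + x2 + x3, data = DFtrain,
                family = binomial(link = "logit"))

# step() performs the search and returns the selected fitted model.
fwd <- step(null_fit, scope = list(lower = ~ 1, upper = ~ x1 + x2 + x3),
            direction = "forward", trace = 0)
bwd <- step(full_fit, direction = "backward", trace = 0)

# Misclassification rate on new data, predicting "Y" when P(Y) > 0.5.
misclass <- function(fit, newdata) {
  pred <- predict(fit, newdata = newdata, type = "response")
  mean((pred > 0.5) != (newdata$Graduation == "Y"))
}

rates <- c(forward = misclass(fwd, DFtest), backward = misclass(bwd, DFtest))
print(rates)
```

The important detail is that predict() is called on the model returned by step(), not on the original fit. As far as I know, regsubsets() targets linear models and has no predict method, which matches the problem described above; an all-subsets search for logistic regression would need a different tool or an explicit loop over candidate formulas.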
[R] Guidance with PCA and Regression using complex categorical variables
Hello. I need some guidance. I would like to run PCA and regression, and my predictor variables are mainly complex categorical variables (hundreds of levels for some of them). What packages and functions are useful for this?

Thanks.
sean
[R] Plot complete dataset
Hello,

I am a new user of R, so I still have some basic difficulties. I'm trying to create a bar graph entirely from a file that I read in. The idea is to have on the x axis the columns of the table

  Married, Single, Divorced, widower

and in the legend the ages

  18-34 35-45 46-64 65-69 70-74

The dataset:

  > dataset
     Ages Married Single Divorced widower
  1 18-34    10.5   35.7      8.5     3.2
  2 35-45    12.4   22.4     22.2    12.6
  3 46-64    25.4   22.2     33.4    12.4
  4 65-69    36.7   31.4     12.4    35.2
  5 70-74    26.4   15.1      8.5    43.2

The code for the barplot:

  barplot(dataset,dataset$Single, col = c(rainbow(dataset$Ages)), legend = rownames(dataset$Ages), ylim = c(0, 100))

but I am not able to get it to work.

Thanks
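A sketch of one way to draw this with base graphics: barplot() expects a numeric matrix, so the Ages column is moved into the row names first. The data frame is typed in by hand here rather than read from a file, and the object names heights and mids are made up.

```r
dataset <- data.frame(
  Ages     = c("18-34", "35-45", "46-64", "65-69", "70-74"),
  Married  = c(10.5, 12.4, 25.4, 36.7, 26.4),
  Single   = c(35.7, 22.4, 22.2, 31.4, 15.1),
  Divorced = c(8.5, 22.2, 33.4, 12.4, 8.5),
  widower  = c(3.2, 12.6, 12.4, 35.2, 43.2)
)

# Numeric matrix: one column per marital status, one row per age group.
heights <- as.matrix(dataset[, -1])
rownames(heights) <- dataset$Ages

# beside = TRUE draws one bar per age group within each status;
# legend.text = TRUE takes the legend labels from the row names.
mids <- barplot(heights, beside = TRUE, col = rainbow(nrow(heights)),
                legend.text = TRUE, ylim = c(0, 100))
```

Dropping beside = TRUE gives stacked bars instead; in that case ylim would need to be raised, because the stacked totals exceed 100.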
Re: [R] sometimes removing NAs from code
Thank you for the help and explanations. I used the complete.cases function and it is working great.

adata[complete.cases(adata[,1:2]),]

-
In theory, practice and theory are the same. In practice, they are not - Albert Einstein
Re: [R] Logistic Regression - Variable Selection Methods With Prediction
Try the glm package

Steve Friedman Ph. D.
Ecologist / Spatial Statistical Analyst
Everglades and Dry Tortugas National Park
950 N Krome Ave (3rd Floor)
Homestead, Florida 33034
steve_fried...@nps.gov
Office (305) 224 - 4282
Fax (305) 224 - 4147
Re: [R] SpatialLines
On 26/10/2011 1:11 PM, Mark Newcomb wrote:

I'm hoping to use R for spatial analysis. In working through examples in Chapt. 4 of Applied Spatial Data Analysis with R I've come across the following error in trying to plot lines with the meuse data set. The text is verbatim from the book.

m.sl <- SpatialLines(list(Lines(list(Line(cc)))))
Error in Lines(list(Line(cc))) : Single ID required

What does "Single ID required" mean?

That message is coming from a contributed package, not from base R. You should say what package you're using, and you may need to contact the author or maintainer of it to get an answer.

Duncan Murdoch
Re: [R] Logistic Regression - Variable Selection Methods With Prediction
Hi,

On Wed, Oct 26, 2011 at 12:35 PM, RAJ dheerajathr...@gmail.com wrote:

Can I at least get help with which package to use for logistic regression with all possible models, and how to do prediction? I know I can use regsubsets, but I am not sure whether it has any prediction functions to go with it.

Maybe you could try glmnet instead. It doesn't give you all possible models, but rather the best one at a given value of the penalty (lambda) parameter.

HTH,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
Re: [R] Logistic Regression - Variable Selection Methods With Prediction
Check the glmulti package for all-subsets selection.

Weidong Gu

On Wed, Oct 26, 2011 at 12:35 PM, RAJ dheerajathr...@gmail.com wrote:

Can I at least get help with which package to use for logistic regression with all possible models, and how to do prediction? I know I can use regsubsets, but I am not sure whether it has any prediction functions to go with it.

Thanks
Re: [R] Logistic Regression - Variable Selection Methods With Prediction
You mean the glm() _function_ in the stats package. ?glm (just to avoid confusion)

-- Bert

On Wed, Oct 26, 2011 at 10:31 AM, steve_fried...@nps.gov wrote:

Try the glm package

Steve Friedman Ph. D.
Ecologist / Spatial Statistical Analyst
Everglades and Dry Tortugas National Park

--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info: Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
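A note on the code quoted earlier in this thread: extractAIC() only reports the degrees of freedom and AIC of a fitted model and does not select variables, which would explain why every "method" gave the same misclassification rate. The following is a minimal sketch of forward selection with step() from the stats package, followed by prediction on a hold-out set. The simulated data frame, the predictor names x1 to x3, and the 150/50 split are assumptions made up for illustration; only the response name Graduation and the 0.5 cutoff come from the original post.

```r
# Simulated stand-in for DFtrain / DFtest (hypothetical data, not the poster's)
set.seed(1)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$Graduation <- factor(ifelse(runif(n) < plogis(1.5 * d$x1 - d$x2), "Y", "N"))
DFtrain <- d[1:150, ]
DFtest  <- d[151:200, ]

# Start from the intercept-only model and let step() add terms by AIC
null_fit <- glm(Graduation ~ 1, data = DFtrain, family = binomial(link = "logit"))
fwd <- step(null_fit,
            scope = list(lower = ~ 1, upper = ~ x1 + x2 + x3),
            direction = "forward", trace = 0)

# Predicted probability of the second factor level ("Y") on the test set
pred <- predict(fwd, newdata = DFtest, type = "response")

# Misclassification rate at a 0.5 cutoff
mis <- mean((pred > 0.5) != (DFtest$Graduation == "Y"))
mis
```

Backward selection would start from the full model with direction = "backward". The cautions raised elsewhere in this thread about stepwise selection still apply.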
Re: [R] Logistic Regression - Variable Selection Methods With Prediction
The likely reason that you are not getting replies is that what you propose to do is considered a poor way of building models. You need to get out of the SAS mindset.

I would suggest you obtain a copy of Frank Harrell's book:

http://www.amazon.com/exec/obidos/ASIN/0387952322/

and then consider using his 'rms' package on CRAN for model building strategies and validation.

Regards,

Marc Schwartz

On Oct 26, 2011, at 11:35 AM, RAJ wrote:

Can I at least get help with which package to use for logistic regression with all possible models, and how to do prediction? I know I can use regsubsets, but I am not sure whether it has any prediction functions to go with it.

Thanks
Re: [R] Plot complete dataset
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of RMSOPS
Sent: Wednesday, October 26, 2011 9:59 AM
To: r-help@r-project.org
Subject: [R] Plot complete dataset

Hello,

I am a new user of R, so I still have some basic difficulties. I'm trying to create a bar graph entirely from data read from a file. The idea is to have the table columns (Married, Single, Divorced, widower) on the x axis and the age groups (18-34, 35-45, 46-64, 65-69, 70-74) in the legend.

dataset
   Ages Married Single Divorced widower
1 18-34    10.5   35.7      8.5     3.2
2 35-45    12.4   22.4     22.2    12.6
3 46-64    25.4   22.2     33.4    12.4
4 65-69    36.7   31.4     12.4    35.2
5 70-74    26.4   15.1      8.5    43.2

The code for barplot:

barplot(dataset,dataset$Single, col = c(rainbow(dataset$Ages)), legend = rownames(dataset$Ages), ylim = c(0, 100))

but I am not able to get it to work.

Thanks

You should go back and read the help for barplot. Do you really want to plot the whole dataset (say, as a stacked barplot)? Then something like this should do it:

barplot(as.matrix(dataset[,2:5]), col = c("lightblue", "mistyrose", "lightcyan", "lavender", "cornsilk"), legend = dataset$Ages, ylim = c(0, 100))

Your values don't add to 100, so I'm not sure what you actually want. If this isn't what you want, then give us more information.

Hope this is helpful,

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204
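For readers who would rather see the age groups side by side than stacked, here is a small variation on the call above. It is only a sketch: the data frame is retyped from the table in the original post, and the choices of beside = TRUE, rainbow() colours and the y-axis limit are assumptions, not something the poster asked for explicitly.

```r
# Data retyped from the posted table
dataset <- data.frame(
  Ages     = c("18-34", "35-45", "46-64", "65-69", "70-74"),
  Married  = c(10.5, 12.4, 25.4, 36.7, 26.4),
  Single   = c(35.7, 22.4, 22.2, 31.4, 15.1),
  Divorced = c(8.5, 22.2, 33.4, 12.4, 8.5),
  widower  = c(3.2, 12.6, 12.4, 35.2, 43.2)
)

# barplot() wants a numeric matrix: columns become groups on the x axis,
# rows become the bars within each group
m <- as.matrix(dataset[, 2:5])
rownames(m) <- dataset$Ages

# legend.text = TRUE takes the legend labels from the row names of m
mids <- barplot(m, beside = TRUE, col = rainbow(nrow(m)),
                legend.text = TRUE, ylim = c(0, 50),
                xlab = "Marital status", ylab = "Value")
```

The returned mids matrix holds the x positions of the bars, which is handy if labels are to be added with text() afterwards.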
[R] Extra Sums of Squares from an anova table - why are the values different?
#For full disclosure- I am working on a homework problem. However, my question revolves around computer rounding, I think.

x <- (structure(list(y = c(0.222, 0.395, 0.422, 0.437, 0.428, 0.467, 0.444, 0.378, 0.494, 0.456, 0.452, 0.112, 0.432, 0.101, 0.232, 0.306, 0.0923, 0.116, 0.0764, 0.439, 0.0944, 0.117, 0.0726, 0.0412, 0.251, 2e-05),
x1 = c(7.3, 8.7, 8.8, 8.1, 9, 8.7, 9.3, 7.6, 10, 8.4, 9.3, 7.7, 9.8, 7.3, 8.5, 9.5, 7.4, 7.8, 7.7, 10.3, 7.8, 7.1, 7.7, 7.4, 7.3, 7.6),
x2 = c(0, 0, 0.7, 4, 0.5, 1.5, 2.1, 5.1, 0, 3.7, 3.6, 2.8, 4.2, 2.5, 2, 2.5, 2.8, 2.8, 3, 1.7, 3.3, 3.9, 4.3, 6, 2, 7.8),
x3 = c(0, 0.3, 1, 0.2, 1, 2.8, 1, 3.4, 0.3, 4.1, 2, 7.1, 2, 6.8, 6.6, 5, 7.8, 7.7, 8, 4.2, 8.5, 6.6, 9.5, 10.9, 5.2, 20.7),
x11 = c(53.29, 75.69, 77.44, 65.61, 81, 75.69, 86.49, 57.76, 100, 70.56, 86.49, 59.29, 96.04, 53.29, 72.25, 90.25, 54.76, 60.84, 59.29, 106.09, 60.84, 50.41, 59.29, 54.76, 53.29, 57.76),
x22 = c(0, 0, 0.49, 16, 0.25, 2.25, 4.41, 26.01, 0, 13.69, 12.96, 7.84, 17.64, 6.25, 4, 6.25, 7.84, 7.84, 9, 2.89, 10.89, 15.21, 18.49, 36, 4, 60.84),
x33 = c(0, 0.09, 1, 0.04, 1, 7.84, 1, 11.56, 0.09, 16.81, 4, 50.41, 4, 46.24, 43.56, 25, 60.84, 59.29, 64, 17.64, 72.25, 43.56, 90.25, 118.81, 27.04, 428.49),
x12 = c(0, 0, 6.16, 32.4, 4.5, 13.05, 19.53, 38.76, 0, 31.08, 33.48, 21.56, 41.16, 18.25, 17, 23.75, 20.72, 21.84, 23.1, 17.51, 25.74, 27.69, 33.11, 44.4, 14.6, 59.28),
x13 = c(0, 2.61, 8.8, 1.62, 9, 24.36, 9.3, 25.84, 3, 34.44, 18.6, 54.67, 19.6, 49.64, 56.1, 47.5, 57.72, 60.06, 61.6, 43.26, 66.3, 46.86, 73.15, 80.66, 37.96, 157.32),
x23 = c(0, 0, 0.7, 0.8, 0.5, 4.2, 2.1, 17.34, 0, 15.17, 7.2, 19.88, 8.4, 17, 13.2, 12.5, 21.84, 21.56, 24, 7.14, 28.05, 25.74, 40.85, 65.4, 10.4, 161.46)),
.Names = c("y", "x1", "x2", "x3", "x11", "x22", "x33", "x12", "x13", "x23"), row.names = c(NA, -26L), class = "data.frame"))

x$x11 <- x$x1^2
x$x22 <- x$x2^2
x$x33 <- x$x3^2
x$x12 <- x$x1*x$x2
x$x13 <- x$x1*x$x3
x$x23 <- x$x2*x$x3

x.lm <- lm(y~x1+x2+x3+x11+x22+x33+x12+x13+x23, data=x)
anova(lm(y~x1+x2+x3,data=x), x.lm)
anova(x.lm)

#I want to test
#Ho:y~x1+x2+x3
#Ha:y~x1+x2+x3+x11+x22+x33+x12+x13+x23
((0.00945+0.01340+0.00200+0.00568+0.00489+0.00050)/6)/(0.00371)

#Thanks
#Stephen Sefick
Re: [R] Correlation Matrix in R
Alex,

corr.test in psych will give you a matrix of correlations, a matrix of sample sizes, and a matrix of probabilities. You can combine the correlations and the probabilities to form what you want. Try the following:

library(psych)
examp <- corr.test(sat.act)
mat.c.p <- lower.tri(examp$r)*examp$r + t(lower.tri(examp$p)*examp$p)
mat.c.p

Bill

On Oct 26, 2011, at 6:03 AM, AlexC wrote:

Thank you for your quick reply and helpful advice. Using this argument allows me to do what I needed to do.

Now the only other thing I wanted to accomplish was to obtain the top half of the matrix with p values and the bottom half with the correlations, to observe the significant correlations. I have been able to use a few functions such as rcorr and cor.matrix to get such information, but it isn't output in a format that I can save with the write.table function or write.clipboard.

The pair function, on the other hand, allows a graphical display of the data (with correlation graphics on the bottom half), and I have added an argument which allows me to view the significant p values. But I wanted to know if I could also do the above easily.

William Revelle            http://personality-project.org/revelle.html
Professor                  http://personality-project.org
Department of Psychology   http://www.wcas.northwestern.edu/psych/
Northwestern University    http://www.northwestern.edu/
Use R for psychology       http://personality-project.org/r
It is 6 minutes to midnight   http://www.thebulletin.org
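The same layout (correlations below the diagonal, p values above) can also be put together with base R alone, which may help when the result has to go through write.table. This is a sketch under assumptions: it uses four columns of the built-in mtcars data as a stand-in for the poster's data, unadjusted Pearson p values from cor.test(), and a temporary file as the output target.

```r
# Stand-in data: any all-numeric data frame would do
vars <- mtcars[, c("mpg", "hp", "wt", "qsec")]
k <- ncol(vars)

r <- cor(vars)                                     # correlation matrix
p <- matrix(NA_real_, k, k, dimnames = dimnames(r))
for (i in seq_len(k)) {
  for (j in seq_len(k)) {
    if (i != j) p[i, j] <- cor.test(vars[[i]], vars[[j]])$p.value
  }
}

# Lower triangle keeps the correlations, upper triangle gets the p values
out <- r
out[upper.tri(out)] <- p[upper.tri(p)]
diag(out) <- NA

# A plain numeric matrix can be written out directly
outfile <- tempfile(fileext = ".txt")
write.table(round(out, 4), file = outfile, sep = "\t", quote = FALSE)
```

Reading row i, column j with i > j gives the correlation, and the mirrored cell gives its p value.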
[R] survival: fitting equation to survival curve?
Given a survfit object, is it possible to fit an equation to the resulting survival curve? What about with a coxph or survreg object?

TIA,
Rob
Re: [R] extract data for specific levels factor
Dear all,

Thanks for your help. The option from Sarah and Dan is just what I want:

ff1 <- mydata[mydata$cat %in% c("wish1", "wish2", "wish3"), ]

Then I used ff1 in ggplot2 without problems.

Dennis's reshape2 option produced an incoherent output: the resulting object contained the data of the categorical variable but not the data of "ind".

Dennis's ggplot2 option worked well, but when I put more than one classification in categ I get the following:

ggplot(subset(mydata, categ=='folic tot', 'porc fol marc'), aes(age,ind))+geom_point()
Error in `[.data.frame`(x, r, vars, drop = drop) : undefined columns selected

I tried the same but with ddply:

qq <- ddply(mydata, .(ind), subset, categ=("porc fol tot", "porc fol marc"))
Error: unexpected ',' in "qq<-ddply(mydata, .(ind), subset, categ=("porc fol tot","

Any idea about the last one?

Andrés AM

2011/10/25, Dennis Murphy djmu...@gmail.com:

Are you trying to separate the substrings in cat? If so, one way is to use the colsplit() function in the reshape2 package, something like (untested since you did not provide a suitable data format with which to work):

library('reshape2')
splitcat <- colsplit(mydata$cat, ' ', names = c('fat', 'bat', 'rat'))
moredat <- cbind(mydata, splitcat)

Other options that might pertain to your request include:

(i) subset(mydata, cat == 'por fol pec'), which you can use as a data argument inside ggplot2, e.g.,

ggplot(subset(mydata, cat == 'por fol pec'), aes(x = age, y = ind)) + geom_point()

(ii) use faceting to get individual plots by factor level of cat, e.g.,

ggplot(mydata, aes(x = age, y = ind)) + geom_point() + facet_grid( ~ cat)

Hope that one of these is close to the bullseye...

Dennis

2011/10/25 Andrés Aragón armand...@gmail.com:

Dear all,

I'm trying to analyze data with the following structure:

ind   cat            tx   age
40.2  por fol peq    vh   35
41.9  por fol med    vh   35
68.9  por fol preov  vh   35
71.5  por fol peq    ser  37
67.5  por fol med    ser  37
76.9  por fol preov  ser  37
78.7  por fol peq    otr  37
78.3  por fol med    otr  37
82.1  por fol preov  otr  37
83.9  por fol peq    vh   37
80.6  por fol med    vh   37
76.1  por fol preov  vh   37
86.9  por fol peq    ser  35
97.7  por fol med    ser  35
62.3  por fol preov  ser  35

I want to extract only some of the factor levels (e.g. "por fol peq" in the "cat" column). I am using ggplot2 and I can only plot all of the factor levels, not each separately. I tried ddply without success.

Any help is welcome.

Andrés
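On the two errors above: subset() takes a single logical condition as its second argument, and a third positional argument is interpreted as `select` (the columns to keep), which is why a second level name given there produces "undefined columns selected". Matching several levels at once is done with %in%. The sketch below retypes the first six rows of the posted table; the vector name `wanted` and the choice of two levels are illustrative assumptions.

```r
# First six rows of the posted data
mydata <- data.frame(
  ind = c(40.2, 41.9, 68.9, 71.5, 67.5, 76.9),
  cat = c("por fol peq", "por fol med", "por fol preov",
          "por fol peq", "por fol med", "por fol preov"),
  tx  = c("vh", "vh", "vh", "ser", "ser", "ser"),
  age = c(35, 35, 35, 37, 37, 37)
)

# Levels to keep (hypothetical choice for the example)
wanted <- c("por fol peq", "por fol med")

# Logical indexing, as in the solution that worked
ff1 <- mydata[mydata$cat %in% wanted, ]

# The same selection written with subset(): one condition, no extra argument
ff2 <- subset(mydata, cat %in% wanted)
```

Either result can be passed as the data argument to ggplot(), in the same way as the single-level subset() call suggested above.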
Re: [R] Extra Sums of Squares from an anova table - why are the values different?
Hi Stephen,

Thanks for the disclosure. If you are referring to the difference in the third decimal place between your calculated F value and what R gives, yes, it is due to rounding: your hand calculation starts from the rounded values printed in the anova table. Try this:

## extract the mean squares from anova() and store in msq
msq <- anova(x.lm)[, "Mean Sq"]
mean(msq[4:9])/msq[10]

Cheers,

Josh

On Wed, Oct 26, 2011 at 11:19 AM, Stephen Sefick ssef...@gmail.com wrote:

#For full disclosure- I am working on a homework problem. However, my question revolves around computer rounding, I think.

x.lm <- lm(y~x1+x2+x3+x11+x22+x33+x12+x13+x23, data=x)
anova(lm(y~x1+x2+x3,data=x), x.lm)
anova(x.lm)

#I want to test
#Ho:y~x1+x2+x3
#Ha:y~x1+x2+x3+x11+x22+x33+x12+x13+x23
((0.00945+0.01340+0.00200+0.00568+0.00489+0.00050)/6)/(0.00371)

#Thanks
#Stephen Sefick

--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/
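To see that the hand calculation and the two-model anova() agree once unrounded mean squares are used, here is a self-contained sketch of the same extra-sum-of-squares F test. It uses the built-in mtcars data rather than the data from the post, and the column names wt2, hp2 and wthp are made up for the example.

```r
# Build squared and cross-product columns, mirroring the structure of the post
d <- mtcars[, c("mpg", "wt", "hp")]
d$wt2  <- d$wt^2
d$hp2  <- d$hp^2
d$wthp <- d$wt * d$hp

reduced <- lm(mpg ~ wt + hp, data = d)
full    <- lm(mpg ~ wt + hp + wt2 + hp2 + wthp, data = d)

# Sequential (Type I) table: rows 1-5 are the terms, row 6 is the residuals
tab <- anova(full)
msq <- tab[["Mean Sq"]]

# Each extra term has 1 df, so the mean of their mean squares equals
# (extra sum of squares) / (number of extra terms)
F_hand <- mean(msq[3:5]) / msq[6]

# The same F statistic from comparing the nested models directly
F_anova <- anova(reduced, full)$F[2]

all.equal(F_hand, F_anova)
```

Typing the printed, rounded mean squares into the same formula would reproduce the kind of small discrepancy described in the question.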
[R] survival: fitting equation to survival curve?
Given a survfit object, is it possible to fit an equation to the resulting survival curve? What about with a coxph or survreg object?

TIA,
Rob
Re: [R] Extra Sums of Squares from an anova table - why are the values different?
I was referring to the 3rd decimal place and beyond. Thanks, that did the trick. I was trying to compare the two to make sure that I knew how to do it by hand. Thanks for all of your help.

Stephen

On Wed 26 Oct 2011 02:23:02 PM CDT, Joshua Wiley wrote:

Hi Stephen,

Thanks for the disclosure. If you are referring to the difference in the third decimal place between your calculated F value and what R gives, yes, it is due to rounding. Try this:

## extract the mean squares from anova() and store in msq
msq <- anova(x.lm)[, "Mean Sq"]
mean(msq[4:9])/msq[10]

Cheers,

Josh
[R] FOR loop with statistical analysis for microarray data
Hi all,

I started using R recently and I found myself stuck when I try to analyze microarray data. I use the affy package to obtain the intensities of the probes. I have two CTRs and two treated:

           HG.U133A.Experiment1.CEL  HG.U133A.Experiment2.CEL  HG.U133A_Control1.CEL  HG.U133A_Control2.CEL
1007_s_at                2156.23115                 467.75615              364.60615              362.11865
1053_at                    88.76368                  93.58436              438.49365              357.75615
117_at                    144.00743                 101.26120               95.7                  107.01623
121_at                    551.36865                 639.45615              456.66865              435.95615
1255_g_at                  65.33164                  18.39570               14.22565               20.74632
1294_at                   106.19083                 169.69369               78.15722               81.14689

I split the columns into two data.frames to separate Experim and CTRs. Then I created a FOR loop that builds, for each row, a vector with the two values per gene, and I wanted to do a Wilcox.test to obtain the significant genes. BUT I get a list of NULL, as you can see here. The first row works, but then I get NULL down to the end of the array...

                fc        pv
[1,] 1007_s_at  -20.248   0.4664612
[2,] 1053_at    -344.7132 NULL
[3,] 117_at     NULL      NULL
[4,] 121_at     NULL      NULL
[5,] 1255_g_at  NULL      NULL
[6,] 1294_at    NULL      NULL

The script I used is:

===
fc=0
pv=0
for (i in 1:nrow(data)) {
v1= c(y1[i,1], y1[i,2])
v2= c(y2[i,1], y2[1,2])
fc=v1-v2
w=t.test(v1,v2)
pv=w$p.value
fc[i]= w[1]
pv[i]= w[2]
}
results = cbind(row.names(y1), fc, pv)
head(results)

What did I do wrong? I can't find a way around this!!!

Thanks so much!!!

Seb
Re: [R] Correlation Matrix in R
Hi,

rcor.test in library(ltm) will provide a correlation matrix with p-values on the bottom half of the matrix.

Mark

On 2011-10-26, at 7:03 AM, AlexC wrote:

Thank you for your quick reply and helpful advice. Using this argument allows me to do what I needed to do.

Now the only other thing I wanted to accomplish was to obtain the top half of the matrix with p values and the bottom half with the correlations, to observe the significant correlations. I have been able to use a few functions such as rcorr and cor.matrix to get such information, but it isn't output in a format that I can save with the write.table function or write.clipboard.

The pair function, on the other hand, allows a graphical display of the data (with correlation graphics on the bottom half), and I have added an argument which allows me to view the significant p values. But I wanted to know if I could also do the above easily.
Re: [R] FOR loop with statistical analysis for microarray data
affy is a Bioconductor package. You should be asking this question on the bioc mailing list.

--
David.

On Oct 26, 2011, at 4:56 PM, Seb wrote:

Hi all, I started using R recently and I found myself stuck when I try to analyze microarray data. I use the affy package to obtain the intensities of the probes. I have two CTRs and two treated.

David Winsemius, MD
West Hartford, CT
Re: [R] Counting the number of marginals
Dear all,

I have two matrices, let's call them A and B, each of which is a 100 x 3 matrix. What I do is take the corresponding row from each matrix and form 100 2 x 3 tables. If we call the column sums for each 2 x 3 table n1, n2 and n3, I would like to compute the following probability: the number of tables with (n1 = a, n2 = b) divided by the number of tables. It's the probability that a table has a particular pair of column totals.

So if A was

0 2 3
0 2 3
1 2 4
2 3 3

and B was

1 0 3
1 0 2
1 2 4
2 3 3

the 2 x 3 tables would be:

0 2 3
1 0 3
Totals (1,2) # first 2 totals

0 2 3
1 0 2
Totals (1,2)

1 2 4
1 2 4
Totals (2,4)

2 3 3
2 3 3
Totals (4,6)

The expected probabilities I should get are 0.5, 0.5, 0.25 and 0.25 for the four 2 x 3 tables.

Any help is greatly appreciated.

--
Thanks,
Jim.
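One possible way to compute this, sketched on the 4-row example from the message: add the matrices to get the column totals of every 2 x 3 table at once, paste the first two totals into a key, and count how often each key occurs. The object names tot, key and prob are made up for the example, and the approach assumes that A and B have the same dimensions.

```r
# The 4-row example from the message (the real matrices would be 100 x 3)
A <- matrix(c(0, 2, 3,
              0, 2, 3,
              1, 2, 4,
              2, 3, 3), ncol = 3, byrow = TRUE)
B <- matrix(c(1, 0, 3,
              1, 0, 2,
              1, 2, 4,
              2, 3, 3), ncol = 3, byrow = TRUE)

# Row i of A + B holds the column totals (n1, n2, n3) of the i-th 2 x 3 table
tot <- A + B

# One label per table built from its first two totals, e.g. "1,2"
key <- paste(tot[, 1], tot[, 2], sep = ",")

# For each table: number of tables sharing its key, divided by the table count
n <- nrow(tot)
prob <- ave(rep(1, n), key, FUN = sum) / n
prob
```

For the example this reproduces the probabilities given in the message: 0.5, 0.5, 0.25 and 0.25.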
[R] Where can I find cmeans {e1071} package?
Hello,

I need a Fuzzy C Means algorithm. I found some documentation about cmeans {e1071} at
http://rss.acs.unt.edu/Rdoc/library/e1071/html/cmeans.html

Does anyone know where I can find it?

Thank you
Rui
Re: [R] Where can I find cmeans {e1071} package?
On Wed, Oct 26, 2011 at 6:40 PM, Rui Esteves ruimax...@gmail.com wrote:
> Hello,
> I need a Fuzzy C Means algorithm. I found some documentation about cmeans {e1071} at
> http://rss.acs.unt.edu/Rdoc/library/e1071/html/cmeans.html
> Does someone knows where I can find it?

e1071 is a package, and you can use install.packages() from R to install it, or download it directly from the CRAN mirror nearest you.
http://cran.r-project.org/

This is a very basic question; I suspect you'd benefit from reading one of the many Introduction to R documents available online.

Sarah

> Thank you
> Rui

--
Sarah Goslee
http://www.functionaldiversity.org
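
The install step described above could look roughly like the following sketch. The install line is commented out because it needs network access, and the rest only runs if e1071 is already installed. The iris data and the choice of 3 centers are assumptions made for illustration, not anything from the thread.

```r
# One-time install from CRAN (uncomment to run; needs network access):
# install.packages("e1071")

fit <- NULL
if (requireNamespace("e1071", quietly = TRUE)) {
  set.seed(1)                        # cmeans starts from random centers
  x <- as.matrix(iris[, 1:4])        # illustrative data shipped with R
  # Fuzzy c-means with 3 clusters and fuzziness exponent m = 2
  fit <- e1071::cmeans(x, centers = 3, m = 2)
  print(head(fit$membership))        # one membership weight per cluster
}
```

Each row of fit$membership gives the degree to which an observation belongs to each cluster; fit$cluster holds the closest hard assignment.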
Re: [R] FOR loop with statistical analysis for microarray data
If you had provided example data (y1 and y2 in the loop), you might already have received specific help. A few things in your loop look suspicious. fc and pv are vectors, and in each iteration you assign to them twice: first you overwrite the whole vector, then you set a single element. That may be causing your problems.

Weidong Gu

On Wed, Oct 26, 2011 at 4:56 PM, Seb seba@gmail.com wrote:
> hi all
>
> i started recently using R and i found myself stuck when i try to analyze microarray data. i use the affy package to obtain the intensities of the probes, i have two CTRs and two treated.
>
>            HG.U133A.Experiment1.CEL  HG.U133A.Experiment2.CEL  HG.U133A_Control1.CEL  HG.U133A_Control2.CEL
> 1007_s_at  2156.23115                467.75615                 364.60615              362.11865
> 1053_at    88.76368                  93.58436                  438.49365              357.75615
> 117_at     144.00743                 101.26120                 95.7                   107.01623
> 121_at     551.36865                 639.45615                 456.66865              435.95615
> 1255_g_at  65.33164                  18.39570                  14.22565               20.74632
> 1294_at    106.19083                 169.69369                 78.15722               81.14689
>
> i divided the first two columns in two data.frames to divide Experim and CTRs then, i created a FOR loop to create a vector per each row containing a vector with two values per each gene and i wanted to do a Wilcox.test to obtain the significant genes..BUT i get a list of NULL like you can see here ..the first row works but then i get NULL down till the end of the array...
>
>                 fc         pv
> [1,] 1007_s_at  -20.248    0.4664612
> [2,] 1053_at    -344.7132  NULL
> [3,] 117_at     NULL       NULL
> [4,] 121_at     NULL       NULL
> [5,] 1255_g_at  NULL       NULL
> [6,] 1294_at    NULL       NULL
>
> the script i used is:
> ===
> fc=0
> pv=0
> for (i in 1:nrow(data)) {
>   v1= c(y1[i,1], y1[i,2])
>   v2= c(y2[i,1], y2[1,2])
>   fc=v1-v2
>   w=t.test(v1,v2)
>   pv=w$p.value
>   fc[i]= w[1]
>   pv[i]= w[2]
> }
> results = cbind(row.names(y1), fc, pv)
> head(results)
>
> what did i do wrong? i can't find a way around this!!!
>
> thanks so much!!!
> Seb
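
A minimal sketch of how the loop could be restructured along the lines suggested in the reply: preallocate fc and pv once, then fill exactly one element per iteration. The data frames y1 and y2 are rebuilt here from the six rows quoted in the post. Treating fc as the difference of group means is an assumption, since the post does not define it. The sketch also uses y2[i, ] in place of the y2[1,2] in the posted script, which looks like a typo.

```r
# Intensities copied from the quoted post (rows = probe sets).
ids <- c("1007_s_at", "1053_at", "117_at", "121_at", "1255_g_at", "1294_at")
y1 <- data.frame(                      # the two treated ("Experiment") arrays
  e1 = c(2156.23115, 88.76368, 144.00743, 551.36865, 65.33164, 106.19083),
  e2 = c(467.75615, 93.58436, 101.26120, 639.45615, 18.39570, 169.69369),
  row.names = ids)
y2 <- data.frame(                      # the two control arrays
  c1 = c(364.60615, 438.49365, 95.7, 456.66865, 14.22565, 78.15722),
  c2 = c(362.11865, 357.75615, 107.01623, 435.95615, 20.74632, 81.14689),
  row.names = ids)

n  <- nrow(y1)
fc <- numeric(n)                       # one slot per gene, allocated once
pv <- numeric(n)

for (i in seq_len(n)) {
  v1 <- unlist(y1[i, ])                # both treated values for gene i
  v2 <- unlist(y2[i, ])                # both control values for gene i
  fc[i] <- mean(v1) - mean(v2)         # assumed meaning of "fc"
  pv[i] <- t.test(v1, v2)$p.value      # extract the number, not a list element
}

results <- data.frame(probe = ids, fc = fc, pv = pv)
print(results)
```

Using w[1] and w[2] as in the posted script picks list elements of the test object (the statistic and the degrees of freedom), which is one reason the original results could not be stored in a numeric vector. With only two replicates per group, any such test has very little power.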
Re: [R] Example(chron) doesn't work
It works with Rgui vanilla, R version 2.13.1. I'll check it again when I install R version 2.13.2. Many thanks!

> "C:\\Program Files\\R\\R-2.13.1\\bin\\x64\\Rgui.exe --vanilla"
[1] "C:\\Program Files\\R\\R-2.13.1\\bin\\x64\\Rgui.exe --vanilla"
> library(chron)
Warning message:
package 'chron' was built under R version 2.13.2
> example(chron)

chron> dts <- dates(c("02/27/92", "02/27/92", "01/14/92",
chron+                "02/28/92", "02/01/92"))

chron> dts
[1] 02/27/92 02/27/92 01/14/92 02/28/92 02/01/92

chron> # [1] 02/27/92 02/27/92 01/14/92 02/28/92 02/01/92
chron> tms <- times(c("23:03:20", "22:29:56", "01:03:30",
chron+                "18:21:03", "16:56:26"))

chron> tms
[1] 23:03:20 22:29:56 01:03:30 18:21:03 16:56:26

chron> # [1] 23:03:20 22:29:56 01:03:30 18:21:03 16:56:26
chron> x <- chron(dates = dts, times = tms)

chron> x
[1] (02/27/92 23:03:20) (02/27/92 22:29:56) (01/14/92 01:03:30)
[4] (02/28/92 18:21:03) (02/01/92 16:56:26)

chron> # [1] (02/27/92 23:03:19) (02/27/92 22:29:56) (01/14/92 01:03:30)
chron> # [4] (02/28/92 18:21:03) (02/01/92 16:56:26)
chron>
chron> # We can add or subtract scalars (representing days) to dates or
chron> # chron objects:
chron> c(dts[1], dts[1] + 10)
[1] 02/27/92 03/08/92

chron> # [1] 02/27/92 03/08/92
chron> dts[1] - 31
[1] 01/27/92

chron> # [1] 01/27/92
chron>
chron> # We can substract dates which results in a times object that
chron> # represents days between the operands:
chron> dts[1] - dts[3]
Time in days:
[1] 44

chron> # Time in days:
chron> # [1] 44
chron>
chron> # Logical comparisons work as expected:
chron> dts[dts > "01/25/92"]
[1] 02/27/92 02/27/92 02/28/92 02/01/92

chron> # [1] 02/27/92 02/27/92 02/28/92 02/01/92
chron> dts > dts[3]
[1]  TRUE  TRUE FALSE  TRUE  TRUE

chron> # [1]  TRUE  TRUE FALSE  TRUE  TRUE
chron>
chron> # Summary operations which are sensible are permitted and work as
chron> # expected:
chron> range(dts)
[1] 01/14/92 02/28/92

chron> # [1] 01/14/92 02/28/92
chron> diff(x)
Time in days:
[1]  -0.02319444 -44.89335648  45.72052083 -27.05876157

chron> # Time in days:
chron> # [1]  -0.02319444 -44.89335648  45.72052083 -27.05876157
chron> sort(dts)[1:3]
[1] 01/14/92 02/01/92 02/27/92

chron> # [1] 01/14/92 02/01/92 02/27/92
chron>
chron>
Re: [R] SpatialLines
In addition to which, R-sig-geo would be a better place to ask.

-Don

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062

On 10/26/11 10:39 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote:

> On 26/10/2011 1:11 PM, Mark Newcomb wrote:
>> I'm hoping to use R for spatial analysis. In working through examples in Chapt. 4 of Applied Spatial Data Analysis with R I've come across the following error in trying to plot lines with the meuse data set. The text is verbatim from the book.
>>
>> m.sl <- SpatialLines(list(Lines(list(Line(cc)))))
>> Error in Lines(list(Line(cc))) : Single ID required
>>
>> What does "Single ID required" mean?
>
> That message is coming from a contributed package, not from base R. You should say what package you're using, and you may need to contact the author or maintainer of it to get an answer.
>
> Duncan Murdoch
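
For context, a hedged sketch of what the error most likely refers to, assuming the package involved is sp (the thread itself never names it). In sp, Lines() takes a list of Line objects plus a single character ID, so the call in the post fails because no ID is supplied. The coordinates below are hypothetical stand-ins for the cc object from the book, and the code only runs if sp is installed.

```r
m.sl <- NULL
if (requireNamespace("sp", quietly = TRUE)) {
  # Hypothetical coordinates standing in for `cc` from the book example.
  cc <- cbind(x = c(0, 1, 2), y = c(0, 1, 0))

  # Lines() needs one ID string identifying this set of line segments.
  m.sl <- sp::SpatialLines(list(sp::Lines(list(sp::Line(cc)), ID = "1")))
  print(class(m.sl))
}
```

The ID value "1" is arbitrary; it only has to be unique among the Lines objects passed to SpatialLines().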
[R] Webscraping - How to Scrape Out Text Into R As If Copied Pasted From Webpage?
Greetings,

I am trying to get all of the text from a web page as if I selected all on the page, pasted into a text file, and then read in the text file with read.csv().

# this is the actual page I'm trying to acquire text from:
web.pg <- readLines("http://www.airweb.org/?page=574")
# then parsed in hopes of an easier structure to work with:
web.pg <- htmlTreeParse(file=web.pg, ignoreBlanks=TRUE)

Now I have a lovely html tree, but don't know the best way to get just the text components (job descriptions, job titles, etc...) as they appear on the web site. I'd like to do a little text mining and make a wordcloud using the text.

Can anybody suggest a method to achieve this result?

Thank you,

Gary R. Moser
Institutional Research Analyst
Heald College
p - 415.808.1533
f - 415.808.1598
gary_mo...@heald.edu
Re: [R] Webscraping - How to Scrape Out Text Into R As If Copied Pasted From Webpage?
Use an XPath query:

web.pg <- htmlTreeParse(file=web.pg, ignoreBlanks=TRUE, useInternalNodes = TRUE)
# Job title
xpathApply(web.pg, "//span[@class='normal']//b", xmlValue)

On Wed, Oct 26, 2011 at 9:36 PM, Moser, Gary gary_mo...@heald.edu wrote:
> Greetings,
>
> I am trying to get all of the text from a web page as if I selected all on the page, pasted into a text file, and then read in the text file with read.csv().
>
> # this is the actual page I'm trying to acquire text from:
> web.pg <- readLines("http://www.airweb.org/?page=574")
> # then parsed in hopes of an easier structure to work with:
> web.pg <- htmlTreeParse(file=web.pg, ignoreBlanks=TRUE)
>
> Now I have a lovely html tree, but don't know the best way to get just the text components (job descriptions, job titles, etc...) as they appear on the web site. I'd like to do a little text mining and make a wordcloud using the text.
>
> Can anybody suggest a method to achieve this result?
>
> Thank you,
>
> Gary R. Moser
> Institutional Research Analyst
> Heald College
> p - 415.808.1533
> f - 415.808.1598
> gary_mo...@heald.edu

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
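
A small self-contained sketch of the XPath approach, assuming the XML package is installed. To avoid depending on the live page, it parses a made-up HTML string whose structure (a span with class 'normal' containing a bold job title) is an assumption modelled on the query above, not the real page. The second query, which collects every text node under body, is one possible way to approximate "select all and copy".

```r
titles    <- NULL
page_text <- NULL
if (requireNamespace("XML", quietly = TRUE)) {
  # Hypothetical page fragment; the real page layout may differ.
  html <- paste0(
    "<html><body>",
    "<span class='normal'><b>Research Analyst</b> Full-time position.</span>",
    "<p>Apply by email.</p>",
    "</body></html>")

  doc <- XML::htmlParse(html, asText = TRUE)

  # Job titles: bold text inside spans of class 'normal'
  titles <- XML::xpathSApply(doc, "//span[@class='normal']//b", XML::xmlValue)

  # All visible text: every text node below <body>, trimmed, blanks dropped
  page_text <- XML::xpathSApply(doc, "//body//text()", XML::xmlValue)
  page_text <- trimws(page_text)
  page_text <- page_text[nzchar(page_text)]

  print(titles)
  print(page_text)
}
```

The resulting character vector can be pasted together or passed on to a text-mining package for the word cloud.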
[R] Consistant test for NAs in a factor when exclude = NULL?
Dear folks,

Is there a function to correctly find (and count) the NAs in a factor when exclude=NULL, regardless of whether their origin is in the original data or by subsequent assignment?

In example number 1 below, where NAs are assigned by is.na()<-, testing the factor with is.na() finds the correct number of NAs. In example number 2, where the NAs are from the data, neither is.na(), ==NA, nor =="NA" correctly identifies the NAs. In example number 3, which mixes NAs from assignment with NAs from data, is.na does not even find the NAs created by assignment, as it did in example 1.

I'm running R 2.13.2 on Windows XP with Service Pack 3.

Any assistance would be greatly appreciated.

Appreciatively,
andrewH

Example #1
# Origin: is.na()<-   Exclude: NULL
> KK <- factor(c("A","A","B","B","C","C"), exclude=NULL)
> KK[KK=="C"]
[1] C C
Levels: A B C
> is.na(KK[KK=="C"]) <- TRUE
> KK
[1] A    A    B    B    <NA> <NA>
Levels: A B C
> levels(KK)
[1] "A" "B" "C"
> levels(KK)[KK]
[1] "A" "A" "B" "B" NA  NA
> KK==NA
[1] NA NA NA NA NA NA
> sum(KK==NA)
[1] NA
> KK=="NA"
[1] FALSE FALSE FALSE FALSE    NA    NA
> sum(KK=="NA")
[1] NA
> is.na(KK)
[1] FALSE FALSE FALSE FALSE  TRUE  TRUE
> sum(is.na(KK))
[1] 2

Example #2
# Origin: data   Exclude: NULL
> GG <- factor(c("A","A","B","B", NA, NA), exclude=NULL)
> GG
[1] A    A    B    B    <NA> <NA>
Levels: A B <NA>
> levels(GG)
[1] "A" "B" NA
> levels(GG)[GG]
[1] "A" "A" "B" "B" NA  NA
> GG==NA
[1] NA NA NA NA NA NA
> sum(GG==NA)
[1] NA
> GG=="NA"
[1] FALSE FALSE FALSE FALSE FALSE FALSE
> sum(GG=="NA")
[1] 0
> is.na(GG)
[1] FALSE FALSE FALSE FALSE FALSE FALSE
> sum(is.na(GG))

Example #3
> MM <- factor(c("A","A","B","B","C","C", NA), exclude=NULL)
> is.na(MM[MM=="C"]) <- TRUE
> MM
[1] A    A    B    B    <NA> <NA> <NA>
Levels: A B C <NA>
> levels(MM)
[1] "A" "B" "C" NA
> levels(MM)[MM]
[1] "A" "A" "B" "B" NA  NA  NA
> MM==NA
[1] NA NA NA NA NA NA NA
> sum(MM==NA)
[1] NA
> MM=="NA"
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> sum(MM=="NA")
[1] 0
> is.na(MM)
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> sum(is.na(MM))
[1] 0
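
One possible answer, sketched under the assumption that "NA" should mean either a missing code or membership in an NA level. The transcripts above already show that levels(f)[f] returns NA in both cases, so testing the character values instead of the integer codes gives a consistent count. The helper name count_na is made up for this sketch.

```r
# Count entries that are NA either by code or by belonging to an NA level.
# as.character(f) is equivalent to levels(f)[f].
count_na <- function(f) sum(is.na(as.character(f)))

# Example 1: NAs created by assignment
KK <- factor(c("A", "A", "B", "B", "C", "C"), exclude = NULL)
is.na(KK[KK == "C"]) <- TRUE

# Example 2: NAs present in the data, kept as a level
GG <- factor(c("A", "A", "B", "B", NA, NA), exclude = NULL)

# Example 3: both kinds mixed
MM <- factor(c("A", "A", "B", "B", "C", "C", NA), exclude = NULL)
is.na(MM[MM == "C"]) <- TRUE

print(c(KK = count_na(KK), GG = count_na(GG), MM = count_na(MM)))
```

The difference from is.na() is that is.na() on a factor inspects the underlying integer codes, and an entry that points at an NA level has a perfectly valid code.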