Re: [R] network package in R
Weiwei,

I know this is not a Bioconductor-specific question, but you may also want to post it on the BioC help list, as there may be more people there who understand what you want to do. I'm also curious about the answers to your question.

...Tao

- Original Message
From: Weiwei Shi helprh...@gmail.com
To: r-h...@stat.math.ethz.ch
Sent: Fri, May 27, 2011 2:32:23 PM
Subject: [R] network package in R

Hi there,

I need a network builder that can change the node size and color; I am not sure whether the network package in R can do this. The other functions I wanted have been found in that package. BTW, if there is another package in R relating to this, please suggest it too.

Thanks,
Weiwei

--
Weiwei Shi, Ph.D
Research Scientist
Did you always know? No, I did not. But I believed... ---Matrix III

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] Sobol Sequences - Convergence Issues
Hi,

My question is regarding the R package randtoolbox. I have been testing it for a few days for generating Sobol sequences. As per my findings, the numbers for higher dimensions (> 30) are prone to clustering. This might be due to a bad choice of initialization numbers, also called free direction numbers (no, I am not talking about the seed value).

In most textbook algorithms they are chosen as 2^(b-p), where b is the number of bits and p indexes the p'th direction number, with p running from 1 to g (g being the degree of the polynomial used for that dimension of the Sobol sequence). I hope the notation is clear. If the direction numbers have been chosen this way, the Sobol sequence creates a problem.

Can this be modified? I want to specify my own direction numbers. I checked the code and found that it points to a Fortran code file. Any suggestions?

--
Animesh Saxena
www.quantanalysis.in
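For readers unfamiliar with the term, here is a minimal base-R sketch of how direction numbers drive a one-dimensional Sobol sequence. It is not randtoolbox's Fortran code: the function name `sobol_1d`, its arguments, and the 30-bit word size are assumptions made purely for illustration. With every initial number m_p equal to 1, the direction numbers are v_p = 2^(b-p) as described in the post, and the output is the van der Corput sequence in Gray-code order.

```r
# Hypothetical helper, not part of randtoolbox.
# m[p] are the initial ("free") direction numbers; each must be odd and < 2^p.
sobol_1d <- function(n, m = rep(1L, 30L), bits = 30L) {
  p <- seq_len(bits)
  v <- as.integer(m * 2^(bits - p))   # direction numbers as b-bit integers
  x <- 0L
  out <- numeric(n)
  for (i in seq_len(n)) {
    # j = position (1-based) of the rightmost zero bit of i - 1
    k <- i - 1L
    j <- 1L
    while (k %% 2L == 1L) {
      k <- k %/% 2L
      j <- j + 1L
    }
    x <- bitwXor(x, v[j])             # Gray-code update
    out[i] <- x / 2^bits              # scale to [0, 1)
  }
  out
}

first_points <- sobol_1d(4)
print(first_points)   # 0.5 0.75 0.25 0.375
```

Passing a different `m` changes the sequence, which is the kind of customisation the question is about; whether randtoolbox exposes such an argument is not something this sketch can answer.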
Re: [R] Arrange a multi-level list to a one-level list
Hi, Phil,

Yes. That's what I am looking for. Thank you so much.

Lisa

--
View this message in context: http://r.789695.n4.nabble.com/Arrange-a-multi-level-list-to-a-one-level-list-tp3556500p3556601.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] help with barplot
Thanks, ggplot is on my list of things to learn before Hadley comes here to the Bay Area to give a session on interactive graphics in R.

On Fri, May 27, 2011 at 10:29 PM, Joshua Wiley jwiley.ps...@gmail.com wrote:

Hi Steven,

This is not, strictly speaking, the answer to your question (hopefully Tom already answered that). Rather, it is the answer to questions you *might* have asked (and perhaps one of them will be one you wished you had asked).

Barplots have a low data:ink ratio...you are using an entire plot to convey 8 means. A variety of alternatives exist. As a minimal first step, you could just use points to show the means and skip all the wasted bar space, and you might add error bars (A). You could also use boxplots to give your viewers (or just yourself) a sense of the distribution along with the medians (B). Another elegant option is violin plots. These are kind of like (exactly like?) mirrored density plots. A measure of central tendency is not explicitly shown, but the *entire* distribution and range is shown (C).

Cheers,
Josh

(P.S. I hit send too soon before and sent you an offlist message with PDF examples)

## Create your data
DF <- data.frame(
  Incidents = factor(rep(c("a", "b", "d", "e"), each = 25)),
  Months = factor(rep(1:2, each = 10)),
  Time = rnorm(100))

## Load required packages
require(ggplot2)
require(Hmisc)

## Option A
ggplot(DF, aes(x = Incidents, y = Time, colour = Months)) +
  stat_summary(fun.y = mean, geom = "point",
    position = position_dodge(width = .90), size = 3) +
  stat_summary(fun.data = "mean_cl_normal", geom = "errorbar",
    position = "dodge")

## Option B
ggplot(DF, aes(x = Incidents, y = Time, fill = Months)) +
  geom_boxplot(position = position_dodge(width = .8))

## Option C
ggplot(DF, aes(x = Time, fill = Months)) +
  geom_ribbon(aes(ymax = ..density.., ymin = -..density..),
    alpha = .2, stat = "density") +
  facet_grid( ~ Incidents) +
  coord_flip()

## Option C altered
ggplot(DF, aes(x = Time, fill = Months)) +
  geom_ribbon(aes(ymax = ..density.., ymin = -..density..),
    alpha = .2, stat = "density") +
  facet_grid( ~ Incidents + Months) +
  scale_y_continuous(name = "density", breaks = NA, labels = NA) +
  coord_flip()

On Fri, May 27, 2011 at 3:08 PM, steven mosher mosherste...@gmail.com wrote:

Hi,

I'm really struggling with barplot. I have a data.frame with 3 columns. The first column represents an incident type. The second column represents a month. The third column represents a time.

Code for a sample data.frame:

incidents <- rep(c('a','b','d','e'), each = 25)
months <- rep(c(1,2), each = 10)
times <- rnorm(100)
# make my sample data
DF <- data.frame(Incidents=as.factor(incidents), Months=as.factor(months), Time=times)
# now calculate a mean for the by groups of incident type and month
pivot <- aggregate(DF$Time, by=list(Incidents=DF$Incidents, Months=DF$Months), FUN=mean, simplify=TRUE)

What I want to create is a bar plot where I have groupings by incident type (a, b, d, e) and within each group I have the months in order. So
group 1 would be Type a; month 1,2;
group 2 would be Type b; month 1,2;
group 3 would be Type d; month 1,2;
group 4 would be Type e; month 1,2;

I know barplot is probably the right function, but I'm a bit lost on how to specify groupings etc.

TIA

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
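For the grouped bar plot originally asked about, one possible base-graphics sketch is to reshape the means into a matrix and call barplot() with beside = TRUE. This is only an illustration built on the sample data from the thread; the object name `means` and the axis labels are assumptions, and set.seed() is added just to make the sketch repeatable.

```r
set.seed(1)  # only so the sketch is repeatable
DF <- data.frame(
  Incidents = factor(rep(c("a", "b", "d", "e"), each = 25)),
  Months    = factor(rep(1:2, each = 10)),
  Time      = rnorm(100))

# Rows = months, columns = incident types; each cell is a group mean
means <- with(DF, tapply(Time, list(Months, Incidents), mean))

# beside = TRUE draws one cluster of bars per column (incident type),
# with the rows (months) side by side inside each cluster
barplot(means, beside = TRUE, legend.text = TRUE,
        args.legend = list(title = "Month"),
        xlab = "Incident type", ylab = "Mean time")
```

The column order of the matrix sets the order of the groups, and the row order sets the order of the months within each group.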
[R] reduce printing accuracy
Dear all,

I would like to print a few values with fewer digits than the default. How can I reduce how many digits are printed? (Note: I do not mean the actual stored precision, only what is shown on screen.)

Best Regards
Alex
Re: [R] reduce printing accuracy
Hi Alex,

See ?options, in particular the digits section. You can (per session) edit this by typing:

options(digits = 3)

or whatever number you want. To make this more permanent, create a .Rprofile that alters the default digits.

HTH,
Josh

On Sat, May 28, 2011 at 12:36 AM, Alaios ala...@yahoo.com wrote:

Dear all, I would like to print a few values with less digits than the default. How I can reduce how many digits are printed? (warning: not the real integer resolution but what is shown in screen)

Best Regards
Alex

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
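As a small sketch of the options mentioned in this thread: options(digits = ) changes printing for the whole session, while print(), format(), and round() only affect the value at hand. Variable names are made up for the illustration, and none of these calls change the stored precision except round().

```r
x <- pi

print(x, digits = 3)              # one-off: only this call prints 3 significant digits
shown <- format(x, digits = 3)    # same idea, but returns a character string
rounded <- round(x, 2)            # actually changes the value, not just the display

old <- options(digits = 3)        # session-wide default; previous setting is returned
print(x)
options(old)                      # restore the previous default
```

Saving the return value of options() and passing it back, as done here, is a convenient way to undo a temporary change.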
Re: [R] Help with Rmpi install
Looks like you did not find the posting guide (see the footer of this message). For compilation errors in contributed packages this is not the correct list.

As for

  If I knew more about R, I might know how to keep that Rmpi.so file around so that I could see if it is linked to a library that is missing from my LD_LIBRARY_PATH or something. How do you stop R from deleting the files that didn't load properly?

Try R CMD INSTALL --help (surely the standard way to find out about options!): you seem to be looking for --no-test-load or --no-clean-on-error. But if you unpack the tarball before installation, the things built in the src directory are not deleted (unless you ask for them to be).

On Fri, 27 May 2011, Brian Mendenhall wrote:

Hello R-help!

I am a systems administrator for the University of Southern California. I take care of its general-purpose research cluster, and have recently been asked to provide access to a parallelized R platform. I do not have any previous experience using R, and have never had to do anything more than 'yum -y install R'.

We have a Myrinet MPI network, and use mpich1 as our standard compiling environment, which will link programs with libmyriexpress.so as well as libmpich.so. Our Myrinet driver package is Myrinet Express (MX) 1.2.12big. (I believe it was custom-built for us by Myricom.)

The R package that my predecessor built was version 2.6.1. I have since installed 2.13.0 in an NFS exported shared software directory, and intend for the Rmpi package to be installed there as well. The compiler is gcc 4.3.3, the mpich version is 1.2.7..7

The install command and its output are:

[brianm@hpc-string R]$ /usr/usc/R/2.13.0/bin/R CMD INSTALL Rmpi_0.5-9.tar.gz --configure-args="--prefix=/usr/usc/R/2.13.0 --with-Rmpi-type=MPICH --with-Rmpi-include=/usr/usc/mpich/default/default/include --with-Rmpi-libpath=/usr/usc/mpich/default/default/lib64 --with-mpi=/usr/usc/mpich/default/default"
* installing to library ‘/auto/usc/R/2.13.0/lib64/R/library’
* installing *source* package ‘Rmpi’ ...
checking for openpty in -lutil... no
checking for main in -lpthread... no
configure: creating ./config.status
config.status: creating src/Makevars
** libs
gcc -std=gnu99 -I/usr/usc/R/2.13.0/lib64/R/include -DPACKAGE_NAME=\"\" -DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\" -DPACKAGE_BUGREPORT=\"\" -I/usr/usc/mpich/default/default/include -DMPICH -I/usr/local/include -fpic -g -O2 -c RegQuery.c -o RegQuery.o
gcc -std=gnu99 -I/usr/usc/R/2.13.0/lib64/R/include -DPACKAGE_NAME=\"\" -DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\" -DPACKAGE_BUGREPORT=\"\" -I/usr/usc/mpich/default/default/include -DMPICH -I/usr/local/include -fpic -g -O2 -c Rmpi.c -o Rmpi.o
gcc -std=gnu99 -I/usr/usc/R/2.13.0/lib64/R/include -DPACKAGE_NAME=\"\" -DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\" -DPACKAGE_BUGREPORT=\"\" -I/usr/usc/mpich/default/default/include -DMPICH -I/usr/local/include -fpic -g -O2 -c conversion.c -o conversion.o
gcc -std=gnu99 -I/usr/usc/R/2.13.0/lib64/R/include -DPACKAGE_NAME=\"\" -DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\" -DPACKAGE_BUGREPORT=\"\" -I/usr/usc/mpich/default/default/include -DMPICH -I/usr/local/include -fpic -g -O2 -c internal.c -o internal.o
gcc -std=gnu99 -shared -L/usr/local/lib64 -o Rmpi.so RegQuery.o Rmpi.o conversion.o internal.o -L/usr/usc/mpich/default/default/lib64 -lmpich
installing to /auto/usc/R/2.13.0/lib64/R/library/Rmpi/libs
** R
** demo
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices ...
** testing if installed package can be loaded
Error in dyn.load(file, DLLpath = DLLpath, ...) :
  unable to load shared object '/auto/usc/R/2.13.0/lib64/R/library/Rmpi/libs/Rmpi.so':
  /auto/usc/R/2.13.0/lib64/R/library/Rmpi/libs/Rmpi.so: undefined symbol: MX_ERRORS_ARE_FATAL
Error: loading failed
In addition: Warning message:
.Last.lib failed in detach() for 'Rmpi', details:
  call: dyn.unload(file.path(libpath, "libs", paste("Rmpi", .Platform$dynlib.ext,
  error: shared object '/auto/usc/R/2.13.0/lib64/R/library/Rmpi/libs/Rmpi.so' was not loaded
Execution halted
ERROR: loading failed
* removing ‘/auto/usc/R/2.13.0/lib64/R/library/Rmpi’

I've tried Google, but didn't get very far. The only information I found was related to mpich2... Is it that I need to move towards mpich2? All of the testing done by my predecessor indicated that mpich2 was much slower than mpich1, so we never put much time into installing it.

If I knew more about R, I might know how to keep that Rmpi.so file around so that I could see if it is linked to a library that is missing from my LD_LIBRARY_PATH or something. How do you stop R from deleting the files that didn't load properly?

Any help would be greatly appreciated.

Regards,
- Brian Mendenhall
Linux/HPCC Administrator
University of Southern California
Re: [R] reduce printing accuracy
Thanks a lot :)

--- On Sat, 5/28/11, Joshua Wiley jwiley.ps...@gmail.com wrote:

From: Joshua Wiley jwiley.ps...@gmail.com
Subject: Re: [R] reduce printing accuracy
To: Alaios ala...@yahoo.com
Cc: R-help@r-project.org
Date: Saturday, May 28, 2011, 8:48 AM

Hi Alex,

See ?options, in particular the digits section. You can (per session) edit this by typing:

options(digits = 3)

or whatever number you want. To make this more permanent, create a .Rprofile that alters the default digits.

HTH,
Josh

On Sat, May 28, 2011 at 12:36 AM, Alaios ala...@yahoo.com wrote:

Dear all, I would like to print a few values with less digits than the default. How I can reduce how many digits are printed? (warning: not the real integer resolution but what is shown in screen)

Best Regards
Alex

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
Re: [R] Fwd: Opening R in 64-bit version by default
On Fri, 27 May 2011, David Winsemius wrote:

On May 27, 2011, at 8:56 PM, Duncan Murdoch wrote:

but really, it's just a bug. If you manually change that registry key, things are fine. If you ask Windows dialogs to do it for you, it fails. Most people pay infinitely more to Microsoft for Windows than they pay to R Core for R. I hope that's also the ratio of their complaints to Microsoft about this bug to their complaints to us about R.

Fortune nomination.

Added on R-Forge.

thx,
Z

--
David Winsemius, MD
West Hartford, CT
Re: [R] barplot groups of different size i.e. height is NOT a matrix
Thanks for the help! In the end, I chose to use ggplot, which makes it really simple to create different panels. And I am replacing my bars with simple points.

Victor
[R] ggplot pale colors
Hello,

I am new to ggplot and I observed a strange behavior. I want to display two groups of points, each group with a different color, but I encountered a problem with the colors.

Here is a first example:

dataset <- data.frame(Main = c("A", "A", "B", "B"), Detail = c("b", "c", "1", "2"), resp = runif(4, min = 0.5, max = 1))
ggplot(dataset, aes(x = Detail, y = resp)) +
  facet_grid(. ~ Main, scales = "free_x") +
  geom_point(aes(size = 6, shape = c(16,16,15,15)), colour = "blue") +
  geom_hline(aes(yintercept = 0.25), colour = 'blue', size = 2)

With this code all the points are blue (like the line below them).

But if I try the following code, where my goal is to have the points on the left blue and the ones on the right red, a problem appears:

dataset <- data.frame(Main = c("A", "A", "B", "B"), Detail = c("b", "c", "1", "2"), resp = runif(4, min = 0.5, max = 1))
ggplot(dataset, aes(x = Detail, y = resp)) +
  facet_grid(. ~ Main, scales = "free_x") +
  geom_point(aes(size = 6, shape = c(16,16,15,15), colour = c("blue","blue","red","red"))) +
  geom_hline(aes(yintercept = 0.25), colour = 'blue', size = 2)

The points have different colors, but those colors are pale (dull). You can see it by comparing the blue of the points to the blue of the line. I guess I am doing it wrong, but I'm stuck. Do you have suggestions?

An additional question: I want to add text along the blue line (which is a reference), but I did not understand what geom_text was expecting.

Thanks for your help!
Victor
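A likely explanation, offered as a sketch rather than a reply from the thread: anything placed inside aes() is treated as data and mapped through ggplot2's default discrete colour scale, so the strings "blue" and "red" are just two category labels that receive the default hues. Constants placed outside aes() are used literally. One way to have data values interpreted as actual colour names is scale_colour_identity(). The column name `col` is an assumption added for the example.

```r
library(ggplot2)

dataset <- data.frame(
  Main   = c("A", "A", "B", "B"),
  Detail = c("b", "c", "1", "2"),
  resp   = runif(4, min = 0.5, max = 1),
  col    = c("blue", "blue", "red", "red"),   # hypothetical colour column
  stringsAsFactors = FALSE)

p <- ggplot(dataset, aes(x = Detail, y = resp)) +
  facet_grid(. ~ Main, scales = "free_x") +
  geom_point(aes(colour = col), size = 6) +   # size is a constant, so it stays outside aes()
  scale_colour_identity() +                   # use the strings in `col` as literal colours
  geom_hline(yintercept = 0.25, colour = "blue")

print(p)
```

An alternative with the same effect is to map colour to `Main` and supply the colours through scale_colour_manual(values = ...).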
[R] ftable (accidentally?) increases column widths
Dear expeRts,

I typically replace NA (but also other) entries in an ftable by a character string containing LaTeX code for later use. I realized that replacing an entry in one column also affects other columns; they are all displayed with the same column width. Since the character string can be long, this is a bit annoying. Is there any way to prevent this so that there are individual column widths? Maybe one can use ... to tell ftable to use individual column widths?

Minimal example:

(ft <- ftable(Titanic))
ft[1,2] <- "***"
ft

Cheers,
Marius
Re: [R] Survival: pyears and ratetable: expected events
Hi, thanks.

The conventional method is solid (cross multiplication of patient years at each age, year, and gender with the corresponding risk of death at each age, year, and gender). I get about 26 expected deaths (verified by many different sources).

Now what I am trying to do is get the same answer with the pyears function using a ratetable, and I am not being successful at all. Could someone help me with the syntax? Let's assume for now I want to use the survexp.us ratetable. Or help me with how my variables should be formatted (days? years? scale? asDate? Julian date? days difference from an origin?) and entered in the pyears method? Or give me a similar working example that I can decompose and fit to my problem?

David, sorry, I meant survexp.us (the standard in the survival package).

Thanks to all,
JT

--
View this message in context: http://r.789695.n4.nabble.com/Survival-pyears-and-ratetable-expected-events-tp3553208p3557000.html
Sent from the R help mailing list archive at Nabble.com.
[R] Error loading workspaces after upgrade
Dear Members,

I upgraded R from 2.12.2 to 2.13.0 (binary) on my WinXP and now I can't load my workspaces. The error message is:

Error: object ‘BIC’ is not exported by 'namespace:nlme'

I tried to load 'nlme' before loading the workspace; it did not help... Any idea how to fix that?

Thank you in advance,
regards,
Peter
[R] ftable: how to replace NA and format entries without changing their mode?
Dear all,

another ftable problem, now related to formatC. One typically would like to format entries in an ftable (adjust digits, replace NA, ...) before format() is applied to convert the formatted ftable to an object which xtable can deal with. The output of xtable can then be used within a LaTeX table.

The problem I face is that the ftable entries [numeric] change their mode when one of the operations "adjust digits" or "replace NA" is applied. Here is a minimal example:

## first adjusting the format, then trying to remove NA
(ft <- ftable(Titanic)) # ftable
ft[1,1] <- NA # create an NA entry to show the behavior
is.numeric(ft) # => is numeric
ft. <- formatC(ft, digits=1, format="f") # adjust format
is.numeric(ft.) # => not numeric anymore => one cannot further use is.na() etc.
ft.[is.na(ft.)] <- "my.Command.To.Deal.With.NA" # does not work because is.na() does not find NA
ft. # (of course) still contains NA

## first remove NA, then trying to adjust the format
(ft <- ftable(Titanic)) # ftable
ft[1,1] <- NA
ft[is.na(ft)] <- "my.LaTeX.Code.To.Deal.With.NA"
is.character(ft) # => now character, adjusting the format of the numbers with formatC not possible anymore
ft
formatC(ft, digits=1, format="f") # (of course) not working anymore

How can I accomplish both (example) tasks without changing the mode of the ftable entries? Note: I would like to keep the ftable structure since this nicely converts to a LaTeX table later on.

Cheers,
Marius
Re: [R] ftable: how to replace NA and format entries without changing their mode?
On May 28, 2011, at 7:19 AM, Marius Hofert wrote:

Dear all,

another ftable problem, now related to formatC. One typically would like to format entries in an ftable (adjust digits, replace NA, ...) before format() is applied to convert the formatted ftable to an object which xtable can deal with. The output of xtable can then be used within a LaTeX table.

> 1/3
[1] 0.333
> options(digits=3)
> 1/3
[1] 0.333

The problem I face is that the ftable entries [numeric] change their mode when one of the operations "adjust digits" or "replace NA" is applied. Here is a minimal example:

## first adjusting the format, then trying to remove NA
(ft <- ftable(Titanic)) # ftable
ft[1,1] <- NA # create an NA entry to show the behavior
is.numeric(ft) # => is numeric
ft. <- formatC(ft, digits=1, format="f") # adjust format
is.numeric(ft.) # => not numeric anymore => one cannot further use is.na() etc.
ft.[is.na(ft.)] <- "my.Command.To.Deal.With.NA" # does not work because is.na() does not find NA
ft. # (of course) still contains NA

## first remove NA, then trying to adjust the format
(ft <- ftable(Titanic)) # ftable
ft[1,1] <- NA
ft[is.na(ft)] <- "my.LaTeX.Code.To.Deal.With.NA"
is.character(ft) # => now character, adjusting the format of the numbers with formatC not possible anymore
ft
formatC(ft, digits=1, format="f") # (of course) not working anymore

How can I accomplish both (example) tasks without changing the mode of the ftable entries? Note: I would like to keep the ftable structure since this nicely converts to a LaTeX table later on.

Cheers,
Marius

David Winsemius, MD
West Hartford, CT
Re: [R] continuous time AR(1)
On May 27, 2011, at 18:00, frossard victor wrote:

Dear R helpers,

I would like to model the temporal trend of biological remains in sediment cores. All samples are temporally auto-correlated and I would like to take this effect into account. Initially I thought that I could use AR(1) or ARIMA functions, but these functions only work with regular temporal intervals between samples. Hence I would like to use a continuous time AR(1) that allows irregular time intervals between samples. Unfortunately I can't find this function in any R package. Does anyone know if this function has already been implemented for R?

library(nlme)
help(corCAR1)
help(corExp)

You use these corStructs with gls/lme/nlme. As far as I remember, corCAR1 is equivalent to corExp without a nugget effect (modulo parametrization?), but you often want the nugget effect if some data are very close in time without actually being identical (since corCAR1 implies that the correlation is one for simultaneous observations).

I wouldn't know about continuous time AR of higher orders (I can't even guess what MA(n) might mean in continuous time).

-pd

Many thanks in advance.
Victor Frossard
PhD student.

--
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk Priv: pda...@gmail.com
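A minimal sketch of the corCAR1 suggestion, using simulated data because the thread contains none. The variable names, the simulated trend, and the parameter values are all assumptions made for the illustration; only gls() and corCAR1() come from the advice above.

```r
library(nlme)

set.seed(1)
n <- 60
t_obs <- sort(runif(n, 0, 100))          # irregularly spaced sampling times

# Simulate continuous-time AR(1) noise: the correlation between two
# observations is phi^(time gap), so it decays with distance in time
phi <- 0.8
e <- numeric(n)
e[1] <- rnorm(1)
for (i in 2:n) {
  rho <- phi^(t_obs[i] - t_obs[i - 1])
  e[i] <- rho * e[i - 1] + sqrt(1 - rho^2) * rnorm(1)
}
d <- data.frame(time = t_obs, y = 2 + 0.05 * t_obs + e)

# Linear trend with continuous-time AR(1) errors indexed by the actual times
fit_car1 <- gls(y ~ time, data = d,
                correlation = corCAR1(form = ~ time))
summary(fit_car1)
```

To try the nugget variant mentioned above, the correlation argument could be swapped for corExp(form = ~ time, nugget = TRUE); whether that fits better depends on the data.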
Re: [R] Error loading workspaces after upgrade
On 28.05.2011 12:28, Vereszki-Varga Péter wrote:

Dear Members,

I upgraded R from 2.12.2 to 2.13.0 (binary) on my WinXP and now I can't load my workspaces. The error message is:

Error: object ‘BIC’ is not exported by 'namespace:nlme'

I tried to load 'nlme' before loading workspace, it did not help... Any idea to fix that?

Yes: remove your workspace. What you see is caused by an intended change in R-2.13.0, where some BIC-related changes happened. If you have some other important objects in it, open it with R-2.12.2 and save the important objects again.

Best,
Uwe Ligges

Thank you in advance,
regards,
Peter
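The "save the important objects again" step could look roughly like the following sketch. It uses temporary files and made-up object names (`my_data`, `my_junk`) as stand-ins for a real workspace, and it does not reproduce the BIC error itself; in practice the loading part would be run in the R version that can still read the old file.

```r
# Stand-in for an old workspace holding two objects (placeholder names)
ws_file <- tempfile(fileext = ".RData")
my_data <- data.frame(x = 1:3)
my_junk <- "not needed"
save(my_data, my_junk, file = ws_file)

# Load into a separate environment so nothing in the session is overwritten
old <- new.env()
load(ws_file, envir = old)
ls(old)                                  # inspect what the workspace contains

# Re-save only the object worth keeping
keep_file <- tempfile(fileext = ".rds")
saveRDS(old$my_data, keep_file)

# Later, in the upgraded R session
restored <- readRDS(keep_file)
```

Saving objects one at a time keeps plain data apart from fitted-model objects that may carry references to package namespaces.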
Re: [R] ftable: how to replace NA and format entries without changing their mode?
Received your offlist question and see that I did not understand your request. See below for another alternative.

On May 28, 2011, at 7:53 AM, David Winsemius wrote:

On May 28, 2011, at 7:19 AM, Marius Hofert wrote:

Dear all,

another ftable problem, now related to formatC. One typically would like to format entries in an ftable (adjust digits, replace NA, ...) before format() is applied to convert the formatted ftable to an object which xtable can deal with. The output of xtable can then be used within a LaTeX table.

> 1/3
[1] 0.333
> options(digits=3)
> 1/3
[1] 0.333

The problem I face is that the ftable entries [numeric] change their mode when one of the operations "adjust digits" or "replace NA" is applied. Here is a minimal example:

## first adjusting the format, then trying to remove NA
(ft <- ftable(Titanic)) # ftable
ft[1,1] <- NA # create an NA entry to show the behavior
is.numeric(ft) # => is numeric
ft. <- formatC(ft, digits=1, format="f") # adjust format
is.numeric(ft.) # => not numeric anymore => one cannot further use is.na() etc.
# ft.[is.na(ft.)] <- "my.Command.To.Deal.With.NA" # does not work because is.na() does not find NA
ft. # (of course) still contains NA

If you want to replace an entry in a character-mode table whose value == "NA" (which is not a special missing value in that mode)

> is.na("NA")
[1] FALSE
> is.na(NA_character_)
[1] TRUE

then this should work:

> ft.[which(ft. == "NA")] <- "my.Command.To.Deal.With.NA"
> ft.
                   Survived No                         Yes
Class Sex    Age
1st   Male   Child          my.Command.To.Deal.With.NA 5.0
             Adult          118.0                      57.0
      Female Child          0.0                        1.0
             Adult          4.0                        140.0
2nd   Male   Child          0.0                        11.0
             Adult          154.0                      14.0
      Female Child          0.0                        13.0
             Adult          13.0                       80.0
3rd   Male   Child          35.0                       13.0
             Adult          387.0                      75.0
      Female Child          17.0                       14.0
             Adult          89.0                       76.0
Crew  Male   Child          0.0                        0.0
             Adult          670.0                      192.0
      Female Child          0.0                        0.0
             Adult          3.0                        20.0

Although this messes up the header alignment. At least it finds the NA.

--
David.

## first remove NA, then trying to adjust the format
(ft <- ftable(Titanic)) # ftable
ft[1,1] <- NA
# ft[is.na(ft)] <- "my.LaTeX.Code.To.Deal.With.NA"
is.character(ft) # => now character, adjusting the format of the numbers with formatC not possible anymore
ft
formatC(ft, digits=1, format="f") # (of course) not working anymore

How can I accomplish both (example) tasks without changing the mode of the ftable entries? Note: I would like to keep the ftable structure since this nicely converts to a LaTeX table later on.

Cheers,
Marius

David Winsemius, MD
West Hartford, CT
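Another way to get both tasks done, sketched here under the assumption that a character result is acceptable at the end, is to record where the NAs are while the table is still numeric, format once, and then overwrite those cells. The names `m`, `txt`, and the replacement string "n/a" are placeholders; the sketch works on the unclassed matrix and restores the ftable class afterwards so that it does not depend on any subsetting methods for ftable objects.

```r
ft <- ftable(Titanic)

m <- unclass(ft)              # plain matrix; row.vars/col.vars stay as attributes
m[1, 1] <- NA                 # create an NA entry, as in the example above

na_cells <- is.na(m)          # find the NAs while the table is still numeric

txt <- formatC(m, digits = 1, format = "f")   # character now; NA became the string "NA"
txt[na_cells] <- "n/a"        # placeholder for the LaTeX code to insert

class(txt) <- "ftable"        # restore the class so it prints as a flat table
print(txt)
```

The entries do change mode, which seems unavoidable once a number has to sit next to LaTeX code in the same table, but the order of operations removes the need to search for the string "NA" afterwards.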
[R] Nested design
Dear R-users,

I have the following problem. I have performed an experiment for which I gathered a lot of data which I now want to test. The problem is that I cannot find an appropriate test in R (I am a beginner), and perhaps someone can give me a hand.

This is what I have done: across three sites (Site), I have laid out five transects (Trans), meaning five transects in each site. In each transect I have five microhabitats (MH), which should be regarded as subplots (I think). In each transect, every MH has the same position (so they are not randomized). I now want to test the effect of Site and MH (nested in Trans) on my response variables.

This is what I do now:

model <- aov(Response ~ Site*MH + Error(Trans/MH))

I get the following output:

Error: Trans
     Df   Sum Sq  Mean Sq
Site  1 0.030294 0.030294

Error: Trans:MH
     Df  Sum Sq Mean Sq
Site  1 10.8367 10.8367
MH    3  0.2836  0.0945

Error: Within
          Df  Sum Sq Mean Sq F value    Pr(>F)
Site       2 0.92504 0.46252 11.7304 5.880e-05 ***
MH         4 1.86688 0.46672 11.8370 5.645e-07 ***
Site:MH    8 1.17041 0.14630  3.7105  0.001615 **
Residuals 54 2.12917 0.03943
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

How do I read this? Any help appreciated!

BTW: I also tried the lme function:

model <- lme(Response ~ Site*MH, random = ~1|Trans/MH)

but then the output is really complicated.

-
Dr. Bjorn JM Robroek
Ecology and Biodiversity Group
Institute of Environmental Biology, Utrecht University
Padualaan 8, 3584 CH Utrecht, The Netherlands
Email address: b.j.m.robr...@uu.nl
http://www.researcherid.com/rid/C-4379-2008
Tel: +31-(0)30-253 6091

--
View this message in context: http://r.789695.n4.nabble.com/Nested-design-tp3557404p3557404.html
Sent from the R help mailing list archive at Nabble.com.
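Purely as an illustration of how the Error() term splits the analysis into strata, here is a sketch on simulated data with the same layout (3 sites, 5 transects per site, 5 microhabitats per transect). The data, the `Transect` identifier, and the choice to treat each transect as the whole plot are all assumptions; this is not a recommendation for the real experiment. One thing it does show is that transects need labels that are unique across sites, otherwise transect 1 at site A and transect 1 at site B are treated as the same unit.

```r
set.seed(42)
d <- expand.grid(MH    = factor(1:5),
                 Trans = factor(1:5),
                 Site  = factor(c("A", "B", "C")))

# Unique label for each of the 15 physical transects
d$Transect <- interaction(d$Site, d$Trans, drop = TRUE)

# Simulated response: a random transect effect plus noise
transect_effect <- rnorm(nlevels(d$Transect))
d$Response <- transect_effect[as.integer(d$Transect)] + rnorm(nrow(d))

# Site varies between transects; MH and Site:MH vary within transects
fit <- aov(Response ~ Site * MH + Error(Transect), data = d)
summary(fit)
```

With this coding the summary has a "Transect" stratum, where Site is compared against transect-to-transect variation, and a "Within" stratum for MH and the Site:MH interaction.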
[R] remove , at the end of each line for all lines except the first line in a data frame
The question asks me to use gsub() and subsetting to remove the comma at the end of the line for all but the first value in a data frame. The first few lines look like this:
  [1] "2177663,-41,175,2678248,6021224,1840,5,25,17,,,6,," "2177691,-39.6,176.2,2784798,6173592,1843,7,8,5,30,,7.6,12,"
  [3] "2177754,-47,166,1977803,5333806,1846,7,13,6,20,,63,," "2177759,-41,172,2425856,6022664,1846,11,18,19,,,65,,"
  [5] "2177762,-41,174.5,2636191,6022065,1846,12,4,5,45,,6,," "2177819,-41.9,173.60001,2559794,5923028,1848,10,15,14,10,,7.4,12,"
This data frame is called originalQuakes. Thank you for reading this, and please help! -- View this message in context: http://r.789695.n4.nabble.com/remove-at-the-end-of-each-line-for-all-lines-except-the-first-line-in-a-data-frame-tp3557391p3557391.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Nested design
Hi: Essentially, you are asking for free statistical advice, which is not within the intended scope of R-help. It's always better to consult with someone locally, and as luck would have it, your university apparently provides free statistical consulting for faculty and grad students: http://www.uu.nl/faculty/socialsciences/EN/organisation/Departments/methodologystatistics/consultation/Pages/default.aspx I would suggest that you contact someone there and have a face-to-face discussion rather than a possibly extended back-and-forth on the Net. Dennis On Sat, May 28, 2011 at 6:09 AM, unpeatable bjorn.robr...@gmail.com wrote: Dear R-users, I have the following problem. I have performed an experiment for which I gathered a lot of data which I now want to test. The problem is that I cannot find an appropriate test in R (I am a starter) and someone might give me a hand. This is what I have done: Across three sites (Site), I have laid out five transects (Trans)...meaning five transects in each sites. In each transect I have five Microhabitats (MH) which should be regarded as subplots (I think). In each transect, every MH has the same position (so they are not randomized). I now want to test the effect of Site and MH (nested in Trans) on my response variables. This is what I do now: model <- aov(Response ~ Site*MH + Error(Trans/MH)) I get the following output: Error: Trans Df Sum Sq Mean Sq Site 1 0.030294 0.030294 Error: Trans:MH Df Sum Sq Mean Sq Site 1 10.8367 10.8367 MH 3 0.2836 0.0945 Error: Within Df Sum Sq Mean Sq F value Pr(>F) Site 2 0.92504 0.46252 11.7304 5.880e-05 *** MH 4 1.86688 0.46672 11.8370 5.645e-07 *** Site:MH 8 1.17041 0.14630 3.7105 0.001615 ** Residuals 54 2.12917 0.03943 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 How do I read this? Any help appreciated! BTW: I also tried the lme function: model <- lme(Response ~ Site*MH, random = ~1|Trans/MH) but then the output is really complicated. - Dr. 
Bjorn JM Robroek Ecology and Biodiversity Group Institute of Environmental Biology, Utrecht University Padualaan 8, 3584 CH Utrecht, The Netherlands Email address: b.j.m.robr...@uu.nl http://www.researcherid.com/rid/C-4379-2008 Tel: +31-(0)30-253 6091 -- View this message in context: http://r.789695.n4.nabble.com/Nested-design-tp3557404p3557404.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] network package in R
Can you be more specific about what a network builder is? and what do you want exactly? Your question seems a bit vague. Best Ronggui On 28 May 2011 05:32, Weiwei Shi helprh...@gmail.com wrote: Hi there, I need a network builder and it can change the node size and color; I am not sure if network package in R can do this or not. The other functions I wanted have been found in that package. BTW, if there is another package in R relating to this, please suggest too. Thanks, Weiwei -- Weiwei Shi, Ph.D Research Scientist Did you always know? No, I did not. But I believed... ---Matrix III -- Wincent Ronggui HUANG Sociology Department of Fudan University PhD of City University of Hong Kong http://asrr.r-forge.r-project.org/rghuang.html
Re: [R] remove , at the end of each line for all lines except the first line in a data frame
Homework question? This one's pretty easy; see ?regex ?gsub A couple of examples on gsub()'s help page are rather close to what you need. The basic structure is gsub("string to replace", "replacement string", name_of_string), where the first two arguments are strings enclosed in quotes and the third is the character vector to work on. If you need to apply this function to each row of your data frame, see ?apply and put the gsub() code into an anonymous function. Best of luck! Dennis On Sat, May 28, 2011 at 5:54 AM, xiaerwhite xiaerwh...@hotmail.com wrote: the question ask me to use gsub() and subsetting to remove the comma at the end of the for all but the first value in a data frame. the first few lines are like the following: [1] 2177663,-41,175,2678248,6021224,1840,5,25,17,,,6,, 2177691,-39.6,176.2,2784798,6173592,1843,7,8,5,30,,7.6,12, [3] 2177754,-47,166,1977803,5333806,1846,7,13,6,20,,63,, 2177759,-41,172,2425856,6022664,1846,11,18,19,,,65,, [5] 2177762,-41,174.5,2636191,6022065,1846,12,4,5,45,,6,, 2177819,-41.9,173.60001,2559794,5923028,1848,10,15,14,10,,7.4,12, this data frame is called the originalQuakes thank you for reading this, and please help! -- View this message in context: http://r.789695.n4.nabble.com/remove-at-the-end-of-each-line-for-all-lines-except-the-first-line-in-a-data-frame-tp3557391p3557391.html Sent from the R help mailing list archive at Nabble.com.
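A minimal sketch of the gsub() plus subsetting idea described above. It assumes originalQuakes is really a character vector with one record per element (the printed [1], [3], [5] indices suggest this); the three records are copied from the post, and the name cleaned is only illustrative. The pattern ",$" matches a single comma at the end of a string, and negative indexing with [-1] leaves the first element untouched.

```r
# Assumption: originalQuakes is a character vector, one record per element.
originalQuakes <- c(
  "2177663,-41,175,2678248,6021224,1840,5,25,17,,,6,,",
  "2177691,-39.6,176.2,2784798,6173592,1843,7,8,5,30,,7.6,12,",
  "2177754,-47,166,1977803,5333806,1846,7,13,6,20,,63,,"
)

cleaned <- originalQuakes
# ",$" matches exactly one comma anchored at the end of the string;
# [-1] selects every element except the first.
cleaned[-1] <- gsub(",$", "", cleaned[-1])

print(cleaned)
```

If the records sit in a data frame column instead, the same gsub() call can be applied to that column, for example to df$V1[-1] for a hypothetical column V1.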
Re: [R] network package in R
Hi, On Fri, May 27, 2011 at 5:32 PM, Weiwei Shi helprh...@gmail.com wrote: Hi there, I need a network builder and it can change the node size and color; I am not sure if network package in R can do this or not. The other functions I wanted have been found in that package. BTW, if there is another package in R relating to this, please suggest too. I'm not actually sure what you're looking for, but from trying to piece together the other emails in this thread maybe you are looking for a way to control graph/network layout in some GUI form? There is a bioconductor package called RCytoscape that can drive cytoscape from R, allowing you to draw a network you have loaded up in your R session and tweak different properties of nodes, edges, etc: http://www.bioconductor.org/packages/release/bioc/html/RCytoscape.html Maybe it can help you? I'm not sure if this is what you're asking for, though, so I could be way off the mark. HTH, -steve -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact
Re: [R] Compiling Rgraphiz on Windows 7 64bit with R-2.13.0
Thanks, I'm going to update it! -- View this message in context: http://r.789695.n4.nabble.com/Compiling-Rgraphiz-on-Windows-7-64bit-with-R-2-13-0-tp3493750p3557431.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Nested design
Dear Dennis, In my opinion I am not at all asking for any stats help, just a question how to read this output. Thanks, Bjorn - Dr. Bjorn JM Robroek Ecology and Biodiversity Group Institute of Environmental Biology, Utrecht University Padualaan 8, 3584 CH Utrecht, The Netherlands Email address: b.j.m.robr...@uu.nl http://www.researcherid.com/rid/C-4379-2008 Tel: +31-(0)30-253 6091 -- View this message in context: http://r.789695.n4.nabble.com/Nested-design-tp3557404p3557472.html Sent from the R help mailing list archive at Nabble.com.
[R] Enquiry on Vrtest
Hi there, I am currently working on my dissertation, which tests the martingale hypothesis in the stock market using a range of variance ratio tests and multiple variance ratio tests. I contacted the author of a reference paper and was told that the tests can be conducted in R. Although I have gone through the theoretical background of the methodology, I find it quite difficult to run the tests in practice on my own data, as I have little experience in R. I would really appreciate some hints on how to use the Reference Manual written by the author of Vrtest: is it correct for me to read in my own data and then type in the code from the reference manual to get results? I have installed the Vrtest package. When I type in the commands from the reference manual, they always return "Error: could not find function". I would like to know how to solve this. Many thanks. -- Best regards River Jim
Re: [R] Nested design
Hi, If you are not asking for stats help, then do you understand the model and are just confused by how R labels it? We can help match R's labels to the ones you are used to, if you tell us what you are used to. Cheers, Josh On Sat, May 28, 2011 at 6:54 AM, unpeatable bjorn.robr...@gmail.com wrote: Dear Dennis, In my opinion I am not at all asking for any stats help, just a question how to read this output. Thanks, Bjorn - Dr. Bjorn JM Robroek Ecology and Biodiversity Group Institute of Environmental Biology, Utrecht University Padualaan 8, 3584 CH Utrecht, The Netherlands Email address: b.j.m.robr...@uu.nl http://www.researcherid.com/rid/C-4379-2008 Tel: +31-(0)30-253 6091 -- View this message in context: http://r.789695.n4.nabble.com/Nested-design-tp3557404p3557472.html Sent from the R help mailing list archive at Nabble.com. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/
Re: [R] network package in R
Thank you, Steve. Initially I tried to choose b/w cytoscape and pajek, among other bunch of software. I did not realize the existence of RCytoscape; otherwise, I would probably use that one. Weiwei On Sat, May 28, 2011 at 9:25 AM, Steve Lianoglou mailinglist.honey...@gmail.com wrote: Hi, On Fri, May 27, 2011 at 5:32 PM, Weiwei Shi helprh...@gmail.com wrote: Hi there, I need a network builder and it can change the node size and color; I am not sure if network package in R can do this or not. The other functions I wanted have been found in that package. BTW, if there is another package in R relating to this, please suggest too. I'm not actually sure what you're looking for, but from trying to piece together the other emails in this thread maybe you are looking for a way to control graph/network layout in some GUI form? There is a bioconductor package called RCytoscape that can drive cytoscape from R, allowing you to draw a network you have loaded up in your R session and tweak different properties of nodes, edges, etc: http://www.bioconductor.org/packages/release/bioc/html/RCytoscape.html Maybe it can help you? I'm not sure if this is what you're asking for, though, so I could be way off the mark. HTH, -steve -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact -- Weiwei Shi, Ph.D Research Scientist Did you always know? No, I did not. But I believed... ---Matrix III [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Questions regrading the lasso and glmnet
Hi all. Sorry for the long email. I have been trying to find someone local to work on this with me, without much luck. I went in to our local stats consulting service here, and the guy there told me that I already know more about model selection than he does. :- He pointed me towards another professor that can perhaps help, but that prof is busy until mid-June, so I want to get as much figured out as I can before I eventually meet with him. And that prof may turn out not to be much help anyway, in which case all I've got is you folks. :- So I've got a big dataset (300,000 rows) in which the dependent variable is binary, and there are five independent variables, each continuous, and each uniformly distributed between 0 and 1. I'm trying to use binary logistic regression to get an explanatory model with decent predictive value (but being simple, for explanation, is more important than being optimally predictive). Considering the squares of those variables as well (but no higher powers, this list talked me out of that recently because of all the problems with collinearity and polynomial fits), and many interactions between both the variables and the squares of the variables, I have a full model formula of 116 terms; but most of those terms are of small effect (although often still significant in a glm fit, since 300,000 rows is a lot of data). So I have a model selection problem: I want to find a simple explanatory model containing just the terms of large effect that are most important in getting an understanding of what strongly influences the dependent variable. I was previously trying to do model selection using a step-down approach with a large per-term penalty (much larger than the standard BIC penalty) to force more terms to drop out. 
I looked at a wide range of penalty values, got a corresponding set of models with different numbers of terms, looked at the correct prediction rate for each of those models, and basically chose the simplest model that still gave me a pretty good prediction rate (for some subjective definition of pretty good). Typically the model I chose would have about 20 terms, out of the original 116. (A standard BIC step-down would retain more like 100 terms, with only a very slightly better prediction rate.) So that was working OK, but in discussions with the folks on this list (thanks everybody for your help!), I have been exploring using the lasso for this instead, to avoid the problems with step-down, gain the benefits of shrinkage, and so forth; clearly it should be much better than my homegrown model selection procedure. I've been reading about the lasso in Tibshirani (1996) and in The Elements of Statistical Learning. I'm using glmnet() at present; I have seen that lars() also exists, but I don't understand its documentation as well, so I'm starting with glmnet. So there's one question: 1. Is my choice of glmnet() ok? On what basis should I choose glmnet() vs. lars()? The lasso wants variables to be centered and scaled, I gather. glmnet() can do this for me, but I want to understand exactly what the variables are that the fit is done on, so that I can interpret the coefficients properly, so I want to do this myself. (I'm also concerned that glmnet() might not do it correctly, since it doesn't know that some of the terms in the formula are squares and interactions.) So I'm passing standardize=FALSE to glmnet(), and I'm doing my own scaling before, like (where df is my dataframe):
  df$Cx <- scale(df$x)     # an independent variable, centered and scaled
  df$Cx_sq <- df$Cx ^ 2    # the square of that variable
I do this for each of the five independent variables. 
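For readers following along, here is a small base-R sketch of the scaling step just described, run on simulated uniform data rather than the poster's dataset. The column names x, y, Cx, Cy, Cx_sq and CxCy are illustrative assumptions. It only shows what the derived columns look like numerically: the scaled variables have mean 0 and standard deviation 1, while their squares and products do not. Whether those derived columns should be rescaled is the open question in the post, and this sketch does not answer it.

```r
set.seed(123)

# Simulated stand-in for the real data: two predictors, uniform on [0, 1].
df <- data.frame(x = runif(1000), y = runif(1000))

# Center and scale each predictor; as.numeric() drops the matrix
# attributes that scale() attaches to its result.
df$Cx <- as.numeric(scale(df$x))
df$Cy <- as.numeric(scale(df$y))

# Derived terms built from the scaled predictors, not rescaled afterwards.
df$Cx_sq <- df$Cx^2
df$CxCy  <- df$Cx * df$Cy

# Mean and standard deviation of each column that would enter the fit.
stats_by_column <- sapply(
  df[c("Cx", "Cy", "Cx_sq", "CxCy")],
  function(v) c(mean = mean(v), sd = sd(v))
)
print(round(stats_by_column, 3))
```

The square of a scaled variable has mean (n - 1)/n by construction, so it is never centered.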
So the variables themselves are centered and scaled, while the squared versions of the variables are exactly the square of the scaled variable; I do not scale them again. So question 2: 2. Is the way I'm scaling the variables before calling glmnet() correct? Or should the squares themselves be centered and scaled? Having scaled the variables in this way, I then construct a model matrix and call glmnet() (where f is the 116-term formula and df is my dataframe):
  mf <- model.frame(f, df)
  mm <- model.matrix(formula(f), mf)[,-1]
  lasso <- glmnet(mm, y=df$outcome, family="binomial", standardize=FALSE)
I do it this way because glmnet() doesn't support being passed a formula and a dataframe. I think this is doing the right thing. The model.matrix() call constructs new columns for all of the interactions in the formula, which of course act as separate independent variables in the regression. One worry I have is that those, like the squares discussed above, are not themselves centered and scaled. If there's an interaction between, say, Cx and Cy, then the model matrix column for Cx:Cy is of course just the product of the Cx column and the Cy column, and so it is not centered/scaled. I don't know if this is correct or not. So
Re: [R] Enquiry on Vrtest
On May 28, 2011, at 10:58 AM, Chim Kaho wrote: Hi there, I am currently working on my dissertation which is about testing the martingale hypothesis in the stock market using a methodology involving a range of variance ratio tests and multiple variance ratio tests. I contacted the author of a reference paper and I was told that the tests can be conducted using R programming language. Although I have gone through the theoretical background of the methodology, but I found it quite difficult to implement the tests practically using my own data. As I have little experience in R. I would be really appreciated if you could do me a favour by giving me some hints on to use the Reference Manual written by the author of Vrtest, is it correct for me to read in my own data, and then type in the codes from the reference manual to get results. Obviously, I have installed the Vrtest package. I tried to type in the command from the reference manual and they always return Error: could not find function, I would like to know how to solve this. Many thanks. A fairly common beginner mistake is to install a package but then not understand that it also needs to be loaded in each new session before its functions can be found. ?require ?library -- David Winsemius, MD West Hartford, CT
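To make the install-versus-load distinction concrete, here is a minimal sketch. The helper load_or_explain is hypothetical (it is not part of R or of any package), and the lower-case package name "vrtest" is an assumption about how the package is registered on CRAN. install.packages() is needed once per machine; library() is needed once per R session, and "could not find function" is the typical symptom of skipping it.

```r
# Hypothetical helper: attach a package if it is installed,
# otherwise explain what is missing instead of failing.
load_or_explain <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    message("Package '", pkg, "' is not installed; try install.packages(\"",
            pkg, "\") first.")
    return(invisible(FALSE))
  }
  # character.only = TRUE lets library() take the name from a variable.
  library(pkg, character.only = TRUE)
  invisible(TRUE)
}

# Assumed package name; adjust if it is spelled differently on your system.
load_or_explain("vrtest")
```

In an interactive session the usual form is simply library(vrtest) at the top of the script, after which the functions from the reference manual should be found.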
Re: [R] network package in R
Actually I have another question relating to this, I have about 300 nodes (I can reduced them somehow if the network is too messy) and the similarity matrix between any two nodes. The size and color of nodes have some meanings and I tried to use the layout to show their distance. Here is the question: for example, there are nodes A, B and C. There is a threshold to define if there is a link or not between nodes. Let's assume there are links between A and B; and A and C. There is no link between B and C. However, when there are more nodes added, the physical distance on the plot between B and C is shorter than some other linked pair, for example, C and D. I was suggested it was due to the layout algorithm: because A-B and A-C links, so B and C were pulled nearer. I am not sure if there is a better solution or not. I mean, physically B and C look nearer than C-D, although the latter has a link while B and C do not. I hope I explained my question clear this time. Weiwei On Sat, May 28, 2011 at 11:51 AM, Weiwei Shi helprh...@gmail.com wrote: Thank you, Steve. Initially I tried to choose b/w cytoscape and pajek, among other bunch of software. I did not realize the existence of RCytoscape; otherwise, I would probably use that one. Weiwei On Sat, May 28, 2011 at 9:25 AM, Steve Lianoglou mailinglist.honey...@gmail.com wrote: Hi, On Fri, May 27, 2011 at 5:32 PM, Weiwei Shi helprh...@gmail.com wrote: Hi there, I need a network builder and it can change the node size and color; I am not sure if network package in R can do this or not. The other functions I wanted have been found in that package. BTW, if there is another package in R relating to this, please suggest too. I'm not actually sure what you're looking for, but from trying to piece together the other emails in this thread maybe you are looking for a way to control graph/network layout in some GUI form? 
There is a bioconductor package called RCytoscape that can drive cytoscape from R, allowing you to draw a network you have loaded up in your R session and tweak different properties of nodes, edges, etc: http://www.bioconductor.org/packages/release/bioc/html/RCytoscape.html Maybe it can help you? I'm not sure if this is what you're asking for, though, so I could be way off the mark. HTH, -steve -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact -- Weiwei Shi, Ph.D Research Scientist Did you always know? No, I did not. But I believed... ---Matrix III -- Weiwei Shi, Ph.D Research Scientist Did you always know? No, I did not. But I believed... ---Matrix III
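The thresholding step in this question (a link exists only when the similarity between two nodes reaches a cutoff) can be sketched in base R as follows. The four-node similarity matrix and the cutoff of 0.5 are made-up values chosen to reproduce the situation described: A is linked to B and to C, C is linked to D, and B and C are not linked. This only builds the edge list; it says nothing about which layout algorithm to use.

```r
nodes <- c("A", "B", "C", "D")

# Made-up symmetric similarity matrix, for illustration only.
sim <- matrix(
  c(1.0, 0.8, 0.7, 0.2,
    0.8, 1.0, 0.4, 0.1,
    0.7, 0.4, 1.0, 0.6,
    0.2, 0.1, 0.6, 1.0),
  nrow = 4, byrow = TRUE, dimnames = list(nodes, nodes)
)

threshold <- 0.5            # assumed cutoff
adj <- sim >= threshold     # TRUE where a link exists
diag(adj) <- FALSE          # no self-links

# Keep each undirected edge once by looking at the upper triangle only.
idx <- which(adj & upper.tri(adj), arr.ind = TRUE)
edges <- data.frame(
  from = rownames(sim)[idx[, "row"]],
  to   = colnames(sim)[idx[, "col"]],
  similarity = sim[idx]
)
print(edges)
```

An edge list of this shape is what most graph packages accept as input, so the same data can be handed to whichever plotting tool is chosen.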
[R] how to train ksvm with spectral kernel (kernlab) in caret?
Hello all, I would like to use the train function from the caret package to train a svm with a spectral kernel from the kernlab package. Sadly a svm with spectral kernel is not among the many methods in caret... Using caret to train svmRadial:
  library(caret)
  library(kernlab)
  data(iris)
  TrainData <- iris[,1:4]
  TrainClasses <- iris[,5]
  set.seed(2)
  fitControl$summaryFunction <- Rand
  svmNew <- train(TrainData, TrainClasses, method = "svmRadial", preProcess = c("center", "scale"), metric = "cRand", tuneLength = 4)
  svmNew
Here is an example of how to train the ksvm with a spectral kernel:
  # Load the data
  data(reuters)
  y <- rlabels
  x <- reuters
  sk <- stringdot(type="spectrum", length=4, normalized=TRUE)
  svp <- ksvm(x,y,kernel=sk,scale=c(),cross=5)
  svp
Does anyone know how I can train the svm from above using the caret package? best regards
[R] Markov Chain model coding
Can anyone help with coding for Markov chain models? I have a data set from coral reefs in consecutive years that have been given a state depending on their coral and macroalgal cover. The Markov model will be used to predict the cover of coral and macroalgae in the future, dependent on the conditions during the observation. Any advice would be much appreciated. Phil -- View this message in context: http://r.789695.n4.nabble.com/Markov-Chain-model-coding-tp3557733p3557733.html Sent from the R help mailing list archive at Nabble.com.
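One possible starting point, sketched in base R under several assumptions: each reef is assigned exactly one state per survey year, the chain is first-order and does not change over time, and the three state names and the small observation matrix below are invented for illustration. The helper project is hypothetical. The sketch counts year-to-year transitions, converts the counts to row proportions to estimate a transition matrix, and then projects a starting distribution forward.

```r
states <- c("coral", "mixed", "algae")   # invented state labels

# Invented data: one row per reef, one column per consecutive survey year.
obs <- matrix(
  c("coral", "coral", "mixed",
    "coral", "mixed", "mixed",
    "mixed", "algae", "algae",
    "mixed", "mixed", "coral",
    "algae", "algae", "algae"),
  ncol = 3, byrow = TRUE
)

# Pair each year's state with the next year's state for the same reef.
from <- factor(as.vector(obs[, -ncol(obs)]), levels = states)
to   <- factor(as.vector(obs[, -1]),         levels = states)

counts <- table(from, to)
# Row proportions estimate P[i, j] = Pr(next state j | current state i).
# A state that never occurs in 'from' would give a row of NaN.
P <- unclass(prop.table(counts, margin = 1))

# Hypothetical helper: push a distribution over states forward by 'years' steps.
project <- function(x0, P, years) {
  x <- x0
  for (i in seq_len(years)) x <- as.vector(x %*% P)
  names(x) <- colnames(P)
  x
}

x0 <- c(coral = 0.5, mixed = 0.3, algae = 0.2)   # assumed current mix of reefs
print(round(P, 2))
print(round(project(x0, P, years = 5), 3))
```

With real data the transition matrix would be estimated from many more reef-years, and dependence on environmental conditions would need a richer model than this.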
Re: [R] Nested design
Date: Sat, 28 May 2011 09:33:03 -0700 From: jwiley.ps...@gmail.com To: bjorn.robr...@gmail.com CC: r-help@r-project.org Subject: Re: [R] Nested design Hi, If you are not asking for stats help, then do you understand the model and are just confused by how R labels it? We can help match R's labels to the ones you are used to, if you tell us what you are used to. I would not suggest as a rule to use a tool to validate itself but you can use R to make sure your interpretation of other R output is right by giving contrived datasets to the analysis package and see what you get back. Comparison can be to examples from text book or your own paper and pencil analysis. This is also a good way to learn things from basic terms to things like sign or unit conventions in different fields etc. You can generate samples from normal distro and feed that to the questionable package to see what comes back. Cheers, Josh On Sat, May 28, 2011 at 6:54 AM, unpeatable wrote: Dear Dennis, In my opinion I am not at all asking for any stats help, just a question how to read this output. Thanks, Bjorn - Dr. Bjorn JM Robroek Ecology and Biodiversity Group Institute of Environmental Biology, Utrecht University Padualaan 8, 3584 CH Utrecht, The Netherlands Email address: b.j.m.robr...@uu.nl http://www.researcherid.com/rid/C-4379-2008 Tel: +31-(0)30-253 6091 -- View this message in context: http://r.789695.n4.nabble.com/Nested-design-tp3557404p3557472.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. 
Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/
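The suggestion above, feeding contrived data to the analysis and checking what comes back, might look like the following sketch. It builds a balanced layout with 3 sites, 5 transects per site and 5 microhabitats per transect, fills it with pure noise, and fits a split-plot style model with transect as the error stratum. The variable names follow the thread; the choice of Error(Trans) and the unique transect labels are assumptions of this sketch, not a statement about what the original analysis should have been. Because the layout is known, the degrees of freedom in each stratum can be worked out by hand and compared with the output.

```r
set.seed(42)

# Balanced contrived design: 3 sites x 5 transects x 5 microhabitats = 75 rows.
design <- expand.grid(
  MH    = paste0("MH", 1:5),
  Trans = paste0("T", 1:5),
  Site  = paste0("S", 1:3)
)
# Give every transect a label that is unique across sites (15 levels).
design$Trans <- interaction(design$Site, design$Trans, drop = TRUE)
# Pure noise: no real Site or MH effect is present.
design$Response <- rnorm(nrow(design))

fit <- aov(Response ~ Site * MH + Error(Trans), data = design)
print(summary(fit))

# Expected by hand: the Trans stratum has 14 df (Site 2, residual 12);
# the Within stratum has 60 df (MH 4, Site:MH 8, residual 48).
```

All grouping variables here are factors. If a grouping variable is stored as a number instead, aov() treats it as a single regression term with 1 df, which is one thing worth checking when the degrees of freedom look too small.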
[R] Observation in a confidence ellipse
Hello everyone, I really need some help here. I made a confidence ellipse using the function ellipse from the package ellipse: ellipse(SD, centre=colMeans(pcsref), t=sqrt((p * (n-1)/(n-p))*qf(0.99, p, n-p))) Now I want to write a function that returns TRUE or FALSE depending on whether a given observation is inside the confidence ellipse, but I have no clue how to do it. Can anyone help me? Best regards Jessica
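One way to approach this, sketched under the assumption that SD is the covariance matrix passed to ellipse() and that the boundary drawn is the set of points whose squared Mahalanobis distance from the centre equals t^2: a point is then inside when its squared distance is at most t^2. The function name in_ellipse and the simulated pcsref are illustrative, not from the post; mahalanobis() is in the stats package and returns the squared distance.

```r
# Hypothetical helper: TRUE when each observation lies inside (or on) the ellipse.
# x may be a single vector or a matrix with one observation per row.
in_ellipse <- function(x, centre, SD, t) {
  d2 <- mahalanobis(x, center = centre, cov = SD)  # squared distance
  d2 <= t^2
}

# Simulated reference data standing in for pcsref.
set.seed(1)
pcsref <- matrix(rnorm(200), ncol = 2)
n <- nrow(pcsref)
p <- ncol(pcsref)
SD <- cov(pcsref)
centre <- colMeans(pcsref)

# Same radius as in the question (99% level).
t_radius <- sqrt((p * (n - 1) / (n - p)) * qf(0.99, p, n - p))

print(in_ellipse(centre, centre, SD, t_radius))      # the centre itself
print(in_ellipse(c(50, 50), centre, SD, t_radius))   # a far-away point
print(table(in_ellipse(pcsref, centre, SD, t_radius)))
```

The check works for any number of dimensions p, whereas a plotted ellipse shows only two of them at a time.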
Re: [R] ggplot pale colors
Hi Victor, The problem is that you have not grasped the difference between setting an aesthetic to a fixed value and mapping it to a variable. See below for details. On Sat, May 28, 2011 at 6:16 AM, Victor Gabillon victor.gabil...@inria.fr wrote: Hello i am new to ggplot and i observed a strange behavior. I want to display two groups of points, each group with a different color. But i encountered a problem with the colors. Here is a first example:
  dataset <- data.frame(Main = c("A", "A", "B", "B"), Detail = c("b", "c", "1", "2"), resp = runif(4, min = 0.5, max = 1))
  ggplot(dataset, aes(x = Detail, y = resp)) + facet_grid(.~Main, scales = "free_x") + geom_point(aes(size=6, shape = c(16,16,15,15)), colour="blue") + geom_hline(aes(yintercept=0.25), colour='blue', size=2)
with this code all the point are blue (like the line below) But if i try the following code, where my goal is to have the point on the left blue and the one on the right red, a problem appears:
  dataset <- data.frame(Main = c("A", "A", "B", "B"), Detail = c("b", "c", "1", "2"), resp = runif(4, min = 0.5, max = 1))
  ggplot(dataset, aes(x = Detail, y = resp)) + facet_grid(.~Main, scales = "free_x") + geom_point(aes(size=6, shape = c(16,16,15,15), colour=c("blue","blue","red","red"))) + geom_hline(aes(yintercept=0.25), colour='blue', size=2)
The colors are not coming from c("blue", "blue", "red", "red"). Try this to see:
  p <- ggplot(dataset, aes(x = Detail, y = resp)) + facet_grid(.~Main, scales = "free_x") + geom_point(aes(shape = c(16,16,15,15), colour=c("purple","purple","foo","foo")), size = 6) + geom_hline(aes(yintercept=0.25), colour='blue', size=2)
  p
So now you see that you are not setting the colors to be red and blue, you are mapping the color to a variable that just happens to have levels red and blue. 
The actual colors that are mapped to that variable are determined by scale_colour_discrete, as you can see from
  p + scale_color_manual(values=c("red", "blue"))
  p + scale_color_manual(values=c("green", "black"))
As a general rule, if you want an aesthetic (e.g., color, size, shape etc.) to vary (i.e., to have more than one value) you should put it as an argument to aes(). If you just want to set it to a fixed value you should set it as an argument to geom_* or as an argument to ggplot() itself. The points have different colors but those colors are pale (dull). You can see it by comparing the blue of the points to the blue of the line. I am guessing I am doing it wrong, but I am stuck. Do you have suggestions? An additional question is that i want to add text along the blue line (which is a reference) but i did not understand what geom_text was expecting. see http://had.co.nz/ggplot2/geom_text.html Best, Ista Thanks for your help! Victor -- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org
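Putting the rule above together, here is a minimal sketch that maps colour and shape to the grouping variable and then chooses the actual colours and shapes through manual scales, while the size of the points and the colour of the reference line are set as fixed values outside aes(). It assumes a reasonably recent ggplot2 and reuses the example data from the thread; the named values vectors are one way to tie each level to a colour.

```r
library(ggplot2)

set.seed(1)
dataset <- data.frame(
  Main   = c("A", "A", "B", "B"),
  Detail = c("b", "c", "1", "2"),
  resp   = runif(4, min = 0.5, max = 1)
)

p <- ggplot(dataset, aes(x = Detail, y = resp)) +
  facet_grid(. ~ Main, scales = "free_x") +
  # Mapped inside aes(): varies with the data. Set outside: a fixed value.
  geom_point(aes(colour = Main, shape = Main), size = 6) +
  # The scales decide which colour and shape each level of Main gets.
  scale_colour_manual(values = c(A = "blue", B = "red")) +
  scale_shape_manual(values = c(A = 16, B = 15)) +
  # Fixed values for the reference line, so nothing goes through a scale.
  geom_hline(yintercept = 0.25, colour = "blue")

print(p)
```

Because the point colours now come from scale_colour_manual(), the blue of the points should match the blue of the line instead of the default palette colours.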
Re: [R] network package in R
Hi again, On Sat, May 28, 2011 at 1:03 PM, Weiwei Shi helprh...@gmail.com wrote: Actually I have another question relating to this, I have about 300 nodes (I can reduced them somehow if the network is too messy) and the similarity matrix between any two nodes. The size and color of nodes have some meanings and I tried to use the layout to show their distance. Here is the question: for example, there are nodes A, B and C. There is a threshold to define if there is a link or not between nodes. Let's assume there are links between A and B; and A and C. There is no link between B and C. However, when there are more nodes added, the physical distance on the plot between B and C is shorter than some other linked pair, for example, C and D. I was suggested it was due to the layout algorithm: because A-B and A-C links, so B and C were pulled nearer. I am not sure if there is a better solution or not. I mean, physically B and C look nearer than C-D, although the latter has a link while B and C do not. I hope I explained my question clear this time. I actually don't have a direct answer to your question -- and this isn't even related to R, but the following has hit my radar recently and thought it might be helpful: http://www.hiveplot.org/ It's a different way to visualize (large) networks that uses some of the networks structure in order to make it more visually interpretable (once you understand how to interpret them!) than normal graph visualizations allow -- to put it another way, it avoids the hairball effect. It sounds like you're struggling to plot things in one way, but the layout algorithms all want to make you see it another ... so maybe this option will be helpful. Also, you might get better graph layout suggestions on mailing lists that are focused on working with graphs. I'm pretty sure cytoscape has a mailing list you can ping, and there is also igraph (which has an R interface) which has its own mailing list ... 
maybe the pros there can help provide more insight. -steve -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact
Re: [R] help with barplot
You can do pretty well without ggplot actually.

boxplot(Time ~ paste(Incidents, Months), data = DF, border = c('grey20', 'red'))

On Sat, May 28, 2011 at 2:55 AM, steven mosher mosherste...@gmail.com wrote:
> Thanks, ggplot is on my list of things to learn before Hadley comes here to the bay area to give a session on interactive graphics in R
>
> On Fri, May 27, 2011 at 10:29 PM, Joshua Wiley jwiley.ps...@gmail.com wrote:
>> Hi Steven,
>>
>> This is not, strictly speaking, the answer to your question (hopefully Tom already answered that). Rather, it is the answer to questions you *might* have asked (and perhaps one of them will be one you wished you had asked).
>>
>> Barplots have a low data:ink ratio...you are using an entire plot to convey 8 means. A variety of alternatives exist. As a minimal first step, you could just use points to show the means and skip all the wasted bar space, and you might add error bars in (A). You could also use boxplots to give your viewers (or just yourself) a sense of the distribution along with the medians (B). Another elegant option is violin plots. These are kind of like (exactly like?) mirrored density plots. A measure of central tendency is not explicitly shown, but the *entire* distribution and range is shown (C).
>>
>> Cheers, Josh
>>
>> (P.S. I hit send too soon before and sent you an offlist message with PDF examples)
>>
>> ## Create your data
>> DF <- data.frame(
>>   Incidents = factor(rep(c("a", "b", "d", "e"), each = 25)),
>>   Months = factor(rep(1:2, each = 10)),
>>   Time = rnorm(100))
>> ## Load required packages
>> require(ggplot2)
>> require(Hmisc)
>> ## Option A
>> ggplot(DF, aes(x = Incidents, y = Time, colour = Months)) +
>>   stat_summary(fun.y = mean, geom = "point", position = position_dodge(width = .90), size = 3) +
>>   stat_summary(fun.data = "mean_cl_normal", geom = "errorbar", position = "dodge")
>> ## Option B
>> ggplot(DF, aes(x = Incidents, y = Time, fill = Months)) +
>>   geom_boxplot(position = position_dodge(width = .8))
>> ## Option C
>> ggplot(DF, aes(x = Time, fill = Months)) +
>>   geom_ribbon(aes(ymax = ..density.., ymin = -..density..), alpha = .2, stat = "density") +
>>   facet_grid( ~ Incidents) + coord_flip()
>> ## Option C altered
>> ggplot(DF, aes(x = Time, fill = Months)) +
>>   geom_ribbon(aes(ymax = ..density.., ymin = -..density..), alpha = .2, stat = "density") +
>>   facet_grid( ~ Incidents + Months) +
>>   scale_y_continuous(name = "density", breaks = NA, labels = NA) + coord_flip()
>>
>> On Fri, May 27, 2011 at 3:08 PM, steven mosher mosherste...@gmail.com wrote:
>>> Hi, I'm really struggling with barplot. I have a data.frame with 3 columns. The first column represents an incident type. The second column represents a month. The third column represents a time.
>>>
>>> Code for a sample data.frame:
>>>
>>> incidents <- rep(c('a','b','d','e'), each = 25)
>>> months <- rep(c(1,2), each = 10)
>>> times <- rnorm(100)
>>> # make my sample data
>>> DF <- data.frame(Incidents = as.factor(incidents), Months = as.factor(months), Time = times)
>>> # now calculate a mean for the by groups of incident type and month
>>> pivot <- aggregate(DF$Time, by = list(Incidents = DF$Incidents, Months = DF$Months), FUN = mean, simplify = TRUE)
>>>
>>> What I want to create is a bar plot where I have groupings by incident type (a, b, d, e) and within each group I have the months in order. So group 1 would be Type a; month 1,2; group 2 would be Type b; month 1,2; group 3 would be Type d; month 1,2; group 4 would be Type e; month 1,2.
>>>
>>> I know barplot is probably the right function but I'm a bit lost on how to specify groupings etc.
>>>
>>> TIA
>>
>> -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/
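
For the grouped bar plot described in the original question, one possible base-graphics sketch follows. It is not necessarily what was suggested off-list; the sample data mirror the poster's, and the object name means is an illustrative choice.

```r
set.seed(42)
DF <- data.frame(Incidents = factor(rep(c("a", "b", "d", "e"), each = 25)),
                 Months = factor(rep(1:2, each = 10, length.out = 100)),
                 Time = rnorm(100))

# Matrix of means: one row per month, one column per incident type.
means <- with(DF, tapply(Time, list(Months, Incidents), mean))

# beside = TRUE draws the months side by side within each incident group.
barplot(means, beside = TRUE, legend.text = rownames(means),
        xlab = "Incident type", ylab = "Mean time",
        args.legend = list(title = "Month"))
```

barplot() treats each column of the matrix as one group, so arranging the means as months by incidents gives the a, b, d, e groups with months 1 and 2 in order inside each.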
[R] Changing the name of the R process in top
Hi all, Perhaps this is more of a unix question, but I'll give it a try here. I am running 9 different R processes at the same time (called from a shell script using R CMD BATCH). When I use the top program to monitor how they are doing, it is impossible to tell which R process is related to which R script. Is there a way to rename a specific instantiation of an R process in top with another, more informative name, e.g., something like R-script1 R-script2 etc? Thank you, Matt -- Matthew C Keller Asst. Professor of Psychology University of Colorado at Boulder www.matthewckeller.com
Re: [R] Changing the name of the R process in top
Hi,

On Sat, May 28, 2011 at 2:48 PM, Matthew Keller mckellerc...@gmail.com wrote:
> Hi all, Perhaps this is more of a unix question, but I'll give it a try here. I am running 9 different R processes at the same time (called from a shell script using R CMD BATCH). When I use the top program to monitor how they are doing, it is impossible to tell which R process is related to which R script. Is there a way to rename a specific instantiation of an R process in top with another, more informative name, e.g., something like R-script1 R-script2 etc?

How about flipping it around and asking your scripts to let you know what process they are (so you can ID them in `top` by their process id, and not by process name/command).

R> Sys.getpid()
[1] 27813

Maybe you can have your scripts `cat` that value to stdout when they run?

-- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact
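
A minimal sketch of that suggestion, to be placed at the top of each script: it prints the process ID together with the command line R was started with, so both end up in the .Rout file written by R CMD BATCH. The variable names are illustrative, and the exact contents of commandArgs() depend on how R was launched.

```r
# Record which process is running this script.
pid <- Sys.getpid()
args <- commandArgs()

cat("PID:", pid, "\n")
cat("Command line:", paste(args, collapse = " "), "\n")
```

The PID printed here can then be matched against the PID column shown by top.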
Re: [R] Changing the name of the R process in top
Hi Matt, Not sure about renaming it but what about something like: $ ps -aux | grep R each process will have a unique PID that you can use to tell them apart (possibly even the call that started it, but I am not presently in a position to test and do not remember for certain). HTH, Josh -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/
Re: [R] Changing the name of the R process in top
Nice, Steve - I think this will work. I'll just call Sys.getpid() at the top of each session and then look at the .Rout files to figure out which is related to which... -- Matthew C Keller Asst. Professor of Psychology University of Colorado at Boulder www.matthewckeller.com
[R] R on SMP...
Dear all, We would like to use our SMP system, HP Superdome (128 GB, 32 x 2-core Itanium) the best possible way with R, especially to tackle problems involving large matrices. Any suggestion will be most appreciated. H. Nüzhet Dalfes Professor Istanbul Technical University and Deputy Director National Center for High Performance Computing Office Phone: +90 212 285-7125 Mobile: +90 532 206-1308
Re: [R] Normality test
To build on Robert's suggestion (which is very good to begin with), you might consider using the vis.test function in the TeachingDemos package with the vt.qqnorm function. This will create the qq plot of your data along with several other qqplots of normal samples of the same size. If you cannot tell which of the plots is your data, then your data is probably close enough to normal for most practical purposes. It will give you a p-value based on your ability to distinguish your data from random normals if you need one. If you need more precision, then the most precise normality test is SnowsPenultimateNormalityTest also in TeachingDemos. However, the documentation for that function tends to be more useful than the function itself. If you really want to choose among the different normality tests in nortest (or elsewhere) then you should really investigate what assumptions they are making and what types of alternatives they are the most powerful for. Also decide on what types of non-normality you really care about, then use that to choose among them. Consider the 2 distributions where one is uniform between 0 and 1 with height 1; the other also has height 1 between 0 and 0.99, but is also 1 between 999.99 and 1000, zero elsewhere. Are these 2 distributions different in a meaningful way? They have very different mean and variance, but for most samples they will look the same (and if you throw out outliers they will look even more similar). The reason that different tests give different results is because they focus on different types of differences. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Robert Baer Sent: Friday, May 27, 2011 5:28 PM To: Salil Sharma; R-help@r-project.org Subject: Re: [R] Normality test I am writing to inquire about normality test given in nortest package. I have a random data set consisting of 300 samples. 
I am curious about which normality test in R would give me a precise measurement of whether the data sample follows a normal distribution. As the p-value is different in each test, I would be grateful if you could help me identify a suitable test in R for this medium size of data. I am neither a statistician nor an expert on these types of tests, but I'm guessing that you are unlikely to get a good answer even from people with such qualifications, as such judgments can only be made in the context of a specific problem. You have not provided us with such a problem (please read the posting guide). That admonishment aside, I typically start by using qqnorm() and qqline() to plot my data against the expected theoretical quantiles. If your data is perfectly normal, the points will fall right along the line. Skewness and deviations from normal by the tails produce very characteristic patterns in the plots which you can learn about by plotting some simulated data that is left-skewed, right-skewed, long tailed, or short tailed. I personally find this graphical feedback to be a much more useful way to understand my data than doing a single normality test that produces a p-value based upon assumptions I may not be privy to. For more, see the help by typing: ?qqnorm ?qqline Rob -- Robert W. Baer, Ph.D. Professor of Physiology Kirksville College of Osteopathic Medicine A. T. Still University of Health Sciences 800 W. Jefferson St. Kirksville, MO 63501 660-626-2322 FAX 660-626-2965
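
The advice to learn the characteristic Q-Q patterns from simulated data can be sketched as follows. The distributions chosen for each shape are illustrative assumptions, and n = 300 simply matches the sample size mentioned in the question.

```r
set.seed(123)
n <- 300
samples <- list(
  normal       = rnorm(n),
  right_skewed = rexp(n),         # long right tail
  left_skewed  = -rexp(n),        # long left tail
  heavy_tailed = rt(n, df = 3),   # heavier tails than the normal
  short_tailed = runif(n)         # lighter tails than the normal
)

# One normal Q-Q plot per sample, each with its reference line.
op <- par(mfrow = c(2, 3))
for (nm in names(samples)) {
  qqnorm(samples[[nm]], main = nm)
  qqline(samples[[nm]])
}
par(op)
```

Skewed samples bend away from the line on one side, heavy tails give an S-shape that leaves the line at both ends, and short tails give the reverse S-shape.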
Re: [R] Markov Chain model coding
The msm package may be of use to you. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of phillowe Sent: Saturday, May 28, 2011 10:51 AM To: r-help@r-project.org Subject: [R] Markov Chain model coding Can anyone help with coding for Markov chain models? I have a data set from coral reefs in consecutive years that have been given a state depending on their coral and macroalgal cover. The Markov model will be used to predict the cover of coral and macroalgae in the future dependent on the conditions during the observation. Any advice would be much appreciated. Phil -- View this message in context: http://r.789695.n4.nabble.com/Markov-Chain-model-coding-tp3557733p3557733.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Observation in a confidence ellipse
There are point-in-polygon functions in some of the spatial packages that would tell you if a point is within a polygonal approximation of the ellipse. But in your case I would take a different approach. Generally confidence ellipses are based on Mahalanobis distances, so you can just compute the Mahalanobis distance of the point relative to the mean vector (taking the appropriate covariance matrix into account); then if that distance is above a certain value the point is outside the ellipse, and if it is less the point is inside.

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jessica Minkue Sent: Saturday, May 28, 2011 11:45 AM To: r-help@r-project.org Subject: [R] Observation in a confidence ellipse

Hello everyone, I really need some help here. I made a confidence ellipse using the function ellipse from the package ellipse:

ellipse(SD, centre = colMeans(pcsref), t = sqrt((p * (n-1)/(n-p)) * qf(0.99, p, n-p)))

Now I want to write a function that returns TRUE or FALSE depending on whether a given observation is in the confidence ellipse, but I have no clue how to do it. Can anyone help me? Best regards, Jessica
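
A minimal sketch of the Mahalanobis approach, using base R's mahalanobis(). It assumes that the t argument passed to ellipse() is the Mahalanobis distance on the boundary, so that a point is inside when its squared distance is at most t^2. The function name in_ellipse and the example numbers are hypothetical.

```r
# TRUE for each row of x lying inside the ellipse
#   (x - centre)' S^{-1} (x - centre) <= t^2
in_ellipse <- function(x, centre, S, t) {
  x <- matrix(x, ncol = length(centre))
  # mahalanobis() returns the *squared* distance
  mahalanobis(x, center = centre, cov = S) <= t^2
}

# Hypothetical two-dimensional example
centre <- c(0, 0)
S <- matrix(c(2, 0.5, 0.5, 1), nrow = 2)
n <- 50
p <- 2
t <- sqrt((p * (n - 1) / (n - p)) * qf(0.99, p, n - p))

res <- in_ellipse(rbind(c(0, 0), c(10, 10)), centre, S, t)
res
```

With the poster's objects, the call would presumably be in_ellipse(obs, colMeans(pcsref), SD, t), where obs is the observation to test.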
Re: [R] help with barplot
Thanks Thomas. On Sat, May 28, 2011 at 11:29 AM, Thomas Levine thomas.lev...@gmail.com wrote: You can do pretty well without ggplot actually. boxplot(Time~paste(Incidents,Months),data=DF,border=c('grey20','red'))
Re: [R] rqss help in Quantreg
Dear All,

I'm trying to carry out a constrained nonparametric quantile regression analysis for a monthly stock index and GDP (159 observations) using the rqss function of the quantreg package. I need to specify that stock prices are nondecreasing with growing GDP. I tried the following simple code:

fit1 <- rqss(stock ~ gdp)
fit2 <- rqss(stock ~ qss(gdp, constraint = "I") + time)

but R produces an error message. For the first line of the code:

Error in rqss.fit(X, Y, tau = tau, rhs = rhs, nsubmax = nsubmax, nnzlmax = nnzlmax, : object 'rhs' is not found

For the second line of the code:

Error in D %*% B : NA/NaN/Inf when calling external function (argument 7)

If I use returns instead of prices, the analysis runs. But I need to regress prices. What is wrong in my specification? Are there any restrictions in the rqss approach?

-- View this message in context: http://r.789695.n4.nabble.com/rqss-help-in-Quantreg-tp3392770p3557884.html Sent from the R help mailing list archive at Nabble.com.
[R] tm package
Hi, I am using the tm package. When I try to use findfreqterms I get an error message: findfreqterms(dtm,2,5) Error: could not find function "findfreqterms" Obvious things such as loading the tm library and creating a document-term matrix have been covered. I cannot find any dependencies that findfreqterms has. Can anyone help with this? Thanks
[R] Three sigma rule
Dear Sir, I have data, coming from tests, consisting of 300 values. Is there a way in R to check whether this data conforms to the 68-95-99.7 rule, or three-sigma rule? I need to look at percentile ranks and prediction intervals for this data. I used the SixSigma package and its ss.ci() function, which produced 95% confidence intervals, but I am still not certain whether the percentile ranks conform to the 68-95-99.7 rule for this data. Thanks and regards, Salil Sharma
[R] prcomp eigenvectors ... ??
Hi ... Please could you help with what is probably a very simple problem. I'm completely new to R and am trying to follow a tutorial using R for Force Distribution Analysis that I got from http://projects.eml.org/mbm/website/fda_gromacs.htm. The MD simulation I perform outputs a force matrix (.fm). This matrix is then read into R and prcomp is run on it. The tutorial says 'Having run PCA, now is a good time to check the eigenvalue structure.' although it doesn't mention how I actually go about doing that with R. Could anyone tell me how I would be able to check the eigenvalues/eigenvectors? Thanks so much for your help and I'm sorry if this is a stupid question! Natalie
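
For reference, a minimal sketch of how the eigenvalue structure can be read off a prcomp() result. The random matrix X is only a placeholder for the force matrix from the tutorial, and the variable names are illustrative.

```r
set.seed(1)
X <- matrix(rnorm(200), ncol = 4)   # placeholder for the force matrix

pc <- prcomp(X)

eigenvalues  <- pc$sdev^2     # variances of the principal components
eigenvectors <- pc$rotation   # one eigenvector (loading vector) per column

summary(pc)                    # proportion of variance explained
screeplot(pc, type = "lines")  # visual check of the eigenvalue structure
```

The squared standard deviations returned by prcomp() are the eigenvalues of the covariance matrix of the centred data, so a steep drop in the scree plot indicates that a few components carry most of the variance.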
Re: [R] help! complete the reviewer's suggest: carry out GA+GP (gaussian process)!
I do not know about what you are asking. But your thread was helpful for me to work out some libsvm codes. Thank you for your help. Maybe we can interact for more productive work. -- View this message in context: http://r.789695.n4.nabble.com/help-complete-the-reviewer-s-suggest-carry-out-GA-GP-gaussian-process-tp3229097p3557876.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] tm package
On May 28, 2011, at 2:21 PM, lloyd barcza wrote:
> Hi, I am using the tm package. When I try to use findfreqterms I get an error message:
> findfreqterms(dtm,2,5)
> Error: could not find function "findfreqterms"

You are spelling it incorrectly.

> Obvious thing such as calling the tm library and a creating document term matrix have been covered. I cannot find any dependencies that findfreqterms has. Can anyone help with this?

-- David Winsemius, MD West Hartford, CT
Re: [R] Three sigma rule
On May 28, 2011, at 2:12 PM, Salil Sharma wrote:
> Dear Sir, I have data, coming from tests, consisting of 300 values. Is there a way in R with which I can confirm this data to 68-95-99.8 rule or three-sigma rule?

Can you describe this rule? I get the idea that it might be private language adopted by the SixSigma sect.

> I need to look around percentile ranks and prediction intervals for this data. I, however, used SixSigma package and used ss.ci() function, which produced 95% confidence intervals. I still am not certain about percentile ranks conforming to 68-95-99.7 rule for this data.

The quantile function is pretty much standard operating procedure. fivenum will return the values that would appear in a box-and-whiskers plot.

-- David Winsemius, MD West Hartford, CT
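
As a rough illustration of how quantile() and fivenum() could be used here, the sketch below compares the share of observations within 1, 2 and 3 standard deviations of the mean with the 68-95-99.7 figures. The simulated vector x is only a stand-in for the 300 test values, and the variable names are illustrative.

```r
set.seed(2011)
x <- rnorm(300, mean = 50, sd = 10)   # stand-in for the 300 test values

# Share of observations within k standard deviations of the mean.
z <- (x - mean(x)) / sd(x)
coverage <- sapply(1:3, function(k) mean(abs(z) <= k))
names(coverage) <- paste0("within_", 1:3, "_sd")
coverage   # compare with 0.68, 0.95, 0.997

# Empirical percentiles matching the same three intervals, plus the median.
quantile(x, probs = c(0.0015, 0.025, 0.16, 0.5, 0.84, 0.975, 0.9985))

# Minimum, lower hinge, median, upper hinge, maximum.
fivenum(x)
```

Large departures of the observed shares from 0.68, 0.95 and 0.997 suggest the data are not close to normal; with only 300 values the 99.7% figure is determined by a handful of points, so it should be read loosely.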
[R] Installing package rgdal
Dear R-helpers, I am trying to install the package rgdal using the command install.packages("rgdal") in R. I get the following error: Error: proj_api.h not found. If the PROJ.4 library is installed in a non-standard location, use --configure-args='--with-proj-include=/opt/local/include' for example, replacing /opt/local/* with appropriate values for your installation. If PROJ.4 is not installed, install it. ERROR: configuration failed for package ‘rgdal’ * removing ‘/home/alex/R/i686-pc-linux-gnu-library/2.11/rgdal’ The downloaded packages are in ‘/tmp/RtmpVDlld6/downloaded_packages’ Warning message: In install.packages("rgdal") : installation of package 'rgdal' had non-zero exit status I am using Ubuntu 10.10 and have installed PROJ.4 using the software center. I have also downloaded a binary of PROJ.4 and placed it in /opt/local/ I have looked at the documentation for rgdal here - http://cran.r-project.org/web/packages/rgdal/index.html But it seems to be about using rgdal as opposed to installing it. Any help would be appreciated. Kind regards, Alex
Re: [R] Installing package rgdal
Alex - Notice that the error message says Error: proj_api.h not found. Files ending with a suffix of .h are known as header files, and on Ubuntu, they are distributed in so-called development packages, which generally end with -dev . So I'm guessing that you installed the proj-bin package, and now you need to install the libproj-dev package. You'll see the same thing for any R package that needs to be built against an external library. Installing the library or binary is not enough -- to build the R package you need the development files. Hope this helps. - Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spec...@stat.berkeley.edu On Sun, 29 May 2011, Alex Olssen wrote: Dear R-helpers, I am trying to install the package rgdal using the command install.packages(rgdal) in R. I get the following error Error: proj_api.h not found. If the PROJ.4 library is installed in a non-standard location, use --configure-args='--with-proj-include=/opt/local/include' for example, replacing /opt/local/* with appropriate values for your installation. If PROJ.4 is not installed, install it. ERROR: configuration failed for package ‘rgdal’ * removing ‘/home/alex/R/i686-pc-linux-gnu-library/2.11/rgdal’ The downloaded packages are in ‘/tmp/RtmpVDlld6/downloaded_packages’ Warning message: In install.packages(rgdal) : installation of package 'rgdal' had non-zero exit status I am using ubuntu 10.10 and have installed PROJ.4 using the software center. I have also downloaded a binary of PROJ.4 and placed it in /opt/local/ I have looked at the documentation for rgdal here - http://cran.r-project.org/web/packages/rgdal/index.html But it seems to be about using rgdal as opposed to installing it. Any help would be appreciated. Kind regards, Alex __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
Re: [R] Observation in a confidence ellipse
On 29/05/11 05:45, Jessica Minkue wrote:

> Hello everyone
>
> I really need some help here. I made a confidence ellipse using the function ellipse from the package ellipse:
>
>     ellipse(SD, centre=colMeans(pcsref), t=sqrt((p * (n-1)/(n-p))*qf(0.99, p, n-p)))
>
> Now, I want to write a function that returns TRUE or FALSE depending on whether a given observation is in the confidence ellipse. But I have no clue how to do it.
>
> Can anyone help me?

First of all, you are probably way off base talking about ***confidence*** ellipses here. If you are testing whether observations are inside the ellipses, then you are most likely interested in ***prediction*** ellipses. It is vital that you understand the difference.

But to answer your question: It would be easy enough to do it from scratch. Let your ellipse be defined by

    (x - mu)' M (x - mu) = c

where mu is the ``centre'', x is a 2-vector (x_1,x_2)', M is a positive definite matrix, c is a positive constant, and ' denotes ``transpose''. Then a point x = (x_1,x_2)' is inside the ellipse if and only if

    (x - mu)' M (x - mu) <= c.

Coding this up is an easy exercise. If you can't do it, you probably shouldn't be messing with this stuff in the first place.

However if you want to use a sledge-hammer to crack a peanut, install the spatstat package and then do:

    require(spatstat)
    W <- owin(poly=<your ellipse>)
    inside.owin(x1,x2,W) # Where x1 and x2 are the x and y coordinates
                         # of the points you wish to test.

cheers,

Rolf Turner
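The from-scratch test described above might be sketched as follows. The function name `in_ellipse` and its arguments are hypothetical, and the sketch assumes the ellipse was drawn from a covariance-type matrix S, so that M = solve(S) and c is the square of the `t` value passed to `ellipse()`; check that assumption against how your ellipse was actually constructed.

```r
# Hypothetical helper (not from the thread): TRUE for each point that lies
# inside or on the ellipse (x - mu)' M (x - mu) = crit, with M = solve(S).
in_ellipse <- function(x, mu, S, crit) {
  # stats::mahalanobis() returns the squared distance
  # (x - mu)' S^{-1} (x - mu) for each row of x
  mahalanobis(x, center = mu, cov = S) <= crit
}

# Toy check on the unit circle: S is the identity and crit = 1
mu  <- c(0, 0)
S   <- diag(2)
pts <- rbind(c(0.5, 0.5),   # squared distance 0.5, inside
             c(1,   1))     # squared distance 2, outside
in_ellipse(pts, mu, S, crit = 1)
```

Using `mahalanobis()` avoids writing the quadratic form by hand and tests many observations at once, one per row of `x`.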
Re: [R] Markov Chain model coding
On 29/05/11 04:50, phillowe wrote:

> Can anyone help with coding for Markov chain models? I have a data set from coral reefs in consecutive years that have been given a state depending on their coral and macroalgal cover. The Markov model will be used to predict the cover of coral and macroalgae in the future, dependent on the conditions during the observation. Any advice would be much appreciated.

Have a look at the CRAN packages:

  * msm
  * HiddenMarkov
  * hmm.discnp

cheers,

Rolf Turner
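Before reaching for one of those packages, a plain first-order, discrete-time Markov chain can be sketched in base R. Everything below is an illustrative assumption rather than anything from the thread: the state labels, the made-up yearly sequence for a single site, and the helper `project` are all hypothetical, and the packages listed above fit more general models than this.

```r
# Made-up yearly states for one reef site (toy data, for illustration only)
states <- c("coral", "coral", "mixed", "algae",
            "algae", "mixed", "coral", "coral")
lev <- c("coral", "mixed", "algae")

# Pair each year's state with the next year's state
from <- factor(head(states, -1), levels = lev)
to   <- factor(tail(states, -1), levels = lev)

# Transition counts, then row-normalise to estimated transition probabilities
counts <- table(from, to)
P <- unclass(prop.table(counts, margin = 1))

# Hypothetical helper: state distribution after n years, starting from p0
project <- function(p0, P, n) {
  p <- p0
  for (i in seq_len(n)) p <- p %*% P
  drop(p)
}

P
project(c(coral = 1, mixed = 0, algae = 0), P, n = 5)
```

With real data the counts would be pooled over all sites, and a state that is never left in the observed data would give a row of NaN, which needs handling before projecting.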
[R] Setting max. iterations for lmer
Hello,

I hate to ask a question which is directly addressed in the documentation, but can someone please give me an example of how to change the maximum number of iterations used by lmer? I'm having a hard time understanding this:

    control: a list of control parameters. See below for details.

    control: a named list of control parameters for the estimation
    algorithm, specifying only the ones to be changed from their
    default values. Hence defaults to an empty list. Possible control
    options and their default values are:

      msVerbose: a logical value passed as the trace argument to
        nlminb (see documentation on that function). Default is
        getOption("verbose").

      maxIter: a positive integer passed as the maxIter argument to
        nlminb (see documentation on that function). Default is 300.

      maxFN: a positive integer specifying the maximum number of
        evaluations of the deviance function allowed during the
        optimization. Default is 900.

Thank you in advance,
Mitch
Re: [R] Setting max. iterations for lmer
Hi Mitch,

Something like:

    m <- lmer(Reaction ~ Days + (1|Subject) + (0+Days|Subject), sleepstudy, control = list(maxIter = 1))

Hope this helps,

Josh

On Sat, May 28, 2011 at 6:05 PM, Downey, Patrick pdow...@urban.org wrote:

> Hello,
>
> I hate to ask a question which is directly addressed in the documentation, but can someone please give me an example of how to change the maximum number of iterations used by lmer? I'm having a hard time understanding this:
>
>     control: a list of control parameters. See below for details.
>
>     control: a named list of control parameters for the estimation
>     algorithm, specifying only the ones to be changed from their
>     default values. Hence defaults to an empty list. Possible control
>     options and their default values are:
>
>       msVerbose: a logical value passed as the trace argument to
>         nlminb (see documentation on that function). Default is
>         getOption("verbose").
>
>       maxIter: a positive integer passed as the maxIter argument to
>         nlminb (see documentation on that function). Default is 300.
>
>       maxFN: a positive integer specifying the maximum number of
>         evaluations of the deviance function allowed during the
>         optimization. Default is 900.
>
> Thank you in advance,
> Mitch

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/