Re: [R] hierarchical clustering of large dataset
On Fri, Mar 09, 2012 at 08:26:01PM -0500, Massimo Di Stefano wrote:
> My target is to have 'groups of species' based on the similarity of their environmental parameters, and to build a dendrogram like [2].
>
> [2] http://massimo-timecapsule.whoi.edu//data/img/manova_clust_matlab.png
>
> On Mar 9, 2012, at 7:18 PM, Peter Langfelder wrote:
> > Well, you didn't say that column e was a label that you wanted to keep separate. Any other labels in the data? You may not want to use labels in the distance calculation.

If you want to use the results of the cluster analysis as evidence of similarities and differences between species, you _must_ not include numeric variables representing labels in the matrix. Including them would mean imposing the expected result onto the data.

First do the cluster analysis, then test the distribution of species in clusters.

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
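
A minimal sketch of the "cluster first, compare with the labels afterwards" advice. The data frame `env`, its column names, and the choice of two clusters are all hypothetical and only stand in for the poster's data; the distance is computed from the environmental columns alone.

```r
# Hypothetical data: 12 sites, two environmental parameters, one species label.
set.seed(1)
env <- data.frame(
  temperature = c(rnorm(6, mean = 10), rnorm(6, mean = 20)),
  salinity    = c(rnorm(6, mean = 30), rnorm(6, mean = 35)),
  species     = rep(c("sp_a", "sp_b"), each = 6)
)

# The label column is deliberately left out of the distance calculation.
d  <- dist(scale(env[, c("temperature", "salinity")]))
hc <- hclust(d, method = "average")
groups <- cutree(hc, k = 2)          # k = 2 is an arbitrary choice for the sketch

# Only now is the label used: how are the species distributed over clusters?
tab <- table(species = env$species, cluster = groups)
print(tab)
p_value <- fisher.test(tab)$p.value  # one possible test of that distribution

# plot(hc, labels = env$species)     # dendrogram, labelled after the fact
```

Whether Fisher's exact test is the right follow-up depends on the real data; it is used here only because the toy table is 2 x 2.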
Re: [R] Issues in installing rgl in Mac OS 10.6.8
Please ask about OS X on R-sig-mac.

There is something you have not installed on your OS, but it will probably take several rounds to find out what (and it will be not just Mac-specific but will depend on the exact versions of OS X (which you told us) and Xcode (which you did not)).

On Fri, 9 Mar 2012, A Ezhil wrote:
> Dear All,
>
> I am trying to install rgl on my Mac notebook from the source file. I tried using:
>
>   /usr/bin/R64 CMD INSTALL rgl_0.92.798.tar.gz
>
> and get the following error message:
>
>   checking for X... no
>   configure: error: X11 not found but required, configure aborted.
>   ERROR: configuration failed for package ‘rgl’
>   * removing ‘/Library/Frameworks/R.framework/Versions/2.14/Resources/library/rgl’
>   * restoring previous ‘/Library/Frameworks/R.framework/Versions/2.14/Resources/library/rgl’
>
> I do see a directory X11 installed under /usr, and Sys.getenv("PATH") inside R gives me:
>
>   [1] "/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin"
>
> Could you please help me to install the rgl package?
>
> sessionInfo()
>   R version 2.14.1 (2011-12-22)
>   Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>   locale:
>   [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
>   attached base packages:
>   [1] stats graphics grDevices utils datasets methods base
>
> Thanks in advance.
> Kind regards,
> Ezhil

-- 
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] round giving different results on Windows and Mac
On Fri, Mar 09, 2012 at 09:34:14PM +0000, Ruth Ripley wrote:
> Dear all,
>
> I have been running some tests of my package RSiena on different platforms and trying to reconcile the results. With Mac, the commands
>
>   options(digits=4)
>   round(1.81652, digits=4)
>
> print 1.817. With Windows, the same commands print 1.816.
>
> I am not bothered which answer I get, but it would be nice if they were the same. A Linux box agreed with the Mac.

Hi.

I obtain the same difference between Linux (1.817) and 32-bit Windows (1.816). As Duncan said, the number 1.8165 is not exactly representable, and printing it to 4 significant digits may depend on the platform, since it is a middle case.

Note that options(digits=4) means rounding to 4 significant digits, while round(1.81652, digits=4) rounds to 4 digits in the fractional part. Try signif(1.81652, digits=4) to get the same type of rounding as in options(digits=4).

The problem is not in round(), since

  x <- round(1.81652, digits=4)
  print(x, digits=20)
  print(x, digits=4)

yields on Linux

  [1] 1.8165000000000000036
  [1] 1.817

and on 32-bit Windows

  [1] 1.8165000000000000036
  [1] 1.816

The difference is not due to R, since R is responsible only for the choice of the number of printed digits and not for the digits themselves. The digits are computed by sprintf() on the given platform, so the difference seems to be there. The command

  sprintf("%5.3f", 18165/10000)

yields on Linux

  [1] "1.817"

and on 32-bit Windows

  [1] "1.816"

Thank you for the example.

Petr Savicky.
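
A small sketch of the distinction drawn above between round() (digits after the decimal point) and signif() (significant digits). It assumes nothing beyond base R; the variable names are illustrative, and the sprintf() lines are there only to look at more of the stored value than the default print shows. Their exact output is platform-dependent, which is the point of the thread, so no expected output is given for them.

```r
x <- 1.81652

r <- round(x, digits = 4)    # keeps 4 digits in the fractional part
s <- signif(x, digits = 4)   # keeps 4 significant digits

# Inspect the stored doubles with more digits than print() uses by default.
sprintf("%.20f", r)
sprintf("%.20f", s)

# s is no longer a halfway case at 4 significant digits,
# so formatting it does not depend on how ties are broken.
format(s, digits = 4)
```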
[R] Treat Variable as String and a String as variables name
Dear all,

I have ten variables (let's call four of them Alpha, Beta, Gamma and Delta). For each variable I have to print around 100 plots. So far I have been copying and pasting the code below many times:

  pdf(file="DC_Alpha_All.pdf", width=15)   # first, the variable name is used as a string
  plot_dc_for_multiple_kapas(Alpha, 4, c(5, 4), coloridx=c(24, 32))   # now the variable itself is passed to the function
  dev.off()

I could save time if I could write a function that produces the right number of plots for every variable. The problem, as you can see from the comments above, is that the variable name has to be used as a string in the first line and as an actual variable in the second.

How can I write a loop in R that, for a list of variables (the ten I mentioned at the beginning), treats each entry once as a string and once as a real variable?

Could you please help me with that?

Best regards,
Alex
Re: [R] Issues in installing rgl in Mac OS 10.6.8
On Mar 10, 2012, at 09:57, Prof Brian Ripley wrote:
> Please ask about OS X on R-sig-mac.

Yep. (Or R-devel for generic developer issues, but this one is pretty OS X specific.)

> There is something you have not installed on your OS, but it will probably need several rounds to find what (and it will be not just Mac-specific but depend on the exact versions of OS X (which you told us) and Xcode (which you did not)).

It is certainly non-trivial to install this particular package from source. Is there any reason you don't want to use the precompiled version from CRAN? I mean, it is all well and good that more people do source builds, so that we don't end up with a situation where only one or two people actually know how to build stuff, but it might not be the most productive route if you actually need to get things done...

> On Fri, 9 Mar 2012, A Ezhil wrote:
> > Dear All,
> >
> > I am trying to install rgl on my Mac notebook from the source file. I tried using:
> >
> >   /usr/bin/R64 CMD INSTALL rgl_0.92.798.tar.gz
> >
> > and get the following error message:
> >
> >   checking for X... no
> >   configure: error: X11 not found but required, configure aborted.
> >   ERROR: configuration failed for package ‘rgl’
> >   * removing ‘/Library/Frameworks/R.framework/Versions/2.14/Resources/library/rgl’
> >   * restoring previous ‘/Library/Frameworks/R.framework/Versions/2.14/Resources/library/rgl’
> >
> > I do see a directory X11 installed under /usr, and Sys.getenv("PATH") inside R gives me:
> >
> >   [1] "/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin"
> >
> > Could you please help me to install the rgl package?
> >
> > sessionInfo()
> >   R version 2.14.1 (2011-12-22)
> >   Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
> >   locale:
> >   [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
> >   attached base packages:
> >   [1] stats graphics grDevices utils datasets methods base
> >
> > Thanks in advance.
> > Kind regards,
> > Ezhil
>
> -- 
> Brian D. Ripley, rip...@stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk Priv: pda...@gmail.com
Re: [R] Treat Variable as String and a String as variables name
On 10-03-2012, at 10:39, Alaios wrote:
> Dear all,
>
> I have ten variables (let's call four of them Alpha, Beta, Gamma and Delta). For each variable I have to print around 100 plots. So far I have been copying and pasting the code below many times:
>
>   pdf(file="DC_Alpha_All.pdf", width=15)   # first, the variable name is used as a string
>   plot_dc_for_multiple_kapas(Alpha, 4, c(5, 4), coloridx=c(24, 32))   # now the variable itself is passed to the function
>   dev.off()
>
> I could save time if I could write a function that produces the right number of plots for every variable. The problem, as you can see from the comments above, is that the variable name has to be used as a string in the first line and as an actual variable in the second.
>
> How can I write a loop in R that, for a list of variables (the ten I mentioned at the beginning), treats each entry once as a string and once as a real variable?

Something like this:

  varlist <- LETTERS[1:10]
  varlist

  # create ten example variables named A, B, ..., J
  for( k in 1:length(varlist) ) assign(varlist[k], runif(10))
  varlist

  myplot <- function(x,k) plot(x,col=k)

  for( k in 1:length(varlist) ) {
      varname <- varlist[k]
      filename <- paste("DC_",varname,"_All.pdf", sep="")   # the name used as a string
      pdf(file=filename, width=15)
      myplot(get(varlist[k]),k)                               # the name looked up as a variable
      dev.off()
  }

Berend
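
As an alternative to assign()/get(), the same string-and-value pairing can be sketched with a named list, where the name supplies the string and the element supplies the data. This is not what was posted in the thread; the list `vars`, the use of tempdir(), and the plain plot() call standing in for the poster's own plotting function are all assumptions made for illustration.

```r
# Hypothetical stand-ins for the real variables: a named list of numeric vectors.
vars <- list(Alpha = runif(10), Beta = runif(10), Gamma = runif(10))

out_dir <- tempdir()        # assumption: write the PDFs to a temporary directory
files <- character(0)

for (name in names(vars)) {
  # the list name is used as a string for the file name ...
  filename <- file.path(out_dir, paste("DC_", name, "_All.pdf", sep = ""))
  pdf(file = filename, width = 15)
  # ... and as a key to fetch the data; plot() is a placeholder for the real function
  plot(vars[[name]], main = name)
  dev.off()
  files <- c(files, filename)
}

basename(files)
```

Keeping the data in one list avoids creating and looking up variables by name in the global environment, at the cost of restructuring how the variables are stored.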
Re: [R] help please. 2 tables, which test?
Thank you for the replies. What my test wants to do is this:

I have a big matrix, 30 rows (students in a class) x 50 columns (the students' grades for the year). An example of the matrix:

                 grade1  grade2  grade3  ...  grade50
  student 1
  student 2***
  student 3
  student 4***
  student 5***
  student 6
  ...
  student 30***

As you can see, four students (students 2, 4, 5 and 30) have stars beside their names. I chose these students based on a particular characteristic that they all share. I then pulled these students out to make a new table:

                 grade1  grade2  grade3  ...  grade50
  student 2
  student 4
  student 5
  student 30

What I want to see is basically whether there is any difference between the grades this particular set of students (i.e. students 2, 4, 5 and 30) got and those of the class as a whole. So my null hypothesis is that there is no difference between this set of students' grades and what you would expect from the class as a whole.

Aaral

On Sat, Mar 10, 2012 at 12:18 AM, Greg Snow 538...@gmail.com wrote:
> Just what null hypothesis are you trying to test, or what question are you trying to answer, by comparing 2 matrices of different size?
>
> I think you need to figure out what your real question is before worrying about which test might work on it. Trying to get your data to fit a given test rather than finding the appropriate test or other procedure to answer your question is like buying a new suit and then having plastic surgery to make you fit the suit, rather than having the tailor modify the suit to fit you.
>
> If you can give us more information about what your question is, we have a better chance of actually helping you.
>
> On Fri, Mar 9, 2012 at 9:46 AM, aoife doherty aaral.si...@gmail.com wrote:
> > Thank you. Can the chi-squared test compare two matrices that are not the same size, e.g. if matrix 1 is a 2 x 4 table and matrix 2 is a 3 x 5 matrix?
> >
> > On Fri, Mar 9, 2012 at 4:37 PM, Greg Snow 538...@gmail.com wrote:
> > > The chi-squared test is one option (and seems reasonable to me if it is the proportions/patterns that you want to test). One way to do the test is to combine your 2 matrices into a 3-dimensional array (the abind package may help here) and test using the loglin function.
> > >
> > > On Thu, Mar 8, 2012 at 5:46 AM, aaral singh aaral.si...@gmail.com wrote:
> > > > Hi. Please help if someone can.
> > > >
> > > > Problem: I have 2 matrices, e.g.
> > > >
> > > > matrix 1:
> > > >          Freq None Some
> > > >   Heavy32 5
> > > >   Never8 13 8
> > > >   Occas14 4
> > > >   Regul 95 7
> > > >
> > > > matrix 2:
> > > >          Freq None Some
> > > >   Heavy7 1 3
> > > >   Never 87 18 84
> > > >   Occas 12 34
> > > >   Regul917
> > > >
> > > > I want to see if matrix 1 is significantly different from matrix 2. I am considering a chi-squared test. Is this appropriate? Could anyone advise?
> > > >
> > > > Many thanks.
> > > > Aaral Singh
> > >
> > > -- 
> > > Gregory (Greg) L. Snow Ph.D.
> > > 538...@gmail.com
>
> -- 
> Gregory (Greg) L. Snow Ph.D.
> 538...@gmail.com
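
The suggestion to stack two tables into a 3-dimensional array and test with loglin() could look roughly like the sketch below. The counts are hypothetical (the tables in the thread did not survive intact), base array() is used in place of the abind package to keep the example self-contained, and the model chosen, "the row-by-column pattern is the same in both tables", is one reading of the question rather than the only possible one. Note that stacking only works when both tables have the same dimensions and categories.

```r
# Hypothetical counts for two 4 x 3 tables with identical categories.
rows <- c("Heavy", "Never", "Occas", "Regul")
cols <- c("Freq", "None", "Some")
m1 <- matrix(c(12, 30,  9, 14,
                5, 20,  6,  4,
                8, 25,  7, 11), nrow = 4, dimnames = list(rows, cols))
m2 <- matrix(c(10, 40, 12,  9,
                6, 18,  5,  7,
                9, 35,  8, 10), nrow = 4, dimnames = list(rows, cols))

# Stack the tables along a third dimension (abind::abind would also do this).
arr <- array(c(m1, m2), dim = c(4, 3, 2),
             dimnames = list(row = rows, col = cols, table = c("t1", "t2")))

# Model: the joint row x col distribution does not depend on the table.
fit <- loglin(arr, margin = list(c(1, 2), 3), fit = FALSE, print = FALSE)

# Likelihood-ratio statistic, degrees of freedom, and its chi-squared p-value.
p_value <- pchisq(fit$lrt, df = fit$df, lower.tail = FALSE)
c(lrt = fit$lrt, df = fit$df, p = p_value)
```

A large p-value here would mean the data give no evidence that the two tables differ in pattern; it says nothing about tables of different shapes, which cannot be stacked this way.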
Re: [R] Issues in installing rgl in Mac OS 10.6.8
On Fri, Mar 09, 2012 at 04:52:31PM -0800, A Ezhil wrote:
> Dear All,
>
> I am trying to install rgl on my Mac notebook from the source file. I tried using:
>
>   /usr/bin/R64 CMD INSTALL rgl_0.92.798.tar.gz
>
> and get the following error message:
>
>   checking for X... no
>   configure: error: X11 not found but required, configure aborted.
>   ERROR: configuration failed for package ‘rgl’
>   * removing ‘/Library/Frameworks/R.framework/Versions/2.14/Resources/library/rgl’
>   * restoring previous ‘/Library/Frameworks/R.framework/Versions/2.14/Resources/library/rgl’
>
> I do see a directory X11 installed under /usr, and Sys.getenv("PATH") inside R gives me:
>
>   [1] "/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin"
>
> Could you please help me to install the rgl package?

Not really, but I can offer a hint: I think your system has the _runtime_ libraries for X11 (in /usr/X11), but you need the _development_ libraries to compile rgl.

I have no knowledge of Mac OS, but on my system, Debian GNU/Linux, the libraries needed to build rgl from source are:

  libgl1-mesa-dev
  libglu1-mesa-dev
  mesa-common-dev

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net
Re: [R] Issues in installing rgl in Mac OS 10.6.8
On 10-03-2012, at 12:49, Hans Ekbrand wrote:
> On Fri, Mar 09, 2012 at 04:52:31PM -0800, A Ezhil wrote:
> > Dear All,
> >
> > I am trying to install rgl on my Mac notebook from the source file. I tried using:
> >
> >   /usr/bin/R64 CMD INSTALL rgl_0.92.798.tar.gz
> >
> > and get the following error message:
> >
> >   checking for X... no
> >   configure: error: X11 not found but required, configure aborted.
> >   ERROR: configuration failed for package ‘rgl’
> >   * removing ‘/Library/Frameworks/R.framework/Versions/2.14/Resources/library/rgl’
> >   * restoring previous ‘/Library/Frameworks/R.framework/Versions/2.14/Resources/library/rgl’
> >
> > I do see a directory X11 installed under /usr, and Sys.getenv("PATH") inside R gives me:
> >
> >   [1] "/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin"
> >
> > Could you please help me to install the rgl package?
>
> Not really, but I can offer a hint: I think your system has the _runtime_ libraries for X11 (in /usr/X11), but you need the _development_ libraries to compile rgl.
>
> I have no knowledge of Mac OS, but on my system, Debian GNU/Linux, the libraries needed to build rgl from source are:
>
>   libgl1-mesa-dev
>   libglu1-mesa-dev
>   mesa-common-dev

That's for Linux systems, not for Mac OS X.

One of the many possibilities is that your X11 has been corrupted or trashed. You could try reinstalling X11 (from the install disc).

Berend
[R] Window on a vector
Dear all,

I have a large vector (let's call it myVector) and I want to plot its values with the logic below:

  yaxis <- myVector[1]
  yaxis <- c(yaxis, mean(myVector[2:3]))
  yaxis <- c(yaxis, mean(myVector[4:8]))
  yaxis <- c(yaxis, mean(myVector[9:16]))
  yaxis <- c(yaxis, mean(myVector[17:32]))
  ...
  yaxis <- c(yaxis, mean(myVector[1024:2048]))

This has to stop when the next window will not find the corresponding number of elements; otherwise it will stop with an error.

How can I do something like that in R?

I would like to thank you in advance for your help.

B.R.
Alex
Re: [R] problem with effects : 'subscript out of bounds'
Dear Nicole,

Sorry, I didn't notice the earlier messages in this thread. Please see below.

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Nicole Marie Ford
> Sent: March-10-12 1:20 AM
> To: r-help
> Subject: Re: [R] problem with effects : 'subscript out of bounds'
>
> If that is not specific (or not general) enough:
>
>   newDV <- dat$DV   ## newDV is my DV; it is continuous
>   newDV <- as.numeric(newDV)-5
>   str(newDV)
>
> (I had to do a great deal of coding here, so I am snipping down to the end.)
>
>   tmp[which(dat$v1 == "stuff" & dat$v2 == "more stuff")] <- "lots of stuff"
>   tmp <- factor(tmp, levels=c("la", "la la", "fa la la"))
>   dat$v3 <- tmp
>   newIV <- as.factor(dat$v3)
>
> newIV is my IV, a factor as you can see.
>
>   n.var4 <- dat$v4   ## control
>   n.var5 <- dat$v5   ## control
>
> (There are others, but they were coded the same way.)
>
>   n.mod1 <- lm(newDV ~ newIV + v4 + v5 + v6 + v7 + v8 + v9, data=dat)   ### linear model
>
> All of these variables are already specific to the dataset, which I called 'norway', so there is no need to specify it in the model.
>
>   summary(n.mod1)
>   plot(effect("newIV", n.mod1), multiline=T)
>
>   Error in plot(effect("nor.trust", n.mod1), multiline = T) :
>     error in evaluating the argument 'x' in selecting a method for function 'plot':
>     Error in apply(mod.matrix[, components], 1, prod) : subscript out of bounds

This seems very odd: the command given is plot(effect("newIV", n.mod1), multiline=T), but the error message is apparently for plot(effect("nor.trust", n.mod1), multiline = T), and nor.trust isn't a variable in the model. What command did you execute?

Although it's not relevant to the error, there's no point in setting multiline=TRUE for a model without interactions.

Best,
John

John Fox
Senator William McMaster Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox

> This ran perfectly on my previous dataset, so I am unsure of the issue.
>
> Thanks in advance.
Re: [R] Window on a vector
On Mar 10, 2012, at 7:44 AM, Alaios wrote:
> Dear all,
>
> I have a large vector (let's call it myVector) and I want to plot its values with the logic below:
>
>   yaxis <- myVector[1]
>   yaxis <- c(yaxis, mean(myVector[2:3]))
>   yaxis <- c(yaxis, mean(myVector[4:8]))
>   yaxis <- c(yaxis, mean(myVector[9:16]))
>   yaxis <- c(yaxis, mean(myVector[17:32]))
>   ...
>   yaxis <- c(yaxis, mean(myVector[1024:2048]))
>
> This has to stop when the next window will not find the corresponding number of elements; otherwise it will stop with an error.
>
> How can I do something like that in R?

This will generate two series that are somewhat like your index specification. I say "somewhat" because you appear to have changed the indexing strategy in the middle: you start with 2^0, 2^1 and 2^2, but then switch to 2^3+1 and 2^4+1.

  n = 20
  cbind(2^(0:(n-1)), 2^(1:n)-1)

You can decide what to use for n with logic like:

  which.max(20 <= 2^(1:10))

Then you can use sapply or mapply.

> Alex
>
> [[alternative HTML version deleted]]

Please learn to post in plain text.

-- 
David Winsemius, MD
West Hartford, CT
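
Putting the pieces of that reply together, a minimal sketch might look as follows. It assumes the consistent windows 1, 2:3, 4:7, 8:15, ... (start 2^(k-1), end 2^k - 1) rather than the mixed indices in the question, and it simply drops any trailing elements that do not fill a complete window. The data and the way `n` is computed are illustrative choices, not taken from the thread.

```r
myVector <- seq_len(20)                  # hypothetical data

# Number of complete windows: the k-th window ends at 2^k - 1,
# so we need 2^k - 1 <= length(myVector).
n <- floor(log2(length(myVector) + 1))

starts <- 2^(0:(n - 1))                  # 1, 2, 4, 8, ...
ends   <- 2^(1:n) - 1                    # 1, 3, 7, 15, ...

# One mean per window; mapply walks over the start/end pairs.
yaxis <- mapply(function(a, b) mean(myVector[a:b]), starts, ends)

cbind(starts, ends, yaxis)
```

With 20 elements this uses four windows and ignores elements 16 to 20, since the fifth window (16:31) would run past the end of the vector.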
Re: [R] rgl: cylinder3d() with elliptical cross-section
My first reply to this went privately, by accident. I've done a little editing to it, but mainly this is for the archives.

On 12-03-09 2:36 PM, Michael Friendly wrote:
> For a paper dealing with generalized ellipsoids, I want to illustrate in 3D an ellipsoid that is unbounded in one dimension, having the shape of an infinite cylinder along, say, z, but whose cross-section in (x,y) is an ellipse, say, given by the 2x2 matrix cov(x,y).
>
> I've looked at rgl:::cylinder3d, but don't see any way to make it accomplish this. Does anyone have any ideas?

rgl has no way to display curved surfaces that are unbounded. (It has lines and planes that adapt to the viewport.) So you would need to make a finite cylinder, and it will be up to you to choose how big to make it.

The cylinder3d() function can do that, but it's not very good at cylinders that are straight. (This is a little embarrassing...) It sets up a local coordinate system based on the curvature, but if there are no curves, it fails, and you have to supply your own coordinates. So here's how I would do what you want:

  library(rgl)

  center <- cbind(0, 0, 1:10)         # cylinder centered on points (0, 0, z)
  e2 <- cbind(1, 0, rep(0, 10))       # define the normal vectors
  cyl <- cylinder3d(center, e2=e2)

  # Now you have an octagonal cylinder. Use the sides arg to cylinder3d
  # if it doesn't end up smooth enough, but in most cases I've seen, 8
  # is sufficient.

  # Define a transformation to the x and y coordinates to give the
  # elliptical shape; use it as the top left 2x2 matrix of a 3x3 matrix
  xfrm <- matrix( c(2, 1, 0,
                    1, 3, 0,
                    0, 0, 1), 3,3, byrow=TRUE)
  cyl <- transform3d(cyl, xfrm)
  cyl <- addNormals(cyl)              # this makes it shade smoothly
  shade3d(cyl, col="green")
  decorate3d()                        # show some axes for scale

Duncan Murdoch
Re: [R] Treat Variable as String and a String as variables name
Thanks a lot, works great :)

Alex

From: Berend Hasselman b...@xs4all.nl
Cc: R help R-help@r-project.org
Sent: Saturday, March 10, 2012 11:19 AM
Subject: Re: [R] Treat Variable as String and a String as variables name

> On 10-03-2012, at 10:39, Alaios wrote:
> > Dear all,
> >
> > I have ten variables (let's call four of them Alpha, Beta, Gamma and Delta). For each variable I have to print around 100 plots. So far I have been copying and pasting the code below many times:
> >
> >   pdf(file="DC_Alpha_All.pdf", width=15)   # first, the variable name is used as a string
> >   plot_dc_for_multiple_kapas(Alpha, 4, c(5, 4), coloridx=c(24, 32))   # now the variable itself is passed to the function
> >   dev.off()
> >
> > I could save time if I could write a function that produces the right number of plots for every variable. The problem, as you can see from the comments above, is that the variable name has to be used as a string in the first line and as an actual variable in the second.
> >
> > How can I write a loop in R that, for a list of variables (the ten I mentioned at the beginning), treats each entry once as a string and once as a real variable?
>
> Something like this:
>
>   varlist <- LETTERS[1:10]
>   varlist
>   for( k in 1:length(varlist) ) assign(varlist[k], runif(10))
>   varlist
>   myplot <- function(x,k) plot(x,col=k)
>   for( k in 1:length(varlist) ) {
>       varname <- varlist[k]
>       filename <- paste("DC_",varname,"_All.pdf", sep="")
>       pdf(file=filename, width=15)
>       myplot(get(varlist[k]),k)
>       dev.off()
>   }
>
> Berend
Re: [R] round giving different results on Windows and Mac
Petr,

Many thanks for this detailed explanation. It seems that the printing is going to vary because it is not done by R. I will try alternative numbers of significant digits: I had set options(digits=4) in an attempt to avoid inter-platform printing differences, without really understanding what was causing them.

Ruth

-------- Original Message --------
Subject: Re: [R] round giving different results on Windows and Mac
Date: Sat, 10 Mar 2012 10:08:21 +0100
From: Petr Savicky savi...@cs.cas.cz
To: r-help@r-project.org

> On Fri, Mar 09, 2012 at 09:34:14PM +0000, Ruth Ripley wrote:
> > Dear all,
> >
> > I have been running some tests of my package RSiena on different platforms and trying to reconcile the results. With Mac, the commands
> >
> >   options(digits=4)
> >   round(1.81652, digits=4)
> >
> > print 1.817. With Windows, the same commands print 1.816.
> >
> > I am not bothered which answer I get, but it would be nice if they were the same. A Linux box agreed with the Mac.
>
> Hi.
>
> I obtain the same difference between Linux (1.817) and 32-bit Windows (1.816). As Duncan said, the number 1.8165 is not exactly representable, and printing it to 4 significant digits may depend on the platform, since it is a middle case.
>
> Note that options(digits=4) means rounding to 4 significant digits, while round(1.81652, digits=4) rounds to 4 digits in the fractional part. Try signif(1.81652, digits=4) to get the same type of rounding as in options(digits=4).
>
> The problem is not in round(), since
>
>   x <- round(1.81652, digits=4)
>   print(x, digits=20)
>   print(x, digits=4)
>
> yields on Linux
>
>   [1] 1.8165000000000000036
>   [1] 1.817
>
> and on 32-bit Windows
>
>   [1] 1.8165000000000000036
>   [1] 1.816
>
> The difference is not due to R, since R is responsible only for the choice of the number of printed digits and not for the digits themselves. The digits are computed by sprintf() on the given platform, so the difference seems to be there. The command
>
>   sprintf("%5.3f", 18165/10000)
>
> yields on Linux
>
>   [1] "1.817"
>
> and on 32-bit Windows
>
>   [1] "1.816"
>
> Thank you for the example.
>
> Petr Savicky.

-- 
Ruth M. Ripley, Email: r...@stats.ox.ac.uk
Dept. of Statistics, http://www.stats.ox.ac.uk/~ruth/
University of Oxford, Tel: 01865 282857
1 South Parks Road, Oxford OX1 3TG, UK Fax: 01865 272595
Re: [R] round giving different results on Windows and Mac
Duncan,

Thanks for your reply: given Petr's response, it seems the problem is the interpretation by the printing code, not the actual representation of the number. Given the representation, 1.817 would be correct, unless at the working accuracy it is considered to be equal to 1.8165 (as indeed I asked it to be).

I will have to find a workaround, or live with two sets of test results.

Ruth

On 10/03/2012 00:48, Duncan Murdoch wrote:
> On 12-03-09 4:34 PM, Ruth Ripley wrote:
> > Dear all,
> >
> > I have been running some tests of my package RSiena on different platforms and trying to reconcile the results. With Mac, the commands
> >
> >   options(digits=4)
> >   round(1.81652, digits=4)
> >
> > print 1.817
>
> The value you're printing is 1.8165, so I believe Windows gets it right using our round-to-even rule, but I'm not surprised that there are differences. The value 1.8165 isn't exactly representable, so it's somewhat random whether a system chooses to represent it slightly larger or slightly smaller.
>
> Duncan Murdoch
>
> > With Windows, the same commands print 1.816
> >
> > I am not bothered which answer I get, but it would be nice if they were the same. A Linux box agreed with the Mac.
> >
> > Mac sessionInfo():
> >   R version 2.14.2 (2012-02-29)
> >   Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
> >   locale:
> >   [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
> >   attached base packages:
> >   [1] stats graphics grDevices utils datasets methods base
> >   other attached packages:
> >   [1] RSiena_1.0.12.205
> >   loaded via a namespace (and not attached):
> >   [1] grid_2.14.2 lattice_0.20-0 Matrix_1.0-4 tools_2.14.2
> >
> > Windows (but 2.14.1patched was the same) sessionInfo():
> >   R version 2.15.0 alpha (2012-03-08 r58640)
> >   Platform: i386-pc-mingw32/i386 (32-bit)
> >   locale:
> >   [1] LC_COLLATE=English_United Kingdom.1252
> >   [2] LC_CTYPE=English_United Kingdom.1252
> >   [3] LC_MONETARY=English_United Kingdom.1252
> >   [4] LC_NUMERIC=C
> >   [5] LC_TIME=English_United Kingdom.1252
> >   attached base packages:
> >   [1] stats graphics grDevices utils datasets methods base
> >
> > Any enlightenment would be gratefully received.
> >
> > Ruth

-- 
Ruth M. Ripley, Email: r...@stats.ox.ac.uk
Dept. of Statistics, http://www.stats.ox.ac.uk/~ruth/
University of Oxford, Tel: 01865 282857
1 South Parks Road, Oxford OX1 3TG, UK Fax: 01865 272595
Re: [R] round giving different results on Windows and Mac
Hello Ruth:

> Many thanks for this detailed explanation. It seems that the printing is going to vary because it is not done by R. I will try alternative numbers of significant digits: I had set options(digits=4) in an attempt to avoid inter-platform printing differences, without really understanding what was causing them.

If you round the numbers with, say, signif(x, digits=4) before printing, and print with at least 4 digits of precision, then the output should depend not on the printing function but on signif(), since in this case the printing function does not get middle cases.

The function signif() can also be platform-dependent, but I think that should be rare. Send examples of platform dependencies in signif() if you find some.

Petr.
Re: [R] index values of one matrix to another of a different size
Thanks for the info. Unfortunately it's a little bit slower after one apples-to-apples test using my big data. Mine: 0.28 seconds. Yours: 0.73 seconds. Not a big deal, but significant when I have to do this 300 to 500 times.

regards,

ben

On Fri, Mar 9, 2012 at 1:23 PM, Rui Barradas rui1...@sapo.pt wrote:

Hello, I don't know if it's the fastest but it's more natural to have an index matrix with two columns only, one for each coordinate. And it's fast.

    fun <- function(valdata, inxdata){
        nr <- nrow(inxdata)
        nc <- ncol(inxdata)
        # two-column (row, column) index matrix, filled one block per column
        mat <- matrix(NA, nrow=nr*nc, ncol=2)
        i1 <- 1
        i2 <- nr
        for(j in 1:nc){
            mat[i1:i2, 1] <- inxdata[, j]   # row indices taken from column j
            mat[i1:i2, 2] <- rep(j, nr)     # column index j, repeated
            i1 <- i1 + nr
            i2 <- i2 + nr
        }
        matrix(valdata[mat], ncol=nc)
    }
    fun(vals, indx)

Rui Barradas
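For comparison, the two-column index matrix described above can also be built without a loop. This is a minimal sketch, not code posted by either author: fun_vec is a hypothetical name, and the small vals and indx matrices are the sample data used elsewhere in this thread (without the dimnames).

```r
# Sample data as used elsewhere in the thread (dimnames omitted).
vals <- matrix(LETTERS[1:9], nrow = 3, ncol = 3)
indx <- matrix(c(1, 1, 3, 3, 2, 2, 2, 3, 1, 2, 2, 1), nrow = 4, ncol = 3)

# fun_vec is a hypothetical name. Column j of inxdata holds row numbers
# to look up in column j of valdata.
fun_vec <- function(valdata, inxdata) {
  # First column: the row indices, unrolled column by column.
  # Second column: the matching column number for each of them.
  rc <- cbind(c(inxdata),
              rep(seq_len(ncol(inxdata)), each = nrow(inxdata)))
  matrix(valdata[rc], nrow = nrow(inxdata), ncol = ncol(inxdata))
}

res <- fun_vec(vals, indx)
res
```

Indexing a matrix with a two-column numeric matrix picks one element per row of the index, which is what the loop version assembles block by block.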
[R] Paste ignore arrayys
Dear all, I am using paste to create a file name:

    filename <- paste("GPS_", TimeStamps, sep="")

where TimeStamps is a character vector of two elements. The problem is that paste, instead of making one string, makes two, one for each entry of the TimeStamps vector. Would it be possible to join the TimeStamps entries with "_" and convert them to a single string? That would then create GPS_18:00_19:00, which is what I want.

I would like to thank you in advance for your help.

B.R
Alex
Re: [R] Paste ignore arrayys
Read ?paste and use something like

    paste("GPS_", paste(TimeStamps, collapse = "_"), sep = "")

Michael
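A small runnable illustration of the difference between sep and collapse, assuming the two time stamps are "18:00" and "19:00" (inferred from the desired file name in the question):

```r
# Assumed contents of TimeStamps, inferred from the desired result.
TimeStamps <- c("18:00", "19:00")

# sep joins the arguments element by element, so a length-2 vector
# gives two strings.
two_strings <- paste("GPS_", TimeStamps, sep = "")

# collapse first joins the elements of one vector into a single string;
# the prefix is then added once.
filename <- paste("GPS_", paste(TimeStamps, collapse = "_"), sep = "")
filename
```

Note that a colon is not a valid character in file names on every operating system, so the time stamps may need reformatting before the string is used as a path.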
[R] max.print
Dear all. I wanted to read a 20,000 row x 60 column matrix (called table) into R. I did this:

    table <- read.table("table", header=TRUE)
    table

It prints out the start of my table (~1 rows by 7 columns) and then this error:

    [ reached getOption("max.print") -- omitted 5465 rows ]]
    There were 50 or more warnings (use warnings() to see the first 50)

I have tried options(max.print = Inf) and options(max.print = 9), but I still get the same error. I have seen many people on R-help have this problem. However, the solution of options(max.print = Inf) does not seem to work for me. Any ideas?

Syb
[R] Use different panel functions with lattice
Hi, I have a data.frame df with

    names(df) = c("Var1", "Var2", "Var3", "Var4")

and I plot the data with

    xyplot(Var1+Var2~Var3|Var4, data=df)

I want to use different panel functions for Var1 and Var2. How can I do this? Something like:

    panel.mypanel = function(x, y, ...) {
        if (Var1) panel.Var1Panel(x, y, ...)
        else panel.Var2Panel(x, y, ...)
    }
    xyplot(Var1+Var2~Var3|Var4, data=df, panel=panel.mypanel)

(I have searched with Google, but I found nothing.)

Thanks
Re: [R] max.print
Hey, I have a similar-size dataset and ran into the same problem, but I found this command works fine to fix it:

    options(max.print=100)
[R] Help on subgraphs in xyplot of lattice library
Dear All, I would like to ask a question on how to do overlay plots in each subgraph of xyplot.

1. I did simulations for m=1000, 2500, 5000, 1, as the sample sizes.
2. For each sample size value m, 4 graphs are generated; each graph contains overlaid comparisons between 4 methods.
3. Now I want to put them into a 4-by-4 plot by xyplot, i.e., 4 sample size values, each of which has 4 plots.

I know how to do this using plot, but the spaces between subplots are big. I do not know how to make each subplot in xyplot an overlaid one as it would appear using plot. Any help would be appreciated!

Thank you,
Chee
Re: [R] Help on subgraphs in xyplot of lattice library
What does your data look like? dput() is your friend. Also, it'd be helpful if you could give base graphics code for more-or-less what you are looking for (since you can do so already), as it's pretty hard to describe graphics without pictures. Running example(xyplot) might help you get started as well.

Michael
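Pending real data, one common way to get overlaid curves in every panel is the groups argument of xyplot. The sketch below is only a guess at the layout being asked for: the data frame sim, its column names, the simulated values, and the use of three sample sizes are all assumptions made for illustration.

```r
library(lattice)

set.seed(1)
# Made-up long-format data: one row per point, with the method, the plot
# within a sample size, and the sample size m as separate columns.
sim <- expand.grid(x = seq(0, 1, length.out = 25),
                   method = paste("method", 1:4),
                   plot_id = paste("plot", 1:4),
                   m = c(1000, 2500, 5000))
sim$y <- sim$x * as.integer(sim$method) +
  rnorm(nrow(sim), sd = 10 / sqrt(sim$m))

# groups = method overlays the four methods inside each panel; the two
# conditioning variables lay the panels out in a grid, and 'between'
# controls the space between panels.
p <- xyplot(y ~ x | plot_id + factor(m), data = sim,
            groups = method, type = "l",
            auto.key = list(columns = 4, lines = TRUE, points = FALSE),
            between = list(x = 0.3, y = 0.3))
print(p)
```

Keeping the data in long format, with one column per conditioning or grouping variable, is what lets a single xyplot call replace the grid of base plot calls.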
Re: [R] index values of one matrix to another of a different size
Hi Ben,

It seems likely that there are bigger bottlenecks in your overall program/use; have you tried Rprof() to find where things really get slowed down? In any case, f2() below takes about 70% of the time of your function on your test data, and 55-65% of the time for a bigger example I constructed. Rui's function benefits substantially from byte compiling, but is still slower. As a side benefit, f2() seems to use less memory than your current implementation.

Cheers,

Josh

%%

    ## sample data ##
    vals <- matrix(LETTERS[1:9], nrow = 3, ncol = 3,
                   dimnames = list(c('row1','row2','row3'), c('col1','col2','col3')))
    indx <- matrix(c(1,1,3,3,2,2,2,3,1,2,2,1), nrow=4, ncol=3)
    storage.mode(indx) <- "integer"

    ## turn the row numbers in i into vector positions of x:
    ## column j of i gets the offset (j - 1) * nrow(x)
    f <- function(x, i, di = dim(i), dx = dim(x)) {
      out <- x[c(i + matrix(0:(di[2L] - 1L) * dx[1L], nrow = di[1L], ncol = di[2L], TRUE))]
      dim(out) <- di
      return(out)
    }

    ## Rui's loop-based version
    fun <- function(valdata, inxdata){
      nr <- nrow(inxdata)
      nc <- ncol(inxdata)
      mat <- matrix(NA, nrow=nr*nc, ncol=2)
      i1 <- 1
      i2 <- nr
      for(j in 1:nc){
        mat[i1:i2, 1] <- inxdata[, j]
        mat[i1:i2, 2] <- rep(j, nr)
        i1 <- i1 + nr
        i2 <- i2 + nr
      }
      matrix(valdata[mat], ncol=nc)
    }

    ## byte-compiled versions of both
    require(compiler)
    f2 <- cmpfun(f)
    fun2 <- cmpfun(fun)

    system.time(for (i in 1:1) f(vals, indx))
    system.time(for (i in 1:1) f2(vals, indx))
    system.time(for (i in 1:1) fun(vals, indx))
    system.time(for (i in 1:1) fun2(vals, indx))
    system.time(for (i in 1:1) matrix(vals[cbind(c(indx),rep(1:ncol(indx),each=nrow(indx)))],nrow=nrow(indx),ncol=ncol(indx)))

    ## now let's make a bigger test set
    set.seed(1)
    vals2 <- matrix(sample(LETTERS, 10^7, TRUE), nrow = 10^4)
    indx2 <- sapply(1:ncol(vals2), FUN = function(x) sample(10^4, 10^3, TRUE))
    dim(vals2)
    dim(indx2)

    ## the best contenders from round 1
    gold <- matrix(vals2[cbind(c(indx2),rep(1:ncol(indx2),each=nrow(indx2)))],nrow=nrow(indx2),ncol=ncol(indx2))
    test1 <- f2(vals2, indx2)
    all.equal(gold, test1)

    system.time(for (i in 1:20) f2(vals2, indx2))
    system.time(for (i in 1:20) matrix(vals2[cbind(c(indx2),rep(1:ncol(indx2),each=nrow(indx2)))],nrow=nrow(indx2),ncol=ncol(indx2)))

%%

-- Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/
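The arithmetic behind the position-based lookup in f() can be checked on a small non-square example. This is a sketch with made-up data and hypothetical variable names; it relies only on the fact that R stores matrices column by column, so element (r, j) of x sits at vector position r + (j - 1) * nrow(x).

```r
set.seed(1)
# Made-up data: a 5 x 3 value matrix and a 4 x 3 matrix of row numbers.
x <- matrix(sample(LETTERS, 5 * 3, TRUE), nrow = 5, ncol = 3)
i <- matrix(sample(nrow(x), 4 * 3, TRUE), nrow = 4, ncol = 3)

# Offset for every entry of i: (column number - 1) * nrow(x).
offsets <- rep((seq_len(ncol(i)) - 1L) * nrow(x), each = nrow(i))
pos <- c(i) + offsets

# Look up by vector position, then restore the shape of i.
by_position <- x[pos]
dim(by_position) <- dim(i)

# Reference result using explicit (row, column) pairs.
by_rowcol <- matrix(x[cbind(c(i), rep(seq_len(ncol(i)), each = nrow(i)))],
                    nrow = nrow(i))

identical(by_position, by_rowcol)
```

The position form skips building the two-column index matrix, which is one plausible reason it is faster on large inputs.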
Re: [R] How do I do a pretty scatter plot using ggplot2?
Thanks Josh! How do I make it the 50% quantile in each bin instead of the mean? Thanks a lot!

On Fri, Mar 9, 2012 at 9:11 PM, Joshua Wiley jwiley.ps...@gmail.com wrote:

Hmm, "smooth the chart" makes me think you are trying to find the trends:

    require(ggplot2)
    ggplot(mtcars, aes(mpg, hp)) + geom_point() + stat_smooth()

Try it out and see what you think. It adds a locally smoothed line that does something like trace the means (that is a big oversimplification, but it is the gist of it).

Cheers,

Josh

On Fri, Mar 9, 2012 at 7:00 PM, Michael comtech@gmail.com wrote:

The origin of this problem was that a plain scatter plot with too many points with high dispersion generated too many points flying all over the place. We are trying to smooth the charts a bit... Any good recommendations? Thanks a lot!

On Fri, Mar 9, 2012 at 8:59 PM, Michael comtech@gmail.com wrote:

Sorry for the confusion, Michael. I myself am trying to figure out what my boss is requesting: I am certain that I need to plot the quantiles of each bin. ... But how are the quantiles plotted? Shall I specify the 50% quantile, etc.? Being a diligent guy, I am trying hard to do some homework and figure it out myself... I thought there was a standard statistical procedure that everybody knows... Any more thoughts? Thanks a lot!

On Fri, Mar 9, 2012 at 8:51 PM, R. Michael Weylandt michael.weyla...@gmail.com wrote:

On Fri, Mar 9, 2012 at 9:28 PM, Michael comtech@gmail.com wrote:

> Thanks a lot Mike!

Michael, if you don't mind. (Though admittedly it leads to some degree of confusion in a conversation like this.)

> Could you please explain your code a bit?

Which part?

> My imagination is that for each bin, I am plotting a line which is the quantile of the y-values in that bin?

Oh, so you want a qqnorm()-esque line? How is that like a scatterplot? yes, that's something else entirely (and not clear from your first post -- to my ear the quantile is a statistic tied to the [e]cdf). This is actually much easier in ggplot (and certainly doable in base as well). Try this:

    DAT <- data.frame(x = runif(1000, 0, 20), y = rnorm(1000)) # Not so volatile this time
    DAT$xbin <- with(DAT, cut(x, seq(0, 20, 5)))
    library(ggplot2)
    p <- ggplot(DAT) + facet_wrap( ~ xbin) + stat_qq(aes(sample = y))
    print(p)

If this isn't what you want, please spend some time to show an example of the sort of graph you desire (it can be a bit of code or a link to a picture or even a hand sketch hosted somewhere online).

Out on a limb, I think you might really be thinking of something more like this:

    p <- ggplot(DAT) + facet_wrap( ~ xbin) + geom_step(aes(x = seq_along(y), y = sort(y)))

and see this for more: http://had.co.nz/ggplot2/geom_step.html

Michael Weylandt

> I ran your program but couldn't figure out the meaning of the dots in your plot? Thanks again!

On Fri, Mar 9, 2012 at 7:07 PM, R. Michael Weylandt michael.weyla...@gmail.com wrote:

That doesn't really seem to make sense to me as a graphical representation (transforming adjacent y values differently), but if you really want to do so, here's what I'd do if I understand your goal (the preprocessing is independent of the graphics engine):

    DAT <- data.frame(x = runif(1000, 0, 20), y = rcauchy(1000)^2) # Nice and volatile!
    # split y based on some x binning and assign empirical quantiles of each group
    DAT$yquant <- with(DAT, ave(y, cut(x, seq(0, 20, 5)), FUN = function(x) ecdf(x)(x)))
    # BASE
    plot(yquant ~ x, data = DAT)
    # ggplot2
    library(ggplot2)
    p <- ggplot(DAT, aes(x = x, y = yquant)) + geom_point()
    print(p)

Michael Weylandt

PS -- I see Josh Wiley just responded pointing out your requirements #1 and #2 are incompatible: I've used 1 here.

On Fri, Mar 9, 2012 at 7:37 PM, Michael comtech@gmail.com wrote:

Hi all, I am trying hard to do the following and have already spent a few hours in vain: I wanted to do the scatter plot. But given the high dispersion of those dots, I would like to bin the x-axis and then, for each bin of the x-axis, plot the quantiles of the y-values of the data points in each bin:

1. Uniform bin size on the x-axis;
2. Equal number of observations in each bin.

How to do that in R? I guess for the sake of prettiness, I'd better do it in ggplot2? Thank you!
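One way to read the question at the top of this message (the 50% quantile in each bin) is sketched below. It is an assumption about what is wanted, not a method given in the thread: the data are simulated, the names bins, x_med, and y_q50 are hypothetical, and the bins are the uniform-width ones from requirement 1.

```r
set.seed(42)
# Simulated, volatile data in the style of the examples in this thread.
DAT <- data.frame(x = runif(1000, 0, 20), y = rcauchy(1000)^2)

# Uniform-width bins on x.
DAT$xbin <- cut(DAT$x, breaks = seq(0, 20, 5), include.lowest = TRUE)

# One row per bin: the median x position and the 50% quantile of y.
bins <- data.frame(
  xbin  = levels(DAT$xbin),
  x_med = as.vector(tapply(DAT$x, DAT$xbin, median)),
  y_q50 = as.vector(tapply(DAT$y, DAT$xbin, quantile, probs = 0.5))
)
bins

# Overlay the per-bin quantile on the raw scatter, if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(DAT, ggplot2::aes(x = x, y = y)) +
    ggplot2::geom_point(alpha = 0.2) +
    ggplot2::geom_line(data = bins, ggplot2::aes(x = x_med, y = y_q50),
                       colour = "red") +
    ggplot2::scale_y_log10()
  print(p)
}
```

Changing probs (or calling tapply once per probability) gives other quantiles; equal-count bins would instead use quantiles of x as the breaks.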
Re: [R] How do I do a pretty scatter plot using ggplot2?
Thanks a lot! Could you please elaborate on this one?

> What I'd really do, if you had lots of data, would be to bin x into small contiguous bins and to calculate quantiles for each of those bins and to plot smoothers across the quantiles (using bin medians as the x axis)

On Fri, Mar 9, 2012 at 9:21 PM, R. Michael Weylandt michael.weyla...@gmail.com wrote:

Could you just add a log scale to the y dimension?

    DAT <- data.frame(x = runif(1000, 0, 20), y = rnorm(1000))
    plot(y ~ x, data = DAT, log = "y")

That lessens large dispersion (in some circumstances), but I'm not really sure what that has to do with smoothing. Do you mean smoothing in the technical sense (loess, splines, and friends) or in some graphical sense? I'm still not sure what this has to do with quantile plots: they are usually diagnostic tools for examining distributional shape/fit.

Here are two (related) ideas:

i) If you have categorical x data, boxplots: http://had.co.nz/ggplot2/geom_boxplot.html
ii) If you have continuous x data, quantile envelopes: http://had.co.nz/ggplot2/stat_quantile.html

    # In ggplot2
    DAT <- data.frame(x = runif(1000, 0, 20), y = rnorm(1000))
    DAT$xbin <- with(DAT, cut(x, seq(0, 20, 2)))
    p <- ggplot(DAT, aes(x = x, y = y)) + geom_point(alpha = 0.2) +
        stat_quantile(aes(colour = ..quantile..), quantiles = seq(0.05, 0.95, by=0.05)) +
        facet_wrap(~ xbin, scales = "free")
    print(p)

What I'd really do, if you had lots of data, would be to bin x into small contiguous bins and to calculate quantiles for each of those bins and to plot smoothers across the quantiles (using bin medians as the x axis) -- I'm sure that's doable in ggplot2 as well.

Michael
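The idea being asked about (small contiguous bins, quantiles per bin, smoothers across the quantiles with bin medians on the x axis) could look roughly like this in base R. This is a sketch under assumptions, not the original author's code: the data are simulated, the number of bins, the three probabilities, and the lowess span are arbitrary choices, and equal-count bins are used.

```r
set.seed(1)
# Simulated data with a trend and a spread that grows with x.
DAT <- data.frame(x = runif(5000, 0, 20))
DAT$y <- sin(DAT$x / 3) + rnorm(5000, sd = 1 + DAT$x / 10)

# Small contiguous bins with (roughly) equal numbers of observations.
n_bins <- 40
breaks <- quantile(DAT$x, probs = seq(0, 1, length.out = n_bins + 1))
DAT$xbin <- cut(DAT$x, breaks = breaks, include.lowest = TRUE)

# Bin medians for the x axis, and several quantiles of y per bin
# (one row per bin, one column per probability).
probs <- c(0.25, 0.5, 0.75)
x_med <- as.vector(tapply(DAT$x, DAT$xbin, median))
y_q <- t(sapply(split(DAT$y, DAT$xbin), quantile, probs = probs))

# Raw points in the background, one smoothed curve per quantile on top.
plot(y ~ x, data = DAT, col = "grey80", pch = 16, cex = 0.4)
for (k in seq_along(probs)) {
  lines(lowess(x_med, y_q[, k], f = 0.5), lwd = 2, col = k + 1)
}
legend("topleft", legend = paste0(100 * probs, "%"),
       col = seq_along(probs) + 1, lwd = 2, bty = "n")
```

Smoothing the per-bin quantiles rather than the raw points is what keeps the curves stable when the scatter is very dispersed.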
[R] LME4 output
I hope you all don't mind this question, but I need help interpreting the output of a linear mixed effects model I've been trying to learn to fit in R. I am new to longitudinal data analysis and linear mixed effects regression. I have a model I fitted with weeks as the time predictor, and score on an employment course as my y. I modeled score with weeks (time) and several fixed effects, sex and race. My model includes random effects. I need help understanding what the variance means. The output is the following:

    Random effects
     Group     Name       Variance
     EmpId     intercept  980.236
               weeks       13.562
     Residual              23.256

I really appreciate the help. Thanks.

Zeda
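As a starting point for reading the table above, the variances can be put on the scale of the outcome by taking square roots. The sketch below uses only the three numbers quoted in the post; it assumes a model with a random intercept and a random weeks slope per EmpId, something like score ~ weeks + sex + race + (weeks | EmpId), which is a guess because the model call is not shown.

```r
# Variance components copied from the posted output.
vc <- c(EmpId_intercept = 980.236, EmpId_weeks = 13.562, Residual = 23.256)

# Standard deviations, in the units of the score.
sds <- sqrt(vc)
round(sds, 2)

# Share of the variance at week 0 that is due to stable differences
# between employees. The intercept-slope covariance is not shown in the
# post and does not enter at week 0.
icc_week0 <- vc[["EmpId_intercept"]] /
  (vc[["EmpId_intercept"]] + vc[["Residual"]])
round(icc_week0, 3)
```

Under that assumed model, the intercept variance describes how much employees differ in their score at week 0, the weeks variance describes how much their weekly rates of change differ, and the residual variance is the scatter of individual observations around each employee's own line.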
Re: [R] max.print
On 2012-03-10 08:35, sybil kennelly wrote:

> I have tried options(max.print = Inf) and options(max.print = 9) but I still get the same error. I have seen many people on R-help have this problem. However, the solution of options(max.print = Inf) does not seem to work for me. Any ideas?

Well, I don't know why you would want to do this to your eyeballs, but View() would seem to be your friend, and this is probably somewhere in the archives. You can't set max.print to anything that can't be coerced to integer (see ?integer), and I think that setting it to Inf is no longer legal (if ever it was). [Perhaps options() should generate a warning.]

Peter Ehlers
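A minimal sketch of the alternatives, using a simulated data frame of the size described in the question (the object name big and the value chosen for max.print are assumptions). max.print limits the approximate number of entries printed, not the number of rows, so 20,000 x 60 = 1,200,000 entries needs a limit above that.

```r
# Simulated stand-in for the 20,000 x 60 table from the question.
big <- as.data.frame(matrix(rnorm(20000 * 60), nrow = 20000, ncol = 60))

# A finite integer value is accepted; keep the old setting to restore it.
old <- options(max.print = 2000000L)

# Usually more useful than printing everything: check the size and look
# at a small corner of the data.
dim(big)
shown <- head(big[, 1:7], 5)
print(shown)
str(big[, 1:3])

# In an interactive session, View(big) opens a scrollable viewer instead.

options(old)
```

Since the data frame in the question is itself called table, it may also be worth picking a name that does not clash with the base function table().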
Re: [R] Use different panel functions with lattice
On Sat, Mar 10, 2012 at 9:33 AM, Balaitous balait...@mailoo.org wrote:

> Hi, I have a data.frame df with names(df) = c("Var1", "Var2", "Var3", "Var4") and I plot data with xyplot(Var1+Var2~Var3|Var4, data=df). I want to use different panel functions for Var1 and Var2. How can I do this?

You didn't specify which different panel functions you want. Is something like this what you're looking for?

    xyplot(Var1+Var2~Var3|Var4, data=df,
           panel=panel.superpose,
           panel.groups=function(x, y, group.number, ...){
               # group 1 is Var1, group 2 is Var2
               if (group.number == 1) panel.xyplot(x, y, ...)
               else panel.lines(x, y, lwd=2, col=1)
           })

> Something like: panel.mypanel = function(x, y, ...) { if (Var1) panel.Var1Panel(x, y, ...) else panel.Var2Panel(x, y, ...) } xyplot(Var1+Var2~Var3|Var4, data=df, panel=panel.mypanel) (I have searched with Google, but I found nothing.) Thanks
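To try the panel.groups approach without the original data, here is a self-contained sketch. The data frame is made up; only the column names Var1 to Var4 come from the question. With a formula of the form Var1 + Var2 ~ ..., lattice treats the two responses as groups, and panel.superpose calls panel.groups once per group with that group's x and y values and its number.

```r
library(lattice)

set.seed(1)
# Made-up data with the column names from the question.
df <- data.frame(Var3 = rep(1:10, 2),
                 Var4 = rep(c("a", "b"), each = 10))
df$Var1 <- df$Var3 + rnorm(20)  # noisy values, drawn as points
df$Var2 <- 2 * df$Var3          # smooth values, drawn as a line

p <- xyplot(Var1 + Var2 ~ Var3 | Var4, data = df,
            panel = panel.superpose,
            panel.groups = function(x, y, group.number, ...) {
              if (group.number == 1) {
                panel.xyplot(x, y, ...)               # Var1: default points
              } else {
                panel.lines(x, y, lwd = 2, col = 1)   # Var2: black line
              }
            })
print(p)
```

Any two panel functions can be substituted in the two branches, as long as each accepts x and y.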
Re: [R] index values of one matrix to another of a different size
Very interesting. You are doing some stuff here that I have never seen. Thank you. I will test it on my real data on Monday and let you know what I find. That cmpfun function looks very useful!

Thanks,

Ben
Re: [R] index values of one matrix to another of a different size
On Sat, Mar 10, 2012 at 12:11 PM, Ben quant ccqu...@gmail.com wrote: Very interesting. You are doing some stuff here that I have never seen. and that I would not typically do or recommend (e.g., fussing with storage mode or manually setting the dimensions of an object), but that can be faster by sacrificing higher level functions flexibility for lower level, more direct control. Thank you. I will test it on my real data on Monday and let you know what I find. That cmpfun function looks very useful! It can reduce the overhead of repeated function calls. I find the biggest speedups when it is used with some sort of loop. Then again, many loops can be avoided entirely, which often yields even larger performance gains. Thanks, You're welcome. You might also look at the data table package by Matthew Dowle. It does some *very* fast indexing and subsetting and if those operations are serious slow down for you, you would likely benefit substantially from using it. One final comment, since you are creating the matrix of indices; if you can create it in such a way that it already has the vector position rather than row/column form, you could eliminate the need for my f2() function altogether as you could use it to directly index your data, and then just add dimensions back afterward. Cheers, Josh Ben On Sat, Mar 10, 2012 at 10:26 AM, Joshua Wiley jwiley.ps...@gmail.com wrote: Hi Ben, It seems likely that there are bigger bottle necks in your overall program/use---have you tried Rprof() to find where things really get slowed down? In any case, f2() below takes about 70% of the time as your function in your test data, and 55-65% of the time for a bigger example I constructed. Rui's function benefits substantially from byte compiling, but is still slower. As a side benefit, f2() seems to use less memory than your current implementation. 
Cheers,
Josh

%%
## sample data ##
vals <- matrix(LETTERS[1:9], nrow = 3, ncol = 3,
               dimnames = list(c('row1','row2','row3'), c('col1','col2','col3')))
indx <- matrix(c(1,1,3,3,2,2,2,3,1,2,2,1), nrow=4, ncol=3)
storage.mode(indx) <- "integer"

## convert row indices to vector positions by adding a per-column
## offset of (column - 1) * nrow(x), then index x as a plain vector
f <- function(x, i, di = dim(i), dx = dim(x)) {
  out <- x[c(i + matrix(0:(dx[2L] - 1L) * dx[1L],
                        nrow = di[1L], ncol = di[2L], TRUE))]
  dim(out) <- di
  return(out)
}

## Rui's version: build a two-column (row, column) index matrix
fun <- function(valdata, inxdata){
  nr <- nrow(inxdata)
  nc <- ncol(inxdata)
  mat <- matrix(NA, nrow=nr*nc, ncol=2)
  i1 <- 1
  i2 <- nr
  for(j in 1:nc){
    mat[i1:i2, 1] <- inxdata[, j]
    mat[i1:i2, 2] <- rep(j, nr)
    i1 <- i1 + nr
    i2 <- i2 + nr
  }
  matrix(valdata[mat], ncol=nc)
}

require(compiler)
f2 <- cmpfun(f)
fun2 <- cmpfun(fun)

system.time(for (i in 1:1) f(vals, indx))
system.time(for (i in 1:1) f2(vals, indx))
system.time(for (i in 1:1) fun(vals, indx))
system.time(for (i in 1:1) fun2(vals, indx))
system.time(for (i in 1:1)
  matrix(vals[cbind(c(indx), rep(1:ncol(indx), each=nrow(indx)))],
         nrow=nrow(indx), ncol=ncol(indx)))

## now let's make a bigger test set
set.seed(1)
vals2 <- matrix(sample(LETTERS, 10^7, TRUE), nrow = 10^4)
indx2 <- sapply(1:ncol(vals2), FUN = function(x) sample(10^4, 10^3, TRUE))
dim(vals2)
dim(indx2)

## the best contenders from round 1
gold <- matrix(vals2[cbind(c(indx2), rep(1:ncol(indx2), each=nrow(indx2)))],
               nrow=nrow(indx2), ncol=ncol(indx2))
test1 <- f2(vals2, indx2)
all.equal(gold, test1)

system.time(for (i in 1:20) f2(vals2, indx2))
system.time(for (i in 1:20)
  matrix(vals2[cbind(c(indx2), rep(1:ncol(indx2), each=nrow(indx2)))],
         nrow=nrow(indx2), ncol=ncol(indx2)))
%%

On Sat, Mar 10, 2012 at 7:48 AM, Ben quant ccqu...@gmail.com wrote:

Thanks for the info. Unfortunately it's a little bit slower in one apples-to-apples test using my big data. Mine: 0.28 seconds. Yours: 0.73 seconds. Not a big deal, but significant when I have to do this 300 to 500 times.
regards,
ben

On Fri, Mar 9, 2012 at 1:23 PM, Rui Barradas rui1...@sapo.pt wrote:

Hello,

I don't know if it's the fastest, but it's more natural to have an index matrix with two columns only, one for each coordinate. And it's fast.

fun <- function(valdata, inxdata){
  nr <- nrow(inxdata)
  nc <- ncol(inxdata)
  mat <- matrix(NA, nrow=nr*nc, ncol=2)
  i1 <- 1
  i2 <- nr
  for(j in 1:nc){
    mat[i1:i2, 1] <- inxdata[, j]
    mat[i1:i2, 2] <- rep(j, nr)
    i1 <- i1 + nr
    i2 <- i2 + nr
  }
  matrix(valdata[mat], ncol=nc)
}

fun(vals, indx)

Rui Barradas
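The "vector position" idea mentioned in this thread can be sketched as follows. This is only an illustration on the toy data from the thread, not code posted by anyone on the list: each row index is turned into a linear position by adding (column - 1) * nrow(vals), and the result is checked against the two-column (row, column) indexing.

```r
# Toy data from the thread (dimnames omitted for brevity)
vals <- matrix(LETTERS[1:9], nrow = 3, ncol = 3)
indx <- matrix(c(1L, 1L, 3L, 3L, 2L, 2L, 2L, 3L, 1L, 2L, 2L, 1L),
               nrow = 4, ncol = 3)

# Linear (vector) position of element [i, j] in a matrix is i + (j - 1) * nrow
pos <- indx + (col(indx) - 1L) * nrow(vals)

# c() drops the dimensions so that pos is always used as a plain vector index,
# even when it happens to have two columns
out <- vals[c(pos)]
dim(out) <- dim(indx)

# Reference result via a two-column (row, column) index matrix
gold <- matrix(vals[cbind(c(indx), c(col(indx)))], nrow = nrow(indx))

identical(out, gold)
```

If the index matrix can be built as pos in the first place, the conversion step disappears and only the vals[c(pos)] lookup remains.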
[R] How to fit a line through the Mountain crest, i.e., through the highest density of points - in a loess-like fashion.
Hi,

I'm trying to normalize data by fitting a line through the highest density of points (in a 2D plot). In other words, if you visualize the data as a density plot, the fit I'm trying to achieve is the line that goes through the crest of the mountain. This is similar to, yet different from, what LOESS does. I have used loess before, but it does not do exactly that, because it takes all points into account. Although points farther from the fit have a smaller weight, they pull the fit a bit off the crest.

Do you know a package, or maybe even an option in loess, that would allow me to achieve this? Any advice or idea appreciated.

Emmanuel
[R] Generating abnormal returns in R
Hello

This is my first post on this forum and I hope someone can help me out. I have a datafile (weeklyR) with returns of roughly 100 companies. I obtained it by running the following code:

library(tseries)

tickers = c("GSPC", "BP", "TOT", "ENI.MI", "VOW.BE", "CS.PA", "DAI.DE", "ALV.DE",
    "EOAN.DE", "CA.PA", "G.MI", "DE", "EXR.MI", "MUV2.BE", "UG.PA", "PRU.L",
    "VOD.L", "DPB.BE", "REP.MC", "RWE.BE", "AGN.AS", "FTE.PA", "EAD", "LGEN.L",
    "CNP.PA", "ULVR.L", "TKA.BE", "RIO.L", "NOK", "SGO.PA", "RNO.PA", "VIE.PA",
    "BAYN.DE", "SAN.PA", "DG.PA", "SSE.L", "GSK.L", "EN.PA", "LYB", "MLSNP.PA",
    "IBE.MC", "EURS.PA", "AH.AS", "VIV.PA", "TIT.MI", "VOLV-B.ST", "ABI.BR",
    "LHA.DE", "OML.L", "CNA.L", "CON.DE", "PHG", "AZN.L", "SBRY.L", "BA.L",
    "BT-A.L", "AF.PA", "430021.VI", "SL.L", "ERIC-A.ST", "CDI.PA", "AAL.L",
    "ALO.PA", "DELB.BR", "HOT.BE", "GAS.MC", "SU.PA", "OR.PA", "FNC.MI", "MRW.L",
    "MAP.MC", "ML.PA", "IMT.L", "EBK.DE", "PP.PA", "ACN", "BTI", "CRG.IR",
    "CPG.L", "BN.PA", "NG.L", "T7L.BE", "HEIA.AS", "ACS.MC", "LG.PA", "STAN.L",
    "ALU.PA", "FRE.MU", "SW.PA", "WOS.L", "AKZA.AS", "HEN.MU")

for( series in tickers ){
    print(series)
    close <- get.hist.quote(instrument=series, retclass="zoo", quote="AdjClose",
                            compression="d", start="2000-1-1", end="2011-12-31",
                            quiet=TRUE)
    if(series==tickers[1]){
        pricedata = close
    }else{
        pricedata = merge( pricedata , close )
    }
}
colnames(pricedata) = tickers

# Avoid a missing because of trade halt for that stock
pricedata = na.approx(pricedata)

weeklyR = diff(log(pricedata))
time(weeklyR) = as.Date(time(weeklyR))
print(weeklyR)
save(weeklyR , file = "weeklyR.Rdata")
write.zoo(weeklyR, file="weeklyR.csv", quote=T, sep=",", na = "NA", dec = ".",
          row.names = F, col.names = T)

Now I need to build a market model in R so I can generate abnormal returns for these stocks. As the market index I would like to use GSPC. I also need to consider abnormal returns calculated over a sixty-trading-day window. Can this be done in R? Is it difficult to write this code? Any help would be much appreciated!
Thanks,
drsenne
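For what it is worth, the usual market-model calculation can be sketched in base R. The returns below are simulated, and the window lengths (240 days for estimation, 60 days for the event window) and all object names are assumptions made for illustration; with the real data, the stock and index columns of weeklyR would take the place of the simulated vectors.

```r
set.seed(123)

# Simulated daily log returns standing in for the index and one stock
n      <- 300
market <- rnorm(n, mean = 0, sd = 0.01)
stock  <- 0.0002 + 1.2 * market + rnorm(n, mean = 0, sd = 0.005)
ret    <- data.frame(stock = stock, market = market)

est_idx <- 1:240     # estimation window (assumed length)
evt_idx <- 241:300   # sixty-trading-day window

# Market model: stock return = alpha + beta * market return + error,
# fitted on the estimation window only
fit <- lm(stock ~ market, data = ret[est_idx, ])

# Abnormal return = observed return minus the return the model predicts
expected <- predict(fit, newdata = ret[evt_idx, ])
abnormal <- ret$stock[evt_idx] - expected

# Cumulative abnormal return over the window
car <- cumsum(abnormal)

coef(fit)
tail(car, 1)
```

Looping the same steps over the columns of the returns object gives one abnormal-return series per stock.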
[R] function input as variable name (deparse/quote/paste) ??
Hi all

Say I have a function:

myname=function(dat,x=5,y=6){
    res<-x+y-dat
}

for various inputs such as

myname(dat1)
myname(dat2)
myname(dat3)
myname(dat4)
myname(dat5)

How should I modify the 'res' line so that correspondingly named, informative variables such as

dat1.res
dat2.res
dat3.res
dat4.res
dat5.res

are stored in the workspace? This is only an example of a complex function I have written.

Thanks in advance!
Casper
[R] Draw values from multiple data sets as inputs to a Monte-Carlo function; then apply across entire matrix
Hi all,

I am trying to implement a Monte-Carlo simulation for each cell in a spatial matrix (using the mc2d package). I have figured out how to conduct the simulation using data from a single location (where I manually input distribution parameters into the R code), but am having trouble (a) adjusting the code to pull input variables from my various data sets and then (b) applying the entire process across each of the cells of the matrices. I have been doing a lot of reading about loops (a big no-no?), apply, and ddply, but cannot quite figure it out. Here is the situation:

Data: I have (for simplicity) 3 spatial raster data sets (each 4848 x 4053 cells) as ASCII files:
- Poultry density (mean value in each cell)
- Poultry density (standard deviation in each cell)
- Wild bird density (single estimate in each cell)

I read them into R using read.table. The data look correct:

Pmn <- read.table("D:/Data/PoultryMeans.txt")
Psd <- read.table("D:/Data/PoultryStDev.txt")
Wde <- read.table("D:/Data/WildBirdDensity.txt")

The Model: In the Monte-Carlo simulation, Poultry and Wild birds have different distributions (normal and triangle, respectively). Below are the 2 lines of code that use the mcstoc function to draw the samples. The values 3.5, 0.108 and 47 are the ones that I would like to draw from the data tables I read in above. For example, 3.5 would be cell (i,j) in the Poultry MEAN density table; 0.108 would be cell (i,j) in the Poultry STDEV table; and 47 the single estimate for cell (i,j) of the Wild bird density table.

Poultry <- mcstoc(rnorm, type="U", 3.5, 0.108, rtrunc=TRUE, linf=0)
Wild <- mcstoc(rtriang, type="U", min=0, mode=47, max=75)
Risk <- Poultry * Wild   # this is the risk function the MC is applied to

Questions:
1) How can I edit the Poultry and Wild variables above to read the data values directly from the 3 input tables (i.e., replacing 3.5, 0.108, and 47 with some variable name for the data table and using a loop? Or somehow use apply or ddply?)
2) How can I have the entire process run for every cell in the 4848 x 4053 matrix?

Thank you for any help you can provide to get me moving forward!

Diann
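One way to think about both questions is to draw a whole grid of values per Monte-Carlo iteration instead of looping over cells. The sketch below is base R only and does not use mcstoc: the truncated normal and the triangular distribution are sampled by inverse CDF, the helper names (rtnorm_lower, rtriang_vec, mc_risk_mean) are made up for illustration, and small random 3 x 4 matrices stand in for the real rasters. Data read with read.table would first need as.matrix().

```r
set.seed(42)

# Toy 3 x 4 grids standing in for the three ASCII rasters
Pmn <- matrix(runif(12, 1, 5),      nrow = 3)  # poultry mean per cell
Psd <- matrix(runif(12, 0.05, 0.5), nrow = 3)  # poultry sd per cell
Wde <- matrix(runif(12, 10, 60),    nrow = 3)  # wild bird estimate (mode) per cell

# One draw per cell from a normal truncated below at `lower` (inverse-CDF method)
rtnorm_lower <- function(mean, sd, lower = 0) {
  p_low <- pnorm(lower, mean, sd)
  qnorm(runif(length(mean), p_low, 1), mean, sd)
}

# One draw per cell from a triangular distribution (inverse-CDF method)
rtriang_vec <- function(min, mode, max) {
  u  <- runif(length(mode))
  fc <- (mode - min) / (max - min)
  ifelse(u < fc,
         min + sqrt(u * (max - min) * (mode - min)),
         max - sqrt((1 - u) * (max - min) * (max - mode)))
}

# Mean simulated risk per cell; each iteration handles all cells at once
mc_risk_mean <- function(Pmn, Psd, Wde, nsim = 200, wmin = 0, wmax = 75) {
  total <- numeric(length(Pmn))
  for (k in seq_len(nsim)) {
    poultry <- rtnorm_lower(c(Pmn), c(Psd), lower = 0)
    wild    <- rtriang_vec(wmin, c(Wde), wmax)
    total   <- total + poultry * wild
  }
  matrix(total / nsim, nrow = nrow(Pmn), ncol = ncol(Pmn))
}

risk <- mc_risk_mean(Pmn, Psd, Wde)
risk
```

Only a running sum is kept, so memory use stays at a few copies of the grid regardless of the number of iterations. Other summaries (quantiles, for instance) would need the draws to be stored or processed in blocks.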
[R] PCA in predefined Groups??
Hi

This probably has a simple answer, but it has been eluding me nonetheless. I have been trying to build a PCA plot from scratch with the ability to plot predefined groups in different colors. I can plot the PCA, but I want it to show the predefined groups (samples) using the top 100 expressed genes. I have three groups. Can anybody help me, keeping in mind that the user is just a beginner in R?

Thanks in advance
[R] Finding the mean.
Using functions, how would I go about doing this question? (I already have a mean defined for a function of x.)

Write a function called MyMean2. This function has two arguments, x and nonzero, where nonzero has the default value TRUE. This function should return
- the (previously defined) mean of x if nonzero=FALSE
- the (previously defined) mean of x for all x's > 0 if nonzero=TRUE

Much appreciated.
elliot.we...@virgin.net
Re: [R] Help on subgraphs in xyplot of lattice library
Hi, Michael,

Thank you for your help! In its simplified form, the data frame looks like:

idx  true_value  mean  diff_mean1  diff_mean2  diff_mean3  std  diff_std1  diff_std2  diff_std3  samplesize
1                                                                                                1000
2                                                                                                1000
3                                                                                                1000
4                                                                                                1000
5                                                                                                1000
1                                                                                                5000
2                                                                                                5000
3                                                                                                5000
4                                                                                                5000
5                                                                                                5000

I would like the plot to be: row 1 has 4 subplots for samplesize 1000; row 2 has 4 subplots for samplesize 5000. In each row:
- the 1st subplot is true_value against mean;
- the 2nd is an overlay plot of idx against diff_mean1, idx against diff_mean2, idx against diff_mean3;
- the 3rd is true_value against std;
- the 4th is an overlay plot of idx against diff_std1, idx against diff_std2, idx against diff_std3.

I have looked at sample xyplot code, but still do not know how to achieve this.

Thanks again!
Chee

From: R. Michael Weylandt
Sent: Saturday, March 10, 2012 12:20 PM
To: Chee Chen
Cc: R-ORG
Subject: Re: [R] Help on subgraphs in xyplot of lattice library

What does your data look like? dput() is your friend. Also, it'd be helpful if you could give base graphics code for more-or-less what you are looking for (since you can do so already), as it's pretty hard to describe graphics without pictures. Running example(xyplot) might help you get started as well.

Michael

On Sat, Mar 10, 2012 at 12:04 PM, Chee Chen chee.c...@yahoo.com wrote:

Dear All,

I would like to ask a question on how to do overlay plots in each subgraph of xyplot.
1. I did simulations for m=1000, 2500, 5000, 1, as the sample sizes.
2. For each sample size value m, 4 graphs are generated; each graph contains overlaid comparisons between 4 methods.
3. Now I want to put them into a 4-by-4 plot with xyplot, i.e., 4 sample size values, each of which has 4 plots.

I know how to do this using plot, but the spaces between subplots are big. I do not know how to make each subplot in xyplot an overlaid one, as it would appear using plot. Any help would be appreciated!

Thank you,
Chee
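The overlay part of this can be sketched with lattice's extended formula interface, where several responses joined by + are drawn as groups within each panel. The data frame below is simulated to match the column names in the post, and only the diff_mean overlay is shown, with one panel per sample size. It is a starting point, not the full 2-by-4 layout.

```r
library(lattice)

set.seed(1)
# Simulated stand-in with the column names used in the post
df <- expand.grid(idx = 1:5, samplesize = c(1000, 5000))
df$diff_mean1 <- rnorm(nrow(df))
df$diff_mean2 <- rnorm(nrow(df))
df$diff_mean3 <- rnorm(nrow(df))

# y1 + y2 + y3 ~ x overlays the three series in each panel;
# the term after | gives one panel per sample size
p <- xyplot(diff_mean1 + diff_mean2 + diff_mean3 ~ idx | factor(samplesize),
            data = df, type = "b",
            auto.key = list(columns = 3),
            layout = c(1, 2),
            xlab = "idx", ylab = "difference")
print(p)
```

The diff_std columns could be handled with a second call of the same shape, and the resulting trellis objects arranged on one page with the split or position arguments of print().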
[R] too many open devices
I am getting "too many open devices" after 60 graphs. The archived comments on this problem were too sketchy to be helpful. Any ideas?

Thanks
Harold
Re: [R] PCA in predefined Groups??
Without taking away all the fun of trial and error, and exploration in R... I will direct you to this website, which I found invaluable when I first began to use R.

One way would be to use:

plot(Yourdata, type="n")

and then 3 text() or points() statements to plot the groups in different colors.

Good luck!

SHAFI wrote
> I have been trying to build a PCA plot from scratch with the ability to plot predefined groups in different colors. I can plot PCA but I want it to plot with predefined groups(samples) with top 100 expressed genes. I have three groups.
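A minimal sketch of that recipe, assuming an expression matrix with genes in rows and samples in columns. The data are simulated, the group labels are invented, and "top 100 expressed" is taken here to mean the 100 genes with the highest mean expression, which may not be the definition intended in the question.

```r
set.seed(1)

# Simulated expression matrix: 500 genes (rows) x 9 samples (columns)
expr <- matrix(rnorm(500 * 9, mean = 8), nrow = 500,
               dimnames = list(paste0("gene", 1:500), paste0("s", 1:9)))
group <- factor(rep(c("A", "B", "C"), each = 3))  # three predefined groups

# Keep the 100 genes with the highest mean expression (assumed definition)
top <- order(rowMeans(expr), decreasing = TRUE)[1:100]

# prcomp() expects observations in rows, so transpose to samples x genes
pca <- prcomp(t(expr[top, ]), scale. = TRUE)

# Empty plot first, then add the points coloured by group
group_cols <- c("red", "blue", "darkgreen")
cols <- group_cols[as.integer(group)]
plot(pca$x[, 1:2], type = "n", xlab = "PC1", ylab = "PC2")
points(pca$x[, 1:2], col = cols, pch = 19)
legend("topright", legend = levels(group), col = group_cols, pch = 19)
```

A single points() call with a colour vector does the same job as three separate calls, one per group.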
[R] applying a function in list of indexed elements of a vector:
Hi,

I have a vector

Y1 <- c(8, 11, 7, 5, 6, 3, 6, 3, 3)

and an index

iy <- c(c(1, 2), c(1, 2), c(1, 2, 3, 4), c(2, 3, 5), c(4), c(5, 6, 7), c(7, 8, 9))

How can I produce the mean, or the sum, of the elements of Y1 specified by the index iy? I am expecting something like this for the sum:

Y2
19 19 31 24 5 15 12

I thought the lapply function might do this, but it does not work:

Y2 <- lapply(Y1[iy], sum)

Any suggestion?

TIA,
Aldi
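One possible approach, offered as a sketch: nested c() calls flatten into a single vector, so the grouping is lost before lapply ever sees it. Storing the index sets in a list keeps them separate, and sapply can then apply sum (or mean) to each subset of Y1.

```r
Y1 <- c(8, 11, 7, 5, 6, 3, 6, 3, 3)

# list() keeps each index set separate; c(c(1, 2), c(1, 2), ...) would
# collapse everything into one long vector
iy <- list(c(1, 2), c(1, 2), c(1, 2, 3, 4), c(2, 3, 5), 4, c(5, 6, 7), c(7, 8, 9))

# Apply the function to the elements of Y1 picked out by each index set
Y2_sum  <- sapply(iy, function(i) sum(Y1[i]))
Y2_mean <- sapply(iy, function(i) mean(Y1[i]))

Y2_sum   # 19 19 31 24  5 15 12
```

The sums match the expected output given in the question.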
Re: [R] too many open devices
It would help if you showed us how you were plotting. Are you calling 'dev.off()' after creating each output file? The comments on this problem were too sketchy to be helpful.

On Sat, Mar 10, 2012 at 3:21 PM, harold kincaid kincaidharold...@gmail.com wrote:
> I am getting too many open devices after 60 graphs. The archived comments on this problem were too sketchy to be helpful. Any ideas?

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
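A sketch of the pattern in question, assuming the graphs are written to files in a loop (the original plotting code was not posted, so the pdf device, file names and plot contents here are placeholders). Each device is closed with dev.off() before the next one is opened, so the number of open devices never grows.

```r
# Hypothetical output files in a temporary directory
files <- file.path(tempdir(), sprintf("plot_%02d.pdf", 1:70))

for (f in files) {
  pdf(f)                              # open a file device
  plot(rnorm(20), main = basename(f)) # draw the graph
  dev.off()                           # close it before opening the next one
}

all(file.exists(files))
```

If devices have already piled up in a session, graphics.off() closes all of them at once.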
Re: [R] Finding the mean.
if (nonzero) mean(x[x > 0]) else mean(x)

On Sat, Mar 10, 2012 at 2:47 PM, elliot.we...@virgin.net wrote:
> Write a function called MyMean2. This function has two arguments, x and nonzero, where nonzero has the default value TRUE. This function should return the (previously defined) mean of x if nonzero=FALSE, and the (previously defined) mean of x for all x's > 0 if nonzero=TRUE.

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
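Wrapped into a complete function, the one-liner above might look like the sketch below. Base R's mean() stands in for the poster's previously defined mean, which was not shown, and the filter keeps values greater than zero as the exercise states.

```r
# mean() is a stand-in for the poster's own, previously defined mean function
MyMean2 <- function(x, nonzero = TRUE) {
  if (nonzero) {
    mean(x[x > 0])  # mean of the values greater than zero only
  } else {
    mean(x)         # mean of all values
  }
}

x <- c(-1, 0, 2, 4)
MyMean2(x)                   # 3    (mean of 2 and 4)
MyMean2(x, nonzero = FALSE)  # 1.25 (mean of all four values)
```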
[R] Help with confidence intervals for gam model using mgcv
Hi,

I would be very grateful for advice on getting confidence intervals for the ordinary (non-smoothed) parameter estimates from a gam.

Motivation

I am studying hospital outcomes in a large data set. The outcomes of interest to me are all binary variables. The one in the example here, Dead30d, is death within 30 days of admission. Sexf is gender (M or F), Age is age in years at the start of the admission.

The standard glm is a logistic regression:

glmDead.AS <- glm(Dead30d~Sexf+Age, data=HIPE, family=binomial(link="logit"))

The corresponding GAM, with a smooth for age, is:

gamDead.AS <- gam(Dead30d~Sexf+s(Age), data=HIPE, family=binomial(link="logit"))

For my work, age is a nuisance. We already know exactly the effect of age (which has an odd shape). I have no interest in this parameter, nor in CIs for it. The GAM fits notably better than the GLM. The substantive interest is in the effects of the other variables, Sexf, and many more.

For the GLM, the confidence intervals are a simple matter of confint(glmDead.AS). For my discipline CIs are required, and the profile CIs that confint produces are ideal. There doesn't seem to be an analogous function for mgcv. The advice most commonly given is to use predict.gam with se.fit=TRUE. This does not seem to produce CIs for the non-smoothed parameters, which is what I need to calculate. CIs for the smooth, which are the focus of interest in many other cases, are not of interest to me.

Any suggestions? Am I missing something very obvious?

Best wishes,
Anthony Staines

--
Anthony Staines, Professor of Health Systems, School of Nursing and Human Sciences, DCU, Dublin 9, Ireland.
Tel:- +353 1 700 7807. Mobile:- +353 86 606 9713
http://astaines.eu/
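One option, sketched here on simulated data, is to build Wald-type intervals from the parametric coefficient table that summary() returns for a gam fit. These are estimate plus or minus a normal quantile times the standard error, so they are not the profile-likelihood intervals that confint() gives for a glm. The variable names follow the post, but the data and effect sizes are invented.

```r
library(mgcv)

set.seed(1)
# Simulated stand-in for the hospital data
n    <- 2000
Age  <- runif(n, 20, 90)
Sexf <- factor(sample(c("F", "M"), n, replace = TRUE))
eta  <- -3 + 0.5 * (Sexf == "M") + 0.0008 * (Age - 50)^2
Dead30d <- rbinom(n, 1, plogis(eta))

fit <- gam(Dead30d ~ Sexf + s(Age), family = binomial(link = "logit"))

# p.table holds estimates and standard errors for the parametric terms only
pt <- summary(fit)$p.table
z  <- qnorm(0.975)
ci <- cbind(estimate = pt[, "Estimate"],
            lower    = pt[, "Estimate"] - z * pt[, "Std. Error"],
            upper    = pt[, "Estimate"] + z * pt[, "Std. Error"])

ci        # log-odds scale
exp(ci)   # odds-ratio scale
```

The standard errors used here are conditional on the estimated smoothing parameters, which is worth bearing in mind when the smooth term is strongly related to the covariates of interest.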
Re: [R] function input as variable name (deparse/quote/paste) ??
On Sat, Mar 10, 2012 at 01:29:16PM -0800, casperyc wrote:
> how should I modify the 'res' line, to have new informative variable name correspondingly, such as dat1.res dat2.res dat3.res dat4.res dat5.res stored in the workspace.

Why not keep the information about the input values in a list or vector? What is gained by storing that info in the variable _name_? Your function could return a list with both the result and the input value.

While you did say that this was part of something complex, I suspect your post might be a case of "being overly specific and not stating your real goal".

--
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net
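The list-based alternative might look like the sketch below. The input values are invented, and the function is the toy one from the question, changed only so that it returns its result visibly.

```r
# Toy function from the question
myname <- function(dat, x = 5, y = 6) {
  x + y - dat
}

# Keep the inputs together in a named list instead of separate objects
inputs <- list(dat1 = 1, dat2 = 2, dat3 = 3)

# One result per input; the names carry the information that would
# otherwise have gone into the variable names
results <- lapply(inputs, myname)
names(results) <- paste(names(inputs), "res", sep = ".")

names(results)    # "dat1.res" "dat2.res" "dat3.res"
results$dat2.res  # 9
```

Everything stays in one object, which is easier to loop over, save, or pass on than a set of similarly named workspace variables.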
Re: [R] function input as variable name (deparse/quote/paste) ??
On Sun, Mar 11, 2012 at 10:29 AM, casperyc caspe...@hotmail.co.uk wrote:
> how should I modify the 'res' line, to have new informative variable name correspondingly, such as dat1.res dat2.res dat3.res dat4.res dat5.res

You *can* do it with

myname=function(dat,x=5,y=6){
    name<-paste(deparse(substitute(dat)),"res",sep=".")
    assign(name, x+y-dat, parent.frame(), inherits=TRUE)
}

but I would be very surprised if this is actually the best way to do whatever complex thing you are really doing. It's very unusual for assignments into the global workspace to be a useful R programming technique.

-thomas

--
Thomas Lumley
Professor of Biostatistics
University of Auckland
Re: [R] Use different panel functions with lattice
Inline

On Sat, Mar 10, 2012 at 1:47 PM, Balaitous balait...@mailoo.org wrote:
On Saturday, March 10, 2012 at 12:25 -0700, ilai wrote:
On Sat, Mar 10, 2012 at 9:33 AM, Balaitous balait...@mailoo.org wrote:

> Var1 and Var2 are two different observed variables (with different scales)

You might want to consider scales=list(y=list(relation='free')) in ?xyplot

> Var3 is the time
> Var4 is the point of observation
> I have also a Var5 for groups, but I just want groups for Var1.

snip

> But I don't know how to make the test if(Varx) in the function panel.mypanel, because I need
> Var1 -> panel.superpose (It's OK)
> Var2 -> panel.lines (I don't want groups for this)
> (And I will have other variables with other panel functions to use)

Since outer=T (i.e. Var1 and Var2 are in different panels), at the beginning of the panel or panel.groups function, try

if(packet.number() %in% 1:3) {
    panel.rect(x,y,groups,...)   # or whatever for panels 1:3
} else {
    panel.rect(x,y,groups,col=constant,...)   # or some other stuff for panels 4:6
}

Hope that works better.

> Something like:
>
> panel.mypanel = function(x, y, ...) {
>     if (Var1) panel.Var1Panel(x, y, ...)
>     else panel.Var2Panel(x, y, ...)
> }
> xyplot(Var1+Var2~Var3|Var4, data=df, panel=panel.mypanel)
>
> (I have searched with Google, but I found nothing)
> Thanks
[R] resume on error
Dear all,

I would like to ask how I can catch an error in R and then have it resume. For example, I have a large for loop and I know that a small number of iterations inside that loop will produce errors. How can I ask R to just ignore the error in that case and return to the loop?

I would like to thank you in advance for your help.

B.R.
Ale
[R] odd error with rJava
Hello!

I'm using R-2.14.2 on a Windows 7 64-bit machine and I did the following:

> install.packages("rJava",depen=TRUE)
--- Please select a CRAN mirror for use in this session ---
trying URL 'http://cran.sixsigmaonline.org/bin/windows/contrib/2.14/rJava_0.9-3.zip'
Content type 'application/zip' length 745867 bytes (728 Kb)
opened URL
downloaded 728 Kb

package ‘rJava’ successfully unpacked and MD5 sums checked

The downloaded packages are in
        C:\Users\erin\AppData\Local\Temp\RtmpgVpcnT\downloaded_packages
> library(OpenStreetMap)
Loading required package: rJava
Error : .onLoad failed in loadNamespace() for 'rJava', details:
  call: inDL(x, as.logical(local), as.logical(now), ...)
  error: unable to load shared object 'c:/R64/R-2.14.2/library/rJava/libs/x64/rJava.dll':
  LoadLibrary failure:  %1 is not a valid Win32 application.
Error: package ‘rJava’ could not be loaded
> library(rgdal)
Loading required package: sp
Geospatial Data Abstraction Library extensions to R successfully loaded
Loaded GDAL runtime: GDAL 1.9.0, released 2011/12/29
Path to GDAL shared files: c:/R64/R-2.14.2/library/rgdal/gdal
Loaded PROJ.4 runtime: Rel. 4.7.1, 23 September 2009, [PJ_VERSION: 470]
Path to PROJ.4 shared files: c:/R64/R-2.14.2/library/rgdal/proj

What am I doing wrong, please?

Thanks,
Erin

--
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com
Re: [R] odd error with rJava
Hi Erin,

You need to make sure that rJava both installs correctly and can load. The R package system is quite robust, so off the top of my head, I would guess that you need to set up Java properly on the machine. See the rJava package documentation for what it requires.

Cheers,
Josh

On Sat, Mar 10, 2012 at 3:19 PM, Erin Hodgess erinm.hodg...@gmail.com wrote:
> library(OpenStreetMap)
> Loading required package: rJava
> Error : .onLoad failed in loadNamespace() for 'rJava', details:
>   call: inDL(x, as.logical(local), as.logical(now), ...)
>   error: unable to load shared object 'c:/R64/R-2.14.2/library/rJava/libs/x64/rJava.dll':
>   LoadLibrary failure:  %1 is not a valid Win32 application.
> Error: package ‘rJava’ could not be loaded
>
> What am I doing wrong, please?

--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/
[R] How to improve the robustness of loess? - example included.
Hi,

I posted a message earlier entitled "How to fit a line through the Mountain crest ...". I figured loess is probably the best way, but it seems that the problem is the robustness of the fit. Below I paste an example to illustrate the problem:

tmp=rnorm(2000)
X.background = 5+tmp; Y.background = 5+ (10*tmp+rnorm(2000))
X.specific = 3.5+3*runif(1000); Y.specific = 5+120*runif(1000)
X = c(X.background, X.specific); Y = c(Y.background, Y.specific)
MINx=range(X)[1]; MAXx=range(X)[2]
my.loess = loess(Y ~ X, data.frame( X = X, Y = Y), family="symmetric", degree=2, span=0.1)
lo.pred = predict(my.loess, data.frame(X = seq(MINx, MAXx, length=100)), se=TRUE)
plot( seq(MINx, MAXx, length=100), lo.pred$fit, lwd=2, col=2, "l")
points(X, Y, col= grey(abs(my.loess$res)/max(abs(my.loess$res))) )

As you will see, the red line does not follow the background signal. However, when the specific signal is decreased to 500 points, the fit becomes perfect. I'm sure there is a way to tune the fitting so that it works, but I can't figure out how. Importantly, *I cannot increase the span*, because in reality the relationship I'm looking at is more complex, so I need a small span value to allow for a close fit.

I foresee that changing the weighting is the way to go, but I do not really understand how the weight option is used (I tried to change it and nothing happened), and the embedded tricubic weighting does not seem changeable. So any idea would be very welcome.

Emmanuel
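A different technique from loess that may be worth trying, sketched below: slice X into bins and take the mode of a kernel density estimate of Y within each bin, which follows the densest band and largely ignores the diffuse points. The helper name crest_by_bin, the number of bins and the minimum bin size are arbitrary choices for illustration, and the result is a set of crest points, not a smooth curve.

```r
set.seed(1)
# Same kind of data as in the example above
tmp <- rnorm(2000)
X <- c(5 + tmp, 3.5 + 3 * runif(1000))
Y <- c(5 + 10 * tmp + rnorm(2000), 5 + 120 * runif(1000))

# Mode of the kernel density of y within each bin of x (hypothetical helper)
crest_by_bin <- function(x, y, nbins = 20, min_n = 30) {
  breaks <- seq(min(x), max(x), length.out = nbins + 1)
  bin    <- cut(x, breaks, include.lowest = TRUE)
  mids   <- (breaks[-1] + breaks[-(nbins + 1)]) / 2
  mode_y <- vapply(split(y, bin), function(v) {
    if (length(v) < min_n) return(NA_real_)  # skip sparsely populated bins
    d <- density(v)
    d$x[which.max(d$y)]                      # location of the density peak
  }, numeric(1))
  keep <- !is.na(mode_y)
  data.frame(x = mids[keep], y = unname(mode_y[keep]))
}

crest <- crest_by_bin(X, Y)

plot(X, Y, col = "grey", pch = ".")
lines(crest$x, crest$y, col = 2, lwd = 2)
```

The crest points could then be smoothed, for example by running loess on crest$y against crest$x, where the small span is no longer competing with the diffuse signal.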
[R] Reading text files from other languages
I'm trying to read a data file that contains characters from the Spanish language:

> Station <- read.fwf("LosDatos.txt",widths=c(7,7,25,8,8,5),header=FALSE,
+ skip=3,n=separ[1]-4)

Then the R interpreter issues the following message:

Error in substring(x, first, last) :
  invalid multibyte string at '<d1>A, S.'
Calls: read.fwf -> cat -> sapply -> lapply -> FUN -> substring

I know that the message appears because there is a Ñ before the text "A, S.". Is there a way to tell R that the text file is UTF-8 encoded?

Thanks,

--Sergio.
Re: [R] Reading text files from other languages
Hi Julio,

If you look at the documentation for ?read.fwf you will see

'...' further arguments to be passed to 'read.table'

and if you look at ?read.table you will see there is an argument called 'encoding', so, yes. Just specify the encoding.

Cheers,
Josh

On Sat, Mar 10, 2012 at 3:41 PM, Julio Sergio julioser...@gmail.com wrote:
> I know that the message is because there is a Ñ before the text A, S.. Is there a way to tell R that the text file is UTF-8 encoded?

--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/
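A small sketch of declaring the file's encoding. In ?read.table, 'encoding' only marks strings as Latin-1 or UTF-8, while the related 'fileEncoding' argument re-encodes the file as it is read; read.fwf accepts fileEncoding in current versions of R. The byte d1 in the error message is the Latin-1 code for Ñ, so the file may in fact be Latin-1 rather than UTF-8. That is an inference from the message, not something confirmed in the thread. The file below is a made-up two-line example written as raw Latin-1 bytes.

```r
# Write a small fixed-width file containing a Latin-1 encoded "Ñ" (byte 0xd1)
tmp <- tempfile(fileext = ".txt")
bytes <- c(charToRaw("001ESPA"), as.raw(0xd1), charToRaw("A, S.\n"),
           charToRaw("002MADRID    \n"))
writeBin(bytes, tmp)

# Declare the encoding of the file so that it is converted while reading
dat <- read.fwf(tmp, widths = c(3, 10), fileEncoding = "latin1",
                stringsAsFactors = FALSE)
dat
```

For a file that really is UTF-8, fileEncoding = "UTF-8" would be the corresponding setting.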
Re: [R] resume on error
?try or ?tryCatch

Michael

On Sat, Mar 10, 2012 at 6:08 PM, Alaios ala...@yahoo.com wrote:
> Dear all,
> I would like to ask how I can catch an error in R and then ask it to resume. For example, I have a large for loop and I know that for a small number of iterations inside that loop there will be errors. How can I ask R in that case to just ignore the error and return to the loop?
>
> I would like to thank you in advance for your help.
> B.R
> Ale
>
> [[alternative HTML version deleted]]
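As an illustration of the tryCatch suggestion, here is a small sketch of a loop that keeps going when an iteration fails. The function risky_sqrt and its inputs are hypothetical stand-ins for whatever work the real loop does.

```r
# Hypothetical task that fails for some inputs (here: negative numbers).
risky_sqrt <- function(x) {
  if (x < 0) stop("negative input: ", x)
  sqrt(x)
}

inputs  <- c(4, -1, 9, -16, 25)
results <- rep(NA_real_, length(inputs))

for (i in seq_along(inputs)) {
  results[i] <- tryCatch(
    risky_sqrt(inputs[i]),
    error = function(e) {
      # Report the problem and fall through to the next iteration.
      message("iteration ", i, " skipped: ", conditionMessage(e))
      NA_real_  # value stored when the iteration fails
    }
  )
}

results
```

The failed iterations are left as NA, so they can be found afterwards with is.na(results).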
Re: [R] Generating abnormal returns in R
Well, it's not hard to write the code for it, but if you know the secret way to accurately model abnormal returns, you'll be a far richer man than I quite soon.

Less snidely, one needs to say quite a bit more about a distribution to specify it than "not Gaussian".

Michael

On Sat, Mar 10, 2012 at 12:46 PM, drsenne dr_se...@pandora.be wrote:
> Hello
>
> This is my first post on this forum and I hope someone can help me out. I have a datafile (weeklyR) with returns of +- 100 companies. I acquired this by running the following code:
>
>     library(tseries)
>     tickers = c("GSPC", "BP", "TOT", "ENI.MI", "VOW.BE", "CS.PA", "DAI.DE", "ALV.DE",
>                 "EOAN.DE", "CA.PA", "G.MI", "DE", "EXR.MI", "MUV2.BE", "UG.PA", "PRU.L",
>                 "VOD.L", "DPB.BE", "REP.MC", "RWE.BE", "AGN.AS", "FTE.PA", "EAD", "LGEN.L",
>                 "CNP.PA", "ULVR.L", "TKA.BE", "RIO.L", "NOK", "SGO.PA", "RNO.PA", "VIE.PA",
>                 "BAYN.DE", "SAN.PA", "DG.PA", "SSE.L", "GSK.L", "EN.PA", "LYB", "MLSNP.PA",
>                 "IBE.MC", "EURS.PA", "AH.AS", "VIV.PA", "TIT.MI", "VOLV-B.ST", "ABI.BR",
>                 "LHA.DE", "OML.L", "CNA.L", "CON.DE", "PHG", "AZN.L", "SBRY.L", "BA.L",
>                 "BT-A.L", "AF.PA", "430021.VI", "SL.L", "ERIC-A.ST", "CDI.PA", "AAL.L",
>                 "ALO.PA", "DELB.BR", "HOT.BE", "GAS.MC", "SU.PA", "OR.PA", "FNC.MI",
>                 "MRW.L", "MAP.MC", "ML.PA", "IMT.L", "EBK.DE", "PP.PA", "ACN", "BTI",
>                 "CRG.IR", "CPG.L", "BN.PA", "NG.L", "T7L.BE", "HEIA.AS", "ACS.MC", "LG.PA",
>                 "STAN.L", "ALU.PA", "FRE.MU", "SW.PA", "WOS.L", "AKZA.AS", "HEN.MU")
>     for (series in tickers) {
>       print(series)
>       close <- get.hist.quote(instrument=series, retclass="zoo", quote="AdjClose",
>                               compression="d", start="2000-1-1", end="2011-12-31",
>                               quiet=TRUE)
>       if (series == tickers[1]) {
>         pricedata = close
>       } else {
>         pricedata = merge(pricedata, close)
>       }
>     }
>     colnames(pricedata) = tickers
>     # Avoid a missing because of trade halt for that stock
>     pricedata = na.approx(pricedata)
>     weeklyR = diff(log(pricedata))
>     time(weeklyR) = as.Date(time(weeklyR))
>     print(weeklyR)
>     save(weeklyR, file = "weeklyR.Rdata")
>     write.zoo(weeklyR, file="weeklyR.csv", quote=T, sep=",", na = "NA", dec = ".",
>               row.names = F, col.names = T)
>
> Now I need to make a market model in R so I can generate abnormal returns from these stocks. As market index I would like to use the GSPC. I also need to consider abnormal returns calculated over a sixty-trading-day window.
> Can this be done in R? Is it difficult to write this code?
>
> Any help would be much appreciated!
>
> thanks
> drsenne
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Generating-abnormal-returns-in-R-tp4462541p4462541.html
> Sent from the R help mailing list archive at Nabble.com.
Re: [R] applying a function in list of indexed elements of a vector:
Your code for iy doesn't work as provided; I'll assume you meant this instead:

    iy <- list(c(1, 2), c(1, 2), c(1, 2, 3, 4), c(2, 3, 5), c(4), c(5, 6, 7), c(7, 8, 9))

Then

    sapply(iy, function(x) sum(Y1[x]))

Michael

On Sat, Mar 10, 2012 at 5:01 PM, aldi a...@dsgmail.wustl.edu wrote:
> Hi,
> I have a vector
>
>     Y1 <- c(8, 11, 7, 5, 6, 3, 6, 3, 3)
>
> and an index
>
>     iy <- c(c(1, 2),c(1 2), c(1, 2, 3, 4), c(2, 3, 5), c(4), c(5, 6, 7), c(7, 8, 9))
>
> How can I produce the mean, or the sum, of the elements of Y1 specified in the index iy? I am expecting something like this for the sum:
>
>     Y2
>     19 19 31 24 5 15 12
>
> I thought the lapply function might perform this, but it does not work:
>
>     Y2 <- lapply(Y1[iy], sum)
>
> Any suggestion?
> TIA,
> Aldi
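The original question asked for the mean as well as the sum. A small sketch covering both, using the data from the thread; vapply is used here instead of sapply only because it checks that every group returns a single number.

```r
Y1 <- c(8, 11, 7, 5, 6, 3, 6, 3, 3)
iy <- list(c(1, 2), c(1, 2), c(1, 2, 3, 4), c(2, 3, 5), c(4),
           c(5, 6, 7), c(7, 8, 9))

# One value per index group: subset Y1 by the group, then summarise.
sums  <- vapply(iy, function(idx) sum(Y1[idx]),  numeric(1))
means <- vapply(iy, function(idx) mean(Y1[idx]), numeric(1))

sums   # 19 19 31 24  5 15 12
means  # 9.50 9.50 7.75 8.00 5.00 5.00 4.00
```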
Re: [R] Generating abnormal returns in R
Hi Michael: "abnormal returns" is a term used in finance to describe the residual return after estimating a return model (either CAPM or APT or whatever), so one needs to build a return model (CAPM is the easiest) and then just calculate the residuals. These are termed the residual returns and can be negative or positive.

drsenne: you should send that to R-Sig-Finance or look around on the net. It's an interesting exercise, but you need to understand R pretty well, install quantmod, be able to get the prices for all the stocks, and run regression models. I don't know where an R example of it is, but Eric Zivot has a nice example in his S+Finmetrics book. If you can get your hands on that, it will show all the details. But if you send your question to R-Sig-Finance, I bet someone over there will know where a good R example lies.

Mark

P.S.: also check out the website of Systematic Investor. I don't know if he does exactly the above, but it does a lot of related things and provides R code.

On Sat, Mar 10, 2012 at 7:03 PM, R. Michael Weylandt michael.weyla...@gmail.com wrote:
> Well, it's not hard to write the code for it, but if you know the secret way to accurately model abnormal returns, you'll be a far richer man than I quite soon.
>
> Less snidely, one needs to say quite a bit more about a distribution to specify it than "not Gaussian".
>
> Michael
>
> On Sat, Mar 10, 2012 at 12:46 PM, drsenne dr_se...@pandora.be wrote:
>> Hello
>>
>> This is my first post on this forum and I hope someone can help me out. I have a datafile (weeklyR) with returns of +- 100 companies. I acquired this by running the following code:
>>
>> [...]
>>
>> Now I need to make a market model in R so I can generate abnormal returns from these stocks. As market index I would like to use the GSPC. I also need to consider abnormal returns calculated over a sixty-trading-day window.
>> Can this be done in R? Is it difficult to write this code?
>>
>> Any help would be much appreciated!
>>
>> thanks
>> drsenne
>>
>> --
>> View this message in context: http://r.789695.n4.nabble.com/Generating-abnormal-returns-in-R-tp4462541p4462541.html
>> Sent from the R help mailing list archive at Nabble.com.

[[alternative HTML version deleted]]
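The "build a return model, then take the residuals" idea can be sketched as follows. This is only an illustration under stated assumptions: the returns are simulated rather than downloaded, the single-index market model is fitted with lm, and the split into an estimation window and a sixty-day event window is one common convention, not something prescribed in the thread.

```r
set.seed(1)
n_days <- 250

# Simulated daily returns (hypothetical data): a market index and one stock
# with true alpha = 0.0001 and true beta = 1.2.
market <- rnorm(n_days, mean = 0.0003, sd = 0.01)
stock  <- 0.0001 + 1.2 * market + rnorm(n_days, sd = 0.005)

est_window   <- 1:190    # days used to estimate alpha and beta
event_window <- 191:250  # the sixty-trading-day window of interest

# Market model: stock return regressed on market return.
est <- data.frame(stock = stock[est_window], market = market[est_window])
fit <- lm(stock ~ market, data = est)

# Abnormal return = actual return minus the return the model predicts.
expected <- predict(fit, newdata = data.frame(market = market[event_window]))
abnormal <- stock[event_window] - expected
car      <- cumsum(abnormal)  # cumulative abnormal return over the window

coef(fit)
tail(car, 1)
```

With real data, the same steps would be applied column by column to the return matrix, with the index column playing the role of market.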
Re: [R] Reading text files from other languages
Joshua Wiley jwiley.psych at gmail.com writes:

Thanks Joshua!

Best regards,

--Sergio.
Re: [R] How to fit a line through the Mountain crest, i.e., through the highest density of points - in a loess-like fashion.
On Mar 10, 2012, at 3:55 PM, Emmanuel Levy wrote:

> Hi,
>
> I'm trying to normalize data by fitting a line through the highest density of points (in a 2D plot). In other words, if you visualize the data as a density plot, the fit I'm trying to achieve is the line that goes through the crest of the mountain.

Are you familiar with the kde2d or bkde2D functions in various packages? If you then collected the max density for each X and Y, you might want to see whether that 2-d function would follow a sufficiently regular path that would represent the projection of the ridge on the z=0 plane.

> This is similar yet different to what LOESS does.

Do you want a curve or a line?

> I've been using loess before, but it does not do exactly that, as it takes into account all points. Although points farther from the fit have a smaller weight, they result in the fit being a bit off the crest.
>
> Do you know a package or maybe even an option in loess that would allow me to achieve this?

I don't. I happen to have a dataset where I could test it. But you are likely to get better responses if you provide a test case.

> Any advice or idea appreciated.
>
> Emmanuel
>
> [[alternative HTML version deleted]]

Plain text is preferred.

--
David Winsemius, MD
West Hartford, CT
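The suggestion of collecting the maximum density for each X can be sketched like this. The data are simulated for illustration, kde2d comes from the MASS package with its default bandwidths, and taking the highest-density grid cell for each x grid value is just one simple reading of "the crest", not a method confirmed in the thread.

```r
library(MASS)  # provides kde2d()

set.seed(42)
# Hypothetical data: a dense band of points around the line y = 2 * x.
x <- rnorm(2000, mean = 5)
y <- 2 * x + rnorm(2000, sd = 0.5)

# 2-d kernel density on a 50 x 50 grid; dens$z[i, j] is the density
# at (dens$x[i], dens$y[j]).
dens <- kde2d(x, y, n = 50)

# For each x grid value, take the y grid value with the highest density.
ridge <- data.frame(x = dens$x,
                    y = dens$y[apply(dens$z, 1, which.max)])

plot(x, y, pch = ".", col = "grey")
lines(ridge, col = 2, lwd = 2)
```

The ridge is least reliable near the edges of the data, where the density is almost flat, and it could be smoothed afterwards if a regular curve is needed.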
Re: [R] How to fit a line through the Mountain crest, i.e., through the highest density of points - in a loess-like fashion.
Hi,

Thanks a lot for your reply. I posted a second message where I provide a dummy example, entitled "How to improve the robustness of loess? - example included".

I need to fit a curve, which makes it a bit difficult to work with kde2d only. I'm actually trying to use kde2d in combination with loess: basically, I give the output density of kde2d as weights in the loess function. It seems to give nice results :)

In my second post I wrote that the weight option did not work, but that's because I was writing "weigth"; not sure why I did not get an error message.

I'll post the lines of code as a reply to the second post.

All the best,

Emmanuel

On 10 March 2012 19:46, David Winsemius dwinsem...@comcast.net wrote:
> On Mar 10, 2012, at 3:55 PM, Emmanuel Levy wrote:
>
>> Hi,
>>
>> I'm trying to normalize data by fitting a line through the highest density of points (in a 2D plot). In other words, if you visualize the data as a density plot, the fit I'm trying to achieve is the line that goes through the crest of the mountain.
>
> Are you familiar with the kde2d or bkde2D functions in various packages? If you then collected the max density for each X and Y, you might want to see whether that 2-d function would follow a sufficiently regular path that would represent the projection of the ridge on the z=0 plane.
>
>> This is similar yet different to what LOESS does.
>
> Do you want a curve or a line?
>
>> I've been using loess before, but it does not do exactly that, as it takes into account all points. Although points farther from the fit have a smaller weight, they result in the fit being a bit off the crest.
>>
>> Do you know a package or maybe even an option in loess that would allow me to achieve this?
>
> I don't. I happen to have a dataset where I could test it. But you are likely to get better responses if you provide a test case.
>
>> Any advice or idea appreciated.
>>
>> Emmanuel
>>
>> [[alternative HTML version deleted]]
>
> Plain text is preferred.
>
> --
> David Winsemius, MD
> West Hartford, CT
Re: [R] How to improve the robustness of loess? - example included.
Ok so this seems to work :)

    library(MASS)  # provides kde2d()

    tmp = rnorm(2000)
    X.background = 5 + tmp
    Y.background = 5 + (10*tmp + rnorm(2000))
    X.specific = 3.5 + 3*runif(3000)
    Y.specific = 5 + 120*runif(3000)
    X = c(X.background, X.specific)
    Y = c(Y.background, Y.specific)
    MINx = range(X)[1]
    MAXx = range(X)[2]
    MINy = range(Y)[1]
    MAXy = range(Y)[2]

    ## estimates the density for each datapoint
    nBins = 50
    my.lims = c(range(X, na.rm=TRUE), range(Y, na.rm=TRUE))
    z1 = kde2d(X, Y, n=nBins, lims=my.lims,
               h=c((my.lims[2]-my.lims[1])/(nBins/4),
                   (my.lims[4]-my.lims[3])/(nBins/4)))
    X.cut = cut(X, seq(z1$x[1], z1$x[nBins], len=(nBins+1)))
    Y.cut = cut(Y, seq(z1$y[1], z1$y[nBins], len=(nBins+1)))
    xy.cuts = data.frame(X.cut, Y.cut, ord=1:(length(X.cut)))
    density = data.frame(X=rep(factor(levels(X.cut)), rep(nBins)),
                         Y=rep(factor(levels(Y.cut)), rep(nBins, nBins)),
                         Z=as.vector(z1$z))
    xy.density = merge(xy.cuts, density, by=c(1,2), sort=FALSE, all.x=TRUE)
    xy.density = xy.density[order(x=xy.density$ord),]

    ### Now uses the density as a weight
    my.loess = loess(Y ~ X, data.frame(X = X, Y = Y), family="symmetric",
                     degree=2, span=0.1, weights=xy.density$Z^3)
    lo.pred = predict(my.loess, data.frame(X = seq(MINx, MAXx, length=100)), se=TRUE)
    plot(seq(MINx, MAXx, length=100), lo.pred$fit, lwd=2, col=2, type="l")
    #, ylim=c(0, max(tmp$fit, na.rm=TRUE) ) , col="dark grey")
    points(X, Y, pch=".", col=grey(abs(my.loess$res)/max(abs(my.loess$res))))

On 10 March 2012 18:30, Emmanuel Levy emmanuel.l...@gmail.com wrote:
> Hi,
>
> I posted a message earlier entitled "How to fit a line through the Mountain crest ..."
>
> I figured loess is probably the best way, but it seems that the problem is the robustness of the fit. Below I paste an example to illustrate the problem:
>
>     tmp = rnorm(2000)
>     X.background = 5 + tmp; Y.background = 5 + (10*tmp + rnorm(2000))
>     X.specific = 3.5 + 3*runif(1000); Y.specific = 5 + 120*runif(1000)
>     X = c(X.background, X.specific); Y = c(Y.background, Y.specific)
>     MINx = range(X)[1]; MAXx = range(X)[2]
>     my.loess = loess(Y ~ X, data.frame(X = X, Y = Y), family="symmetric",
>                      degree=2, span=0.1)
>     lo.pred = predict(my.loess, data.frame(X = seq(MINx, MAXx, length=100)), se=TRUE)
>     plot(seq(MINx, MAXx, length=100), lo.pred$fit, lwd=2, col=2, type="l")
>     points(X, Y, col=grey(abs(my.loess$res)/max(abs(my.loess$res))))
>
> As you will see, the red line does not follow the background signal. However, when decreasing the specific signal to 500 points it becomes perfect.
>
> I'm sure there is a way to tune the fitting so that it works, but I can't figure out how. Importantly, *I cannot increase the span*, because in reality the relationship I'm looking at is more complex, so I need a small span value to allow for a close fit.
>
> I foresee that changing the weighting is the way to go, but I do not really understand how the weight option is used (I tried to change it and nothing happened), and also the embedded tricubic weighting does not seem changeable.
>
> So any idea would be very welcome.
>
> Emmanuel
Re: [R] Help on subgraphs in xyplot of lattice library
That's not useful sample data -- like I said, dput() some sample data and send it. I can try to figure out how to plot what you're asking, but there is literally no data in what you sent. Do not copy and paste the output of a print command -- use dput(). (You'll understand why when you see it.)

And like I also said, if you could give a sketch as to how you would do this in base graphics, it will be much easier for us to help you translate into lattice graphics. I only ask because you said you could do so.

Michael

Also please send plain text if you know how.

On Sat, Mar 10, 2012 at 2:42 PM, Chee Chen chee.c...@yahoo.com wrote:
> Hi, Michael,
> Thank you for your help!
> In its simplified form, the data frame looks like:
>
>     idx  true_value  mean  diff_mean1  diff_mean2  diff_mean3  std  diff_std1  diff_std2  diff_std3  samplesize
>       1                                                                                                    1000
>       2                                                                                                    1000
>       3                                                                                                    1000
>       4                                                                                                    1000
>       5                                                                                                    1000
>       1                                                                                                    5000
>       2                                                                                                    5000
>       3                                                                                                    5000
>       4                                                                                                    5000
>       5                                                                                                    5000
>
> I would like the plot to be:
> row1 has 4 subplots for samplesize 1000;
> row2 has 4 subplots for samplesize 5000;
> in each row:
>   the 1st subplot is true_value against mean;
>   the 2nd is an overlay plot for idx against diff_mean1, idx against diff_mean2, idx against diff_mean3;
>   the 3rd is true_value against std;
>   the 4th is an overlay plot for idx against diff_std1, idx against diff_std2, idx against diff_std3.
>
> I have looked at sample xyplot code, but still do not know how to realize this.
> Thanks again!
> Chee
>
> From: R. Michael Weylandt
> Sent: Saturday, March 10, 2012 12:20 PM
> To: Chee Chen
> Cc: R-ORG
> Subject: Re: [R] Help on subgraphs in xyplot of lattice library
>
> What does your data look like? dput() is your friend.
>
> Also, it'd be helpful if you could give base graphics code for more-or-less what you are looking for (since you can do so already), as it's pretty hard to describe graphics without pictures. Running example(xyplot) might help you get started as well.
>
> Michael
>
> On Sat, Mar 10, 2012 at 12:04 PM, Chee Chen chee.c...@yahoo.com wrote:
>> Dear All,
>> I would like to ask a question on how to do overlay plots in each subgraph of xyplot.
>> 1. I did simulations for m=1000, 2500, 5000, 1, as the sample sizes.
>> 2. For each sample size value m, 4 graphs are generated; each graph contains overlaid comparisons between 4 methods.
>> 3. Now I want to put them into a 4-by-4 plot by xyplot, i.e., 4 sample size values, each of which has 4 plots.
>>
>> I know how to do this using plot, but the spaces between subplots are big. I do not know how to make each subplot in xyplot an overlaid one as it would appear using plot.
>>
>> Any help would be appreciated!
>> Thank you,
>> Chee
>>
>> [[alternative HTML version deleted]]
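To show why dput() output is more useful than printed output, here is a small sketch with a made-up data frame (the column names echo the thread, the values are hypothetical). The text produced by dput() is R code that rebuilds the object exactly, so a reader can paste it straight into a session.

```r
# Hypothetical miniature of the poster's data.
df <- data.frame(idx = 1:3, samplesize = c(1000, 1000, 5000))

# dput() prints R code that recreates the object, including column types.
dput(df)

# Round trip: capture that text and evaluate it, as a reader of the
# mailing list would do by pasting it into their own session.
txt <- paste(capture.output(dput(df)), collapse = "\n")
df2 <- eval(parse(text = txt))

identical(df, df2)
```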
Re: [R] rpanel / list error
Your immediate problem seems to be that you use sum as a variable name when it is also a function name.

You also have scoping issues that result from how you're using with() -- if you don't return an object, it gets thrown away after the with() function is done (part of the functional paradigm). I've started to clean this up a little, but it now bumps up against the fact that you don't return things in the rpanel bits. I don't really use that package much, but hopefully this gets you going in the right way:

    main <- function(panel) {
      SUM <- with(panel, {
        LAST = 1100
        START = 0
        INDX = 0
        # Starting Conditions
        revenue = 0
        minStock = panel$minStock
        maxStock = 100
        inventory = 100
        order_costs = 0
        storage_costs = 0
        orderlevel = k
        # initial list containing values
        SUM = list(ninventory = inventory, order_costs = 0, storage_costs = 0,
                   revenue = 0, index = INDX)
        while (SUM$index < LAST && inventory > 0) {
          SUM$order_costs = SUM$order_costs + order_costs
          SUM$storage_costs = SUM$storage_costs + storage_costs
          SUM$ninventory = SUM$ninventory + inventory
          SUM$index = SUM$index + 1
        }
        SUM
      })
      print(SUM)
      sis = list(Time = SUM$index, StorageCosts = SUM$storage_costs,
                 OrderCosts = SUM$order_costs, fInventory = SUM$ninventory)
      print(sis)
      return(sis)
    }

    panel <- rp.control(title="Stochastic Case")
    rp.button(panel, action=main, title="Calculate")
    rp.slider(panel, k, from=10, to=90, resolution=10, showvalue=TRUE,
              title="Select Order Size", initval=70)
    rp.slider(panel, minStock, from=10, to=90, resolution=10, initval=50,
              title="Minimum Stock Level", showvalue=TRUE)

Note also that index is a function, so you need to be smart in how you use that name.

Michael

On Fri, Mar 9, 2012 at 6:59 AM, jism7690 james.jism.ca...@gmail.com wrote:
> Hi Michael,
>
> Thank you for your reply. I have uploaded the minimum; I have left out the formulas for calculating the amounts as they are not important to the loop. Basically, I have a while loop running that adds to the list of values, and then outside this loop I have a list called sis; this is the list that is causing the error. I would like this list to return the values with panel. Before I used rpanel it was returning values perfectly.
>
> Thanks
>
>     main <- function(panel) {
>       with(panel, {
>         LAST = 1100
>         START = 0
>         index = 0
>         # Starting Conditions
>         revenue = 0
>         minStock = panel$minStock
>         maxStock = 100
>         inventory = 100
>         order_costs = 0
>         storage_costs = 0
>         orderlevel = panel$k
>         # initial list containing values
>         sum = list(ninventory=inventory, order_costs=0, storage_costs=0, revenue = 0)
>         while (index < LAST && inventory > 0) {
>           sum$order_costs = sum$order_costs + order_costs
>           sum$storage_costs = sum$storage_costs + storage_costs
>           sum$ninventory = sum$ninvenotry + inventory
>           index = index + 1
>         }
>       })
>       sis = list(Time = index, StorageCosts=sum$storage_costs,
>                  OrderCosts=sum$order_cost, fInventory = sum$ninventory)
>       return(sis)
>     }
>
>     panel <- rp.control(title="Stochastic Case", size=panel.size)
>     rp.button(panel, action=main, title="Calculate", pos=pos.go.button)
>     rp.slider(panel, k, from=10, to=90, resolution=10, showvalue=TRUE,
>               title="Select Order Size", pos=pos.order.slider, initval=70)
>     rp.slider(panel, minStock, from=10, to=90, resolution=10,
>               pos=pos.minstock.slider, initval=50,
>               title="Minimum Stock Level", showvalue=TRUE)
>
> --
> View this message in context: http://r.789695.n4.nabble.com/rpanel-list-error-tp4457308p4459254.html
> Sent from the R help mailing list archive at Nabble.com.
Re: [R] (Fisher) Randomization Test for Matched Pairs: Permutation Data Setup Based on Signs
In general, I *think* this is a hard problem (it sounds knapsack-ish), but since you are on small enough data sets, that's probably not so important. If I understand you right, this little function will help you:

    plusminus <- function(n) {
      t(as.matrix(do.call(expand.grid, rep(list(c(-1, 1)), n))))
    }

    plusminus(3)
    plusminus(5)

If you multiply the output of this function by your data set, you will have columns corresponding to all possible sign choices, e.g.,

    plusminus(3) * c(1, 2, 3)

Then you can colSums() using only the positive elements:

    x <- plusminus(3) * c(1, 2, 3)
    x[x < 0] <- 0
    colSums(x)

To wrap this all in one function, I'd do something like this:

    test.statistic <- function(v) {
      m <- t(as.matrix(do.call(expand.grid, rep(list(c(-1, 1)), length(v)))))
      x <- m * v
      x[x < 0] <- 0
      out <- rbind(m * v, colSums(x))
      rownames(out)[length(rownames(out))] <- "Sum of Positive Elements"
      out
    }

    X <- test.statistic(c(-16, -4, -7, -3, -5, +1, -10))
    X[, 1:10]

Hopefully that helps (I'm a little fuzzy on your overall goal, so that second bit might be a red herring).

Michael

On Fri, Mar 9, 2012 at 12:49 AM, Ghandalf mool...@hotmail.com wrote:
> Hi,
>
> I am currently attempting to write a small program for a randomization test (based on rank/combination) for matched pairs. If you will please allow me to introduce you to some background information regarding the test prior to my question at hand, or you may skip down to the bold portion for my issue.
>
> There are two sample sizes; the data, as I am sure you guessed, is matched into pairs and each pair's difference is denoted by Di.
>
> The test statistic is *T* = Sum(Di) (only for those Di > 0).
>
> The issue I am having is based on the method required to use in R to set up the data into the proper structure. I am to consider the absolute value of Di, without regard to their sign. There are 2^n ways of assigning + or - signs to the set of absolute differences obtained, where n = the number of Dis. That is, we can assign + signs to all n of the |Di|, or we might assign + to |D1| but - signs to |D2| to |Dn|, and so forth.
>
> So, for example, if I have
> *D1=-16, D2=-4, D3=-7, D4=-3, D5=-5, D6=+1, and D7=-10 and n=7.*
>
> I need to consider the 2^7 ways of assigning signs that result in the lowest sum of the positive absolute difference. To exemplify further, we have
> *
> -16, -4, -7, -3, -5, -1, -10   T = 0
> -16, -4, -7, -3, -5, +1, -10   T = 1
> -16, -4, -7, +3, -5, -1, -10   T = 3
> -16, -4, -7, +3, -5, +1, -10   T = 4
> *
> ... and so on.
>
> So, if you are willing to help me, I am having trouble setting up my data as illustrated above. /How do I create (a code for) the 2^n lines of data required with all the possible combinations of + and - in order to calculate the positive values in each line (the test statistic, T)?/
>
> I have tried to use combn(d=data set, n=7) with a data set, d, consisting of both the positive and negative sign of the respective value, to no avail.
>
> I apologize if this is lengthy; I was not sure how to ask the aforementioned question without incorrectly portraying my thoughts. If any clarification is required then I will be more than willing to oblige with any further explanation. I have searched for possible solutions, but alas, came out empty handed.
>
> Thank you.
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Fisher-Randomization-Test-for-Matched-Pairs-Permutation-Data-Setup-Based-on-Signs-tp4458606p4458606.html
> Sent from the R help mailing list archive at Nabble.com.
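Building on the sign-matrix idea, the full randomization distribution of T and a p-value can be sketched as below. The differences are the ones given in the question; using the lower tail (small T is extreme) is an assumption made for illustration, since the thread does not state the alternative hypothesis.

```r
d <- c(-16, -4, -7, -3, -5, 1, -10)  # observed paired differences
n <- length(d)

# All 2^n sign assignments, one per row (-1 or +1 for each pair).
signs <- as.matrix(expand.grid(rep(list(c(-1, 1)), n)))

# T for each assignment: sum of |D_i| over the pairs given a + sign.
plus  <- (signs + 1) / 2               # 1 where the sign is +, else 0
t_all <- as.vector(plus %*% abs(d))

t_obs   <- sum(d[d > 0])               # observed statistic: 1
p_lower <- mean(t_all <= t_obs)        # share of assignments with T <= 1

length(t_all)  # 128
p_lower        # 2/128 = 0.015625
```

Only two of the 128 assignments give T <= 1 (all signs negative, or only the pair with |D| = 1 positive), which is where the 2/128 comes from.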
[R] Which non-parametric regression would allow fitting this type of data? (example given).
Hi,

I'm wondering which function would allow fitting this type of data:

    tmp = rnorm(2000)
    X.1 = 5 + tmp
    Y.1 = 5 + (5*tmp + rnorm(2000))
    tmp = rnorm(100)
    X.2 = 9 + tmp
    Y.2 = 40 + (1.5*tmp + rnorm(100))
    X.3 = 7 + 0.5*runif(500)
    Y.3 = 15 + 20*runif(500)
    X = c(X.1, X.2, X.3)
    Y = c(Y.1, Y.2, Y.3)
    plot(X, Y)

The problem with loess is that distances for the goodness of fit are calculated on the Y-axis. However, distances would need to be calculated on the normals of the fitted curve. Is there a function that provides this option?

A simple trick in that case consists in swapping X and Y, but I'm wondering if there is a more general solution?

Thanks for your input,

Emmanuel
Re: [R] function input as variable name (deparse/quote/paste) ??
Sorry if I wasn't stating what I really wanted or it was a bit confusing.

Basically, there are MANY datasets to run using the same function. I have written a function to analyze each dataset; it returns a LIST of useful output in the variable 'res' (to the workspace).

I also created another script, run.r, such as

    myname(dat1)
    myname(dat2)
    myname(dat3)
    myname(dat4)
    myname(dat5)

For now, each time the output in the main workspace, 'res' (the list), is overwritten. I want it to have a different suffix each time to differentiate the results, so I can have a look later after the batch is run.

Thanks.

casper

--
View this message in context: http://r.789695.n4.nabble.com/function-input-as-variable-name-deparse-quote-paste-tp4462841p4463044.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Which non-parametric regression would allow fitting this type of data? (example given).
Thanks for the example.

Have you tried fitting a principal curve via either the princurve or pcurve packages? I think this might work for what you want, but no guarantees.

Note that loess, splines, etc. are all fitting y|x, that is, a nonparametric regression of y on x. That is not what you say you want, so these approaches are unlikely to work.

-- Bert

On Sat, Mar 10, 2012 at 6:20 PM, Emmanuel Levy emmanuel.l...@gmail.com wrote:
> Hi,
>
> I'm wondering which function would allow fitting this type of data:
>
>     tmp = rnorm(2000)
>     X.1 = 5 + tmp
>     Y.1 = 5 + (5*tmp + rnorm(2000))
>     tmp = rnorm(100)
>     X.2 = 9 + tmp
>     Y.2 = 40 + (1.5*tmp + rnorm(100))
>     X.3 = 7 + 0.5*runif(500)
>     Y.3 = 15 + 20*runif(500)
>     X = c(X.1, X.2, X.3)
>     Y = c(Y.1, Y.2, Y.3)
>     plot(X, Y)
>
> The problem with loess is that distances for the goodness of fit are calculated on the Y-axis. However, distances would need to be calculated on the normals of the fitted curve. Is there a function that provides this option?
>
> A simple trick in that case consists in swapping X and Y, but I'm wondering if there is a more general solution?
>
> Thanks for your input,
>
> Emmanuel

--
Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
Re: [R] function input as variable name (deparse/quote/paste) ??
On 11-03-2012, at 01:01, casperyc wrote:

> Sorry if I wasn't stating what I really wanted or it was a bit confusing.
>
> Basically, there are MANY datasets to run using the same function. I have written a function to analyze each dataset; it returns a LIST of useful output in the variable 'res' (to the workspace).

Your function uses return? Probably not.

> I also created another script, run.r, such as
>
>     myname(dat1)
>     myname(dat2)
>     myname(dat3)
>     myname(dat4)
>     myname(dat5)
>
> For now, each time the output in the main workspace, 'res' (the list), is overwritten. I want it to have a different suffix each time to differentiate the results, so I can have a look later after the batch is run.

Well, if that is the case, then there is a better way than doing global assignments in a function. Make sure myfunction returns the list of results with return(), and don't do global assignment with <<-

    for (k in 1:5) {
      dataname <- paste("data", k, sep="")
      resname <- paste("res", k, sep="")
      assign(resname, myfunction(get(dataname)))
    }

Berend
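As an alternative to assign() and get(), the results can be kept in one named list. This is a sketch of a different technique from the loop above, not the approach given in the thread; the datasets and the body of myname below are hypothetical stand-ins.

```r
# Hypothetical stand-ins for the poster's datasets and analysis function.
dat1 <- c(1, 2, 3)
dat2 <- c(10, 20)
dat3 <- c(5, 5, 5, 5)

myname <- function(x) {
  # Return the results instead of assigning them globally.
  list(n = length(x), total = sum(x))
}

# One named list of inputs gives one named list of results.
datasets <- list(dat1 = dat1, dat2 = dat2, dat3 = dat3)
res <- lapply(datasets, myname)

names(res)      # "dat1" "dat2" "dat3"
res$dat2$total  # 30
```

Each result stays addressable by the name of its dataset, and the whole batch can be saved or inspected as a single object.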
Re: [R] too many open devices
On Sat, 10-Mar-2012 at 02:21PM -0600, harold kincaid wrote:

| I am getting too many open devices after 60 graphs. The archived
| comments on this problem were too sketchy to be helpful. Any ideas?

With minimal information, my guess might not be correct, but I suspect you're plotting to a Windows device and a new one is opened for each of your plots. That would be some clutter on your screen.

You'd make life simpler if you used a pdf device, which uses a new page for each of your plots and can run to hundreds of pages if you like. Check out the help for pdf(), making sure you don't forget the dev.off() part.

HTH

--
Patrick Connolly

Great minds discuss ideas
Average minds discuss events
Small minds discuss people
  . Eleanor Roosevelt
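A minimal sketch of the pdf() suggestion: one device is opened, every plot goes to a new page of the same file, and the device is closed once at the end. The file location and the plotted data are made up for illustration.

```r
out <- tempfile(fileext = ".pdf")  # hypothetical output location

pdf(out, onefile = TRUE)           # open a single pdf device
for (i in 1:60) {
  # Each high-level plot call starts a new page in the same file.
  plot(rnorm(50), main = paste("Graph", i))
}
invisible(dev.off())               # close the device so the file is written

file.exists(out)
```

Because only one device is ever open, the limit on simultaneously open devices is never reached, however many pages are produced.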