[R] [R-pkgs] mvbutils and debug: new versions
New versions of the 'mvbutils' and 'debug' packages are now available on CRAN. These should work with R 2.10 as well as R 2.9. 'mvbutils' offers tools for organization of workspaces, function/documentation editing with backups, package construction and updating, seamless per-object lazy-loading, and various miscellaneous goodies. New in this version: nearly-automated package building using only plain-text documentation, plus the ability to edit and update your package while it's loaded, i.e. without needing to reinstall it or to quit R. These new features are beta level; they've been tested by me and some colleagues, but are still a bit experimental. Comments welcome. 'debug' is a debugger with: separate code windows; line-numbered conditional breakpoints; continuation on errors; direct interpretation of commands from the console; skipping backwards and forwards; handling of 'on.exit'. This version fixes a few minor bugs. -- Mark Bravington, CSIRO Mathematical Information Sciences, Marine Laboratory, Castray Esplanade, Hobart 7001 TAS; ph (+61) 3 6232 5118; fax (+61) 3 6232 5012; mob (+61) 438 315 623
R-packages mailing list r-packa...@r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages
R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Locate 'repeated' package?
I cannot find the 'repeated' package on CRAN. Can anyone tell me how to get this package? Thanks, Daniel
Re: [R] How to use SQL code in R
On Monday 16 November 2009 at 13:14 +0800, sdlywjl666 wrote: Dear All, How to use SQL code in R? Depends on what you mean by using SQL code in R... If you mean query, update or otherwise mess with an existing database, look at the RODBC package and/or the various DBI-related packages (hint: look at BDR's nice manual on importing/exporting data in R, which comes with the standard R documentation). If you mean use standard SQL code to manipulate R data frames, look at the sqldf package (quite nice, but be aware that it is work in progress). If you mean use Oracle's command language or pgsql or ... in order to port to R code written for a specific database, I'm afraid you're out of luck... In that last case, however, note that you may be able to use R *inside* your external database instead, depending on its ability to use external code (I'm thinking of PL/R, which allows running R code inside a PostgreSQL function). HTH, Emmanuel Charpentier BTW, there exists an R database Special Interest Group, with a mailing list. Look up their archive, and maybe ask a (slightly less general, if possible) version of your question there...
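The sqldf option mentioned above can be sketched as follows. The data frame `DF` and the query are made-up illustrations; the sqldf call assumes the sqldf package is installed from CRAN, so it is shown alongside the equivalent base-R computation:

```r
# A tiny, hypothetical data frame to query
DF <- data.frame(grp = c("a", "a", "b"), x = c(1, 2, 10))

# With sqldf (requires install.packages("sqldf"); not run here):
# library(sqldf)
# sqldf("SELECT grp, SUM(x) AS total FROM DF GROUP BY grp")

# The same query expressed in base R, for comparison:
aggregate(x ~ grp, data = DF, FUN = sum)
#   grp  x
# 1   a  3
# 2   b 10
```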
Re: [R] JGR GUI for R-2.10.0 Help Print
Hello On 11/15/09, Bob Meglen bmeg...@comcast.net wrote: I have updated R 2.9.1 to 2.10.0 and JGR GUI 1.7. I am running Windows XP. I can't seem to get the JGR Print or Help functions to work. The system locks and requires me to stop the process. In the past I have preferred the operation and feel of JGR GUI. I realize that this help forum is for R, but I am hoping that some other R user is a JGR GUI user and might have a hint about this. This should go to stats-rosuda-devel. Cc'ing there. Liviu At one point I received the following:
Loading required package: rJava
Loading required package: JavaGD
Loading required package: iplots
Attaching package: 'utils'
The following object(s) are masked from package:rJava : head, str, tail
starting httpd help server ... Error in tools:::startDynamicHelp() : could not find function "runif"
Loading required package: stats
Loading required package: graphics
Loading Tcl/Tk interface ... done
During startup - Warning message: package JGR in options("defaultPackages") was not found
Loading required package: JGR
starting httpd help server ... done
q()
[R] Plotting Mona clustering result
Hi all, Is there a way of plotting a 'decision tree' from the results of mona() in the cluster package? The default bannerplot is not quite what I'm after - I would like a plot of the binary decision tree. Thanks Zoë
Re: [R] Locate 'repeated' package?
Daniel R Jeske wrote: I cannot find the 'repeated' package on CRAN. Can anyone tell me how to get this package? That is a package by Jim Lindsey, who does not publish his non-standard R packages on CRAN but on his own webpage. The current location according to my Google search seems to be: http://www.commanster.eu/rcode.html Best wishes, Uwe Ligges
[R] ^ operator
Hi, I want to apply the ^ operator to a vector, but it is applied to some of the elements correctly and for some others it generates NaN. Why is it not able to calculate -6.108576e-05^(1/3) even though it exists?
tmp
[1] -6.108576e-05  4.208762e-05  3.547092e-05  7.171101e-04 -1.600269e-03
tmp^(1/3)
[1]        NaN 0.03478442 0.03285672 0.08950802        NaN
-6.108576e-05^(1/3)
[1] -0.03938341
[R] Odp: ^ operator
Hi AFAIK, this is an issue of operator precedence. r-help-boun...@r-project.org wrote on 16.11.2009 11:24:59: Hi, I want to apply ^ operator to a vector but it is applied to some of the elements correctly and to some others, it generates NaN. Why is it not able to calculate -6.108576e-05^(1/3) even though it exists?
tmp
[1] -6.108576e-05  4.208762e-05  3.547092e-05  7.171101e-04 -1.600269e-03
tmp^(1/3)
[1]        NaN 0.03478442 0.03285672 0.08950802        NaN
This computes (-a)^(1/3), which is not possible in real numbers. You have to use as.complex(tmp)^(1/3) to get a result.
-6.108576e-05^(1/3)
[1] -0.03938341
This is actually -(6.108576e-05^(1/3)). Regards Petr
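The precedence point can be checked directly in a console; a minimal sketch (using -8 rather than the original small value, so the results are easy to read):

```r
# In R, ^ binds tighter than unary minus, so the literal -8^(1/3)
# is parsed as -(8^(1/3)), not (-8)^(1/3).
-8^(1/3)      # -2: cube root of 8, then negated
(-8)^(1/3)    # NaN: ^ has no real result for a negative base here

# When the minus sign is part of a stored value, precedence no longer
# helps: the base really is negative.
x <- -8
x^(1/3)       # NaN as well
```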
Re: [R] pairs
I'm not convinced it's right. In fact, I'm pretty sure the last step, taking only the first half of the list, is wrong. I also do not know if you have considered how you want to count situations like:
3 2 7 4 5 7 ...
7 3 8 6 1 2 9 2 ...
How many pairs of 2-7/7-2 would that represent? -- David On Nov 15, 2009, at 11:06 PM, cindy Guo wrote: Hi, David, The matrix has 20 columns. Thank you very much for your help. I think it's right, but it seems I need some time to figure it out. I am a green hand. There are so many functions here I never used before. :) Cindy On Sun, Nov 15, 2009 at 5:19 PM, David Winsemius dwinsem...@comcast.net wrote: Assuming that the number of columns is 4, then consider this approach:
prs <- scan()
1: 2 5 1 6
5: 1 7 8 2
9: 3 7 6 2
13: 9 8 5 7
17:
Read 16 items
prmtx <- matrix(prs, 4, 4, byrow = TRUE)
# Now make copies of x.y and y.x
pair.str <- sapply(1:nrow(prmtx), function(z) c(apply(combn(prmtx[z,], 2), 2, function(x) paste(x[1], x[2], sep=".")), apply(combn(prmtx[z,], 2), 2, function(x) paste(x[2], x[1], sep="."))))
tpair <- table(pair.str)
# This then gives you a duplicated list
tpair[tpair > 1]
pair.str
1.2 2.1 2.6 2.7 6.2 7.2 7.8 8.7
  2   2   2   2   2   2   2   2
# So only take the first half of the pairs:
head(tpair[tpair > 1], sum(tpair > 1)/2)
pair.str
1.2 2.1 2.6 2.7
  2   2   2   2
-- David. On Nov 15, 2009, at 8:06 PM, David Winsemius wrote: I could of course be wrong, but have you yet specified the number of columns for this pairing exercise? On Nov 15, 2009, at 5:26 PM, cindy Guo wrote: Hi, All, I have an n by m matrix with each entry between 1 and 15000. I want to know the frequency of each pair in 1:15000 that occur together in rows. So for example, if the matrix is
2 5 1 6
1 7 8 2
3 7 6 2
9 8 5 7
Pair (2,6) (un-ordered) occurs together in rows 1 and 3. I want to return the value 2 for this pair as well as that for all pairs. Is there a fast way to do this avoiding loops? Loops take too long.
David Winsemius, MD Heritage Laboratories West Hartford, CT
Re: [R] Odp: ^ operator
But with complex, I get complex numbers for the first and last elements:
(as.complex(tmp))^(1/3)
[1] 0.01969170+0.03410703i 0.03478442+0.00000000i 0.03285672+0.00000000i
[4] 0.08950802+0.00000000i 0.05848363+0.10129661i
whereas for the first element, typed directly, we get a real result. Moreover,
-6.108576e-05^(1/3)
[1] -0.03938341
and
-(6.108576e-05^(1/3))
[1] -0.03938341
and
-((6.108576e-05)^(1/3))
[1] -0.03938341
give the same results, so adding () doesn't change anything. --- On Mon, 11/16/09, Petr PIKAL petr.pi...@precheza.cz wrote: From: Petr PIKAL petr.pi...@precheza.cz Subject: Odp: [R] ^ operator To: carol white wht_...@yahoo.com Cc: r-h...@stat.math.ethz.ch Date: Monday, November 16, 2009, 3:40 AM Hi AFAIK, this is an issue of operator precedence. ... This computes (-a)^(1/3), which is not possible in real numbers. You have to use as.complex(tmp)^(1/3) to get a result. ... this is actually -(6.108576e-05^(1/3)) Regards Petr
[R] Labels in horizontal dendrogram not placed correctly?
Hi all, I tried plotting a horizontal dendrogram, but it seems as if the labels are not taken into account in the function plot.dendrogram(). A minimal example:
library(cluster)
Test <- data.frame(x1x = c(1:10), x2x = c(2:11), x3x = c(11:2))
TestDist <- daisy(data.frame(t(Test)))
TestAgnes <- agnes(TestDist)
plot(as.dendrogram(TestAgnes), horiz = TRUE)
If I run this in R 2.10.0, I get a horizontal dendrogram with the labels to the far right, and partly outside the plot area. This is highly inconvenient. Am I doing something wrong or is this a bug? Kind regards Joris
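One common work-around for labels clipped at the right edge of a horizontal dendrogram is simply to enlarge the right plot margin before plotting. A minimal sketch (this uses stats::hclust instead of cluster::agnes so it is self-contained, and the margin value 8 is an arbitrary illustration, not a recommendation from the thread):

```r
# Same toy data as in the question
Test <- data.frame(x1x = 1:10, x2x = 2:11, x3x = 11:2)

# hclust/dist stand in for agnes/daisy here to keep the sketch base-R only
hc <- hclust(dist(t(Test)))

op <- par(mar = c(5, 4, 4, 8))   # widen the right margin to hold the labels
plot(as.dendrogram(hc), horiz = TRUE)
par(op)                          # restore the previous margins
```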
Re: [R] Simple if else statement problem
Thanks Petr, this is actually shorter ;) Petr Pikal wrote: Hi r-help-boun...@r-project.org wrote on 13.11.2009 18:54:05: Ok Jim, it worked, thank you! It's funny because it worked with the first syntax in some cases... You can use another approach in this case:
P <- max(c(P1, P2))
Regards Petr anna_l wrote: Hello, I am getting an error with the following code:
if (P2 > P1)
+ {
+ P <- P2
+ } else
Error: unexpected 'else' in "else"
{
+ P <- P1
+ }
I checked the syntax, so I don't understand; I have other if else statements with the same syntax working. Thanks in advance
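The parse error above arises because, at the top level, R sees a complete `if` statement once the closing brace is reached, so an `else` on the next line is unexpected. Keeping `else` on the same line as the closing brace avoids it; a minimal sketch (the values of P1 and P2 are made up, and the condition `P2 > P1` is assumed from context):

```r
P1 <- 3
P2 <- 5

# `} else {` on one line keeps the statement open until the else branch
if (P2 > P1) {
  P <- P2
} else {
  P <- P1
}
P               # 5
max(P1, P2)     # same result, as Petr suggests
```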
[R] lapply() not converting columns to factors (no error message)
Dear List, I'm having a curious problem with lapply(). I've used it before to convert a subset of columns in my dataframe to factors, and it worked. But now, on re-running the identical code as before, it just doesn't convert the columns into factors at all. As far as I can see I've done nothing different, and it's strange that it shouldn't do the action. Has anybody come across this before? Any input on this strange issue much appreciated. Hope I haven't missed something obvious. Thanks a lot, Aditi (P.S. - I've tried converting columns one by one to factors this time, and that works:
P1L55 <- factor(P1L55)
levels(P1L55)
[1] "0" "1"
) Code:
prm <- read.table("P:\\. .csv", header = TRUE, sep = ",", ...)
prmdf <- data.frame(prm)
prmdf[2:13] <- lapply(prmdf[2:13], factor)  ## action performed, no error message
## I tried to pick random columns and check
levels(P1L55)
NULL
is.factor(P1L96)
FALSE
-- A Singh aditi.si...@bristol.ac.uk School of Biological Sciences University of Bristol
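For reference, the lapply() idiom itself does work on a data frame; a self-contained sketch with a hypothetical toy data frame standing in for the original CSV (column names borrowed from the post, values invented):

```r
# Toy stand-in for the real data; the original code converted columns 2:13
prmdf <- data.frame(id    = 1:4,
                    P1L55 = c(0, 1, 0, 1),
                    P1L96 = c(1, 1, 0, 0))

# Single-bracket indexing keeps prmdf a data frame on both sides,
# so lapply's list result is assigned back column-by-column
prmdf[2:3] <- lapply(prmdf[2:3], factor)

is.factor(prmdf$P1L55)   # TRUE
levels(prmdf$P1L96)      # "0" "1"
```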
Re: [R] Problems by saving Rprofile.site under vista
Hi Charles, I've already been running it as an administrator; this is why I don't understand it. Charles Annis, P.E. wrote: You may have to run R as Administrator (right-click, choose "Run as administrator") to make these kinds of changes. After you have things the way you like them, run R in the usual way by clicking on the icon. Charles Annis, P.E. charles.an...@statisticalengineering.com 561-352-9699 http://www.StatisticalEngineering.com -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of anna_l Sent: Friday, November 13, 2009 11:46 AM To: r-help@r-project.org Subject: [R] Problems by saving Rprofile.site under vista Hello, I am trying to save some changes I have made to Rprofile.site under Vista, and it doesn't let me save the file, saying that it can't create the file (Rprofile.site) and that I should check the file path or the file name.
[R] Conditional statement
Dear useRs, I wrote a function that simulates a stochastic model in discrete time. The problem is that the stochastic parameters should not be negative, and sometimes they happen to be. How can I arrange it so that when it draws a negative number, that number is replaced by zero in that time step? Here is the function:
stochastic_prost <- function(Fmean, Fsd, Smean, Ssd, f, s, n, time, out = FALSE, plot = TRUE) {
  nt <- rep(0, time)
  nt[1] <- n
  for (n in 2:time) {
    nt[n] <- 0.5 * rnorm(1, Fmean, Fsd) * rnorm(1, Smean, Ssd) * exp(-(f + s) * nt[n - 1]) * nt[n - 1]
  }
  if (out == TRUE) { print(data.frame(nt)) }
  if (plot == TRUE) {
    plot(1:time, nt, type = 'l', main = 'Simulation', ylab = 'Population', xlab = 'Generations')
  }
}
The 2 rnorm()'s should not be negative; when negative they should turn into zero. Thanks in advance, Rafael
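One direct way to get the requested behaviour is to wrap each draw in max(0, ...) (or pmax(0, ...) for whole vectors), which replaces any negative draw with zero. A minimal sketch with arbitrary illustrative parameter values:

```r
set.seed(1)  # for reproducibility of the sketch only

# Scalar draw, truncated at zero: a negative rnorm() result becomes 0
F_draw <- max(0, rnorm(1, mean = 0.01, sd = 1))

# Vectorised version of the same idea
draws <- pmax(rnorm(5, mean = 0, sd = 1), 0)
```

Inside the original loop, `rnorm(1, Fmean, Fsd)` would become `max(0, rnorm(1, Fmean, Fsd))`, and likewise for the second draw.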
Re: [R] ^ operator
R does not know that 1/3 is exactly 1/3. It is represented internally as 0.333...3, so certain mathematical facts, such as the existence of real roots of fractional integer powers, are opaque to R (since it is not a symbolic algebra system). Try searching for cube roots on R-search (for example): http://finzi.psych.upenn.edu/Rhelp08/2009-July/205006.html
# Further comments by Ted Harding
complex(real = -6.108576e-05)^(1/3)
[1] 0.01969171+0.03410703i
So the safest way to get the real cube root (as opposed to the complex roots) is to use sign(tmp)*abs(tmp)^(1/3):
sign(tmp)*abs(tmp)^(1/3)
[1] -0.03938341  0.03478442  0.03285672  0.08950802 -0.11696726
On Nov 16, 2009, at 5:24 AM, carol white wrote: Hi, I want to apply ^ operator to a vector but it is applied to some of the elements correctly and to some others, it generates NaN. Why is it not able to calculate -6.108576e-05^(1/3) even though it exists?
tmp
[1] -6.108576e-05  4.208762e-05  3.547092e-05  7.171101e-04 -1.600269e-03
tmp^(1/3)
[1]        NaN 0.03478442 0.03285672 0.08950802        NaN
-6.108576e-05^(1/3)
[1] -0.03938341
-- David Winsemius, MD Heritage Laboratories West Hartford, CT
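The sign()/abs() trick above can be wrapped as a small helper; a minimal sketch (the name `cbrt` is my own, not from the thread):

```r
# Real signed cube root: apply ^ only to the absolute value,
# then restore the sign
cbrt <- function(x) sign(x) * abs(x)^(1/3)

cbrt(c(-8, 27, 0))   # real roots, including for negative inputs
```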
Re: [R] lapply() not converting columns to factors (no error message)
Sorry, my file is at: http://www.4shared.com/file/153147281/a5c78386/Testvcomp10.html -- A Singh aditi.si...@bristol.ac.uk School of Biological Sciences University of Bristol
Re: [R] Odp: ^ operator
On 16-Nov-09 11:40:29, Petr PIKAL wrote: Hi AFAIK, this is an issue of operator precedence. r-help-boun...@r-project.org wrote on 16.11.2009 11:24:59: Not in the tmp^(1/3) case (see below), though precedence does matter in general: in R, ^ takes precedence over unary -, so, for example, in the expression -2^(1/3) the ^ is applied first, giving 2^(1/3), and then the - is applied, giving -(2^(1/3)). Hi, I want to apply ^ operator to a vector but it is applied to some of the elements correctly and to some others, it generates NaN. Why is it not able to calculate -6.108576e-05^(1/3) even though it exists? The typed expression does give a real result, precisely because ^ takes precedence over unary -: R evaluates -(6.108576e-05^(1/3)), not (-6.108576e-05)^(1/3).
tmp
[1] -6.108576e-05  4.208762e-05  3.547092e-05  7.171101e-04 -1.600269e-03
tmp^(1/3)
[1]        NaN 0.03478442 0.03285672 0.08950802        NaN
This computes (-a)^(1/3) which is not possible in real numbers. In this example, precedence is not the issue. tmp has already been defined, and contains numbers which are already stored as negative values, so - is no longer on the scene as an operator before ^ is applied. The NaN arises from x^(1/3) where x is negative. You have to use as.complex(tmp)^(1/3) to get a result. That gives complex results:
as.complex(tmp)^(1/3)
# [1] 0.01969171+0.03410703i 0.03478442+0.00000000i
# [3] 0.03285672+0.00000000i 0.08950802+0.00000000i
# [5] 0.05848363+0.10129662i
It is possible to work round the problem without using as.complex, which can introduce complications -- see above, and also:
x <- (-1)
x^(1/3)
# [1] NaN
as.complex(x)^(1/3)
# [1] 0.5+0.8660254i
as.complex(-1)^(1/2)
# [1] 0+1i
which you would not want if you are working throughout in real numbers. Although, in the mathematics of complex numbers, (-1)^(1/3) has three values, one of which is -1, R only returns a single (principal) value.
However, you would have to define a new operator, called say %^%:
"%^%" <- function(X, x) { sign(X) * (abs(X)^x) }
tmp <- c(-6.108576e-05, 4.208762e-05, 3.547092e-05, 7.171101e-04, -1.600269e-03)
tmp %^% (1/3)
# [1] -0.03938341  0.03478442  0.03285672  0.08950802 -0.11696726
The definition of %^% gives the real signed root of negative values by applying ^ only to the absolute value. But, if you hope to rely on this, note that if you apply to 'tmp' any function in which the ordinary ^ will be used on a negative number, you will still have the same problem. Note: trying to redefine ^ itself in the same way will not work, since invoking the result initiates an infinite recursion:
"^" <- function(X, x) { sign(X) * (abs(X)^x) }
## (This definition will be accepted by R)
tmp %^% (1/3)
# Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
It's not a clean situation, but I hope the above helps! Ted.
E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk Fax-to-email: +44 (0)870 094 0861 Date: 16-Nov-09 Time: 12:55:25 -- XFMail --
Re: [R] lapply() not converting columns to factors (no error message)
Works for me:
x <- read.csv(url("http://dc170.4shared.com/download/153147281/a5c78386/Testvcomp10.csv?tsid=20091116-075223-c3093ab0"))
names(x)
x[2:13] <- lapply(x[2:13], factor)
levels(x$P1L55)
[1] "0" "1"
is.factor(x$P1L96)
[1] TRUE
sessionInfo()
R version 2.10.0 (2009-10-26)
i386-apple-darwin9.8.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] lattice_0.17-26
loaded via a namespace (and not attached):
[1] grid_2.10.0 tools_2.10.0
On Mon, Nov 16, 2009 at 4:50 AM, A Singh aditi.si...@bristol.ac.uk wrote: Sorry, my file is at: http://www.4shared.com/file/153147281/a5c78386/Testvcomp10.html -- A Singh aditi.si...@bristol.ac.uk School of Biological Sciences University of Bristol
Re: [R] pairs
I stuck another 7 in one of the lines with a 2, and reasoned that we could deal with the desire for non-ordered pair counting by pasting min(x,y) to max(x,y):
dput(prmtx)
structure(c(2, 1, 3, 9, 5, 7, 7, 8, 1, 7, 6, 5, 6, 2, 2, 7), .Dim = c(4L, 4L))
prmtx
     [,1] [,2] [,3] [,4]
[1,]    2    5    1    6
[2,]    1    7    7    2
[3,]    3    7    6    2
[4,]    9    8    5    7
pair.str <- sapply(1:nrow(prmtx), function(z) apply(combn(prmtx[z,], 2), 2, function(x) paste(min(x[2], x[1]), max(x[2], x[1]), sep=".")))
The logic: sapply(1:nrow(prmtx), ...) just loops over the rows of the matrix. combn(prmtx[z,], 2) returns a two-row matrix with one combination per column. apply(combn(prmtx[z,], 2), 2, ...) -- since combn(, 2) returns a matrix that has two _rows_, I needed to loop over the columns. paste(min(x[2],x[1]), max(x[2],x[1]), sep=".") sticks the minimum of a pair in front of the max and separates them with a period, to prevent two-or-more-digit values from being non-unique. Then using table() and logical tests in an index for the desired multiple pairs:
tpair <- table(pair.str)
tpair
pair.str
1.2 1.5 1.6 1.7 2.3 2.5 2.6 2.7 3.6 3.7 5.6 5.7 5.8 5.9 6.7 7.7 7.8 7.9 8.9
  2   1   1   2   1   1   2   3   1   1   1   1   1   1   1   1   1   1   1
tpair[tpair > 1]
pair.str
1.2 1.7 2.6 2.7
  2   2   2   3
-- David. On Nov 16, 2009, at 7:02 AM, David Winsemius wrote: I'm not convinced it's right. In fact, I'm pretty sure the last step, taking only the first half of the list, is wrong. I also do not know if you have considered how you want to count situations like:
3 2 7 4 5 7 ...
7 3 8 6 1 2 9 2 ...
How many pairs of 2-7/7-2 would that represent? -- David On Nov 15, 2009, at 11:06 PM, cindy Guo wrote: Hi, David, The matrix has 20 columns. Thank you very much for your help. I think it's right, but it seems I need some time to figure it out. I am a green hand. There are so many functions here I never used before.
:) Cindy On Sun, Nov 15, 2009 at 5:19 PM, David Winsemius dwinsem...@comcast.net wrote: Assuming that the number of columns is 4, then consider this approach:
prs <- scan()
1: 2 5 1 6
5: 1 7 8 2
9: 3 7 6 2
13: 9 8 5 7
17:
Read 16 items
prmtx <- matrix(prs, 4, 4, byrow = TRUE)
# Now make copies of x.y and y.x
pair.str <- sapply(1:nrow(prmtx), function(z) c(apply(combn(prmtx[z,], 2), 2, function(x) paste(x[1], x[2], sep=".")), apply(combn(prmtx[z,], 2), 2, function(x) paste(x[2], x[1], sep="."))))
tpair <- table(pair.str)
# This then gives you a duplicated list
tpair[tpair > 1]
pair.str
1.2 2.1 2.6 2.7 6.2 7.2 7.8 8.7
  2   2   2   2   2   2   2   2
# So only take the first half of the pairs:
head(tpair[tpair > 1], sum(tpair > 1)/2)
pair.str
1.2 2.1 2.6 2.7
  2   2   2   2
-- David. On Nov 15, 2009, at 8:06 PM, David Winsemius wrote: I could of course be wrong, but have you yet specified the number of columns for this pairing exercise? On Nov 15, 2009, at 5:26 PM, cindy Guo wrote: Hi, All, I have an n by m matrix with each entry between 1 and 15000. I want to know the frequency of each pair in 1:15000 that occur together in rows. So for example, if the matrix is
2 5 1 6
1 7 8 2
3 7 6 2
9 8 5 7
Pair (2,6) (un-ordered) occurs together in rows 1 and 3. I want to return the value 2 for this pair as well as that for all pairs. Is there a fast way to do this avoiding loops? Loops take too long.
David Winsemius, MD Heritage Laboratories West Hartford, CT
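A compact restatement of the approach discussed in this thread, as a sketch: sorting each row before taking combinations makes (2,7) and (7,2) collapse to the same key, so no duplicate-then-halve step is needed. The matrix is the 4-by-4 example from the original question:

```r
# Example matrix from the thread
prmtx <- matrix(c(2, 5, 1, 6,
                  1, 7, 8, 2,
                  3, 7, 6, 2,
                  9, 8, 5, 7), nrow = 4, byrow = TRUE)

# One min.max key per unordered pair per row
pair.str <- unlist(lapply(seq_len(nrow(prmtx)), function(i)
  apply(combn(sort(prmtx[i, ]), 2), 2, paste, collapse = ".")))

tab <- table(pair.str)
tab[tab > 1]   # unordered pairs occurring in more than one row
```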
Re: [R] lapply() not converting columns to factors (no error message)
Oh, strange! I thought it might be a problem with the 'base' package installation, because the same thing's worked for me too before but won't now. I tried to reinstall it (base), but R says it's there already, which I expected it to be anyway. I don't quite know where the issue is. Very odd. --On 16 November 2009 04:59 -0800 Sundar Dorai-Raj sdorai...@gmail.com wrote: Works for me:
x <- read.csv(url("http://dc170.4shared.com/download/153147281/a5c78386/Testvcomp10.csv?tsid=20091116-075223-c3093ab0"))
names(x)
x[2:13] <- lapply(x[2:13], factor)
levels(x$P1L55)
[1] "0" "1"
is.factor(x$P1L96)
[1] TRUE
sessionInfo()
R version 2.10.0 (2009-10-26) ...
On Mon, Nov 16, 2009 at 4:50 AM, A Singh aditi.si...@bristol.ac.uk wrote: Sorry, my file is at: http://www.4shared.com/file/153147281/a5c78386/Testvcomp10.html -- A Singh aditi.si...@bristol.ac.uk School of Biological Sciences University of Bristol
Re: [R] survreg function in survival package
Thank you David. I don't think I can bypass the rweibull parameterization, since I use a uniform random variable to generate survival times having a Weibull distribution. Therefore, like you, I have not found any other solution to set the shape parameter. If I want to calculate the hazard, I will need both scale and shape parameters. Sorry, I just forgot to reply-all when replying to your email. Unfortunately, nobody else has replied yet, so I wonder if anybody else could be helpful. Cheers, --- On Sat, 11/14/09, David Winsemius dwinsem...@comcast.net wrote: From: David Winsemius dwinsem...@comcast.net Subject: Re: [R] survreg function in survival package To: carol white wht_...@yahoo.com Date: Saturday, November 14, 2009, 8:44 AM It appears from a look at the str() output from your survreg object that you did set the scale parameter; at least the $scale value is set to 1, which is not what happens when that procedure is employed without that explicit setting. Does that mean that the coefficients were the shape parameters? The help page for survreg.distributions {survival} says that the scale = 1/shape and that the intercept = log(scale).
?survreg.distributions
?survreg.object
And this is found in the survreg example:
# There are multiple ways to parameterize a Weibull distribution. The survreg
# function imbeds it in a general location-scale family, which is a
# different parameterization than the rweibull function, and often leads
# to confusion.
#   survreg's scale     = 1/(rweibull shape)
#   survreg's intercept = log(rweibull scale)
# For the log-likelihood all parameterizations lead to the same value.
y <- rweibull(1000, shape = 2, scale = 5)
survreg(Surv(y) ~ 1, dist = "weibull")
I find that a bit confusing because it would seem that the scale should not be a (pseudo-)random number. I was expecting to read that scale would be either pweibull(shape) or qweibull(shape). Guess I will need to go back to my textbooks when I have time, which sadly I do not have today.
Given that you are asking this off-list, I am sending it only to you, which is not the optimal method for this exchange. It means that neither one of us will get our confusion and questions addressed by more knowledgeable persons reading the r-help list. My suggestion is that you copy this to the list. --David On Nov 13, 2009, at 9:02 AM, carol white wrote: Thanks for your reply. Which parameter represents the baseline scale parameter? How is it possible to set the shape parameter for weibull in survreg? Many thanks --- On Fri, 11/13/09, David Winsemius dwinsem...@comcast.net wrote: From: David Winsemius dwinsem...@comcast.net Subject: Re: [R] survreg function in survival package To: carol white wht_...@yahoo.com Cc: r-h...@stat.math.ethz.ch Date: Friday, November 13, 2009, 3:56 AM On Nov 13, 2009, at 3:17 AM, carol white wrote: Hi, Is it normal to get intercept in the list of covariates in the output of the survreg function, with standard error, z, p.value etc.? Does it mean that intercept was fitted with the covariates? Does the Value column represent coefficients or something else? Don't you need a baseline scale parameter for the Weibull function? You didn't offer the structure of your dataframe, but if it is the standard ovarian set, then the rx coef is just the difference between the scale parameter of rx=2 from that of rx=1, and similarly for ecog.ps. You would not have an estimate for rx=1 and ecog.ps=1 if you were not given the Intercept coef. In the future it would be good manners to indicate what grad school you are taking classes at. --David Regards,
tmp <- survreg(Surv(futime, fustat) ~ ecog.ps + rx, ovarian, dist='weibull', scale=1)
summary(tmp)
Call: survreg(formula = Surv(futime, fustat) ~ ecog.ps + rx, data = ovarian, dist = "weibull", scale = 1)
              Value Std. Error      z        p
(Intercept)   6.962      1.322  5.267 1.39e-07
ecog.ps      -0.433      0.587 -0.738 4.61e-01
rx            0.582      0.587  0.991 3.22e-01
Scale fixed at 1
Weibull distribution
Loglik(model)= -97.2   Loglik(intercept only)= -98
Chisq= 1.67 on 2 degrees of freedom, p= 0.43
Number of Newton-Raphson Iterations: 4
n= 26
-- David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
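The parameterization relationship quoted from the survreg examples can be checked numerically. A minimal sketch (not from the thread; it assumes the survival package is available, and the sample size and seed are arbitrary): fit an intercept-only survreg model to rweibull() draws and recover the rweibull shape and scale from survreg's scale and intercept.

```r
library(survival)

set.seed(42)
y <- rweibull(10000, shape = 2, scale = 5)
fit <- survreg(Surv(y) ~ 1, dist = "weibull")

# survreg's scale = 1/(rweibull shape); survreg's intercept = log(rweibull scale)
rweibull_shape <- 1 / fit$scale
rweibull_scale <- exp(coef(fit)[["(Intercept)"]])
round(c(shape = rweibull_shape, scale = rweibull_scale), 2)  # close to shape=2, scale=5
```

With a large sample the recovered values sit close to the generating parameters, which is a quick way to convince yourself which direction the scale = 1/shape and intercept = log(scale) conversions run.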
Re: [R] Conditional statement
Generate the numbers, test for negatives, and then set the negatives to zero: set.seed(1); x <- rnorm(100, 5, 3); sum(x < 0) [1] 3; x[x < 0] <- 0; sum(x < 0) [1] 0 On Mon, Nov 16, 2009 at 7:43 AM, Rafael Moral rafa_moral2...@yahoo.com.br wrote: Dear useRs, I wrote a function that simulates a stochastic model in discrete time. The problem is that the stochastic parameters should not be negative, and sometimes they happen to be. How can I make it so that when it draws a negative number, that number is turned into zero in that time step? Here is the function: stochastic_prost <- function(Fmean, Fsd, Smean, Ssd, f, s, n, time, out=FALSE, plot=TRUE) { nt <- rep(0, time); nt[1] <- n; for(n in 2:time) { nt[n] <- 0.5*rnorm(1, Fmean, Fsd)*rnorm(1, Smean, Ssd)*exp(1)^(-(f+s)*nt[n-1])*nt[n-1] }; if(out==TRUE) {print(data.frame(nt))}; if(plot==TRUE) {plot(1:time, nt, type='l', main='Simulation', ylab='Population', xlab='Generations')} } The 2 rnorm()'s should not be negative; when negative they should turn into zero. Thanks in advance, Rafael -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
Re: [R] Conditional statement
On Nov 16, 2009, at 7:43 AM, Rafael Moral wrote: Dear useRs, I wrote a function that simulates a stochastic model in discrete time. The problem is that the stochastic parameters should not be negative, and sometimes they happen to be. How can I make it so that when it draws a negative number, that number is turned into zero in that time step? Here is the function: stochastic_prost <- function(Fmean, Fsd, Smean, Ssd, f, s, n, time, out=FALSE, plot=TRUE) { nt <- rep(0, time); nt[1] <- n; for(n in 2:time) { nt[n] <- 0.5*rnorm(1, Fmean, Fsd)*rnorm(1, Smean, Ssd)*exp(1)^(-(f+s)*nt[n-1])*nt[n-1] }; if(out==TRUE) {print(data.frame(nt))}; if(plot==TRUE) {plot(1:time, nt, type='l', main='Simulation', ylab='Population', xlab='Generations')} } The 2 rnorm()'s should not be negative; when negative they should turn into zero. ...*max(0, rnorm(1, Fmean, Fsd))*max(0, rnorm(1, Smean, Ssd))*... Thanks in advance, Rafael David Winsemius, MD Heritage Laboratories West Hartford, CT
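Both replies above amount to truncating each draw at zero. For whole vectors of draws, pmax() does this in one call; a small sketch with made-up mean and sd values (not the poster's parameters):

```r
set.seed(1)
draws <- rnorm(10, mean = 1, sd = 2)   # some draws will be negative
truncated <- pmax(0, draws)            # negative draws become exactly 0

any(draws < 0)        # TRUE for this seed
all(truncated >= 0)   # TRUE: no negatives remain
```

Inside the poster's loop, where one draw is taken at a time, max(0, rnorm(1, ...)) as David suggests does the same thing for a scalar.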
Re: [R] Odp: ^ operator
On 11/16/09, Ted Harding ted.hard...@manchester.ac.uk wrote: Not in this case (see below), though of course in general - takes precedence over ^, so, for example, in the expression -2^(1/3) the - is applied first, giving (-2); and then ^ is applied next, giving (-2)^(1/3). There is a work-round (see below). Hmm.. I may be doing something wrong, but from here it looks to be the opposite. -2^(1/3); -(2)^(1/3); -(2^(1/3)) [1] -1.2599 [1] -1.2599 [1] -1.2599 (-2)^(1/3) [1] NaN The results don't change when switching from the unary minus. 0-2^(1/3); 0-(2)^(1/3); 0-(2^(1/3)) [1] -1.2599 [1] -1.2599 [1] -1.2599 It seems to me that in this example ^ is applied first, and - second. There is also this fortune entry. fortune("unary") Thomas Lumley: The precedence of ^ is higher than that of unary minus. It may be surprising, [...] Hervé Pagès: No, it's not surprising. At least to me... In the country where I grew up, I've been teached that -x^2 means -(x^2) not (-x)^2. -- Thomas Lumley and Hervé Pagès (both explaining that operator precedence is working perfectly well) R-devel (January 2006) Liviu
Re: [R] lapply() not converting columns to factors (no error message)
Didn't you notice the difference between Sundar's code and yours? Sundar put the data.frame name before the column name, while you did not do so in your check step. -- DW On Nov 16, 2009, at 8:07 AM, A Singh wrote: Oh, strange! I thought it might be a problem with the 'base' package installation, because the same thing's worked for me too before but won't do now. I tried to reinstall it (base), but R says it's there already, which I expected it to be anyway. I don't quite know where the issue is. Very odd. --On 16 November 2009 04:59 -0800 Sundar Dorai-Raj sdorai...@gmail.com wrote: Works for me: x <- read.csv(url("http://dc170.4shared.com/download/153147281/a5c78386/Testvcomp10.csv?tsid=20091116-075223-c3093ab0")) names(x) x[2:13] <- lapply(x[2:13], factor) levels(x$P1L55) [1] "0" "1" is.factor(x$P1L96) [1] TRUE sessionInfo() R version 2.10.0 (2009-10-26) i386-apple-darwin9.8.0 locale: [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] lattice_0.17-26 loaded via a namespace (and not attached): [1] grid_2.10.0 tools_2.10.0 On Mon, Nov 16, 2009 at 4:50 AM, A Singh aditi.si...@bristol.ac.uk wrote: Sorry, my file is at: http://www.4shared.com/file/153147281/a5c78386/Testvcomp10.html -- A Singh aditi.si...@bristol.ac.uk School of Biological Sciences University of Bristol
David Winsemius, MD Heritage Laboratories West Hartford, CT
Re: [R] lapply() not converting columns to factors (no error message)
Could it be you have factor redefined in your workspace? Have you tried it in a clean directory, i.e. a directory where no .RData exists? On Mon, Nov 16, 2009 at 5:07 AM, A Singh aditi.si...@bristol.ac.uk wrote: Oh, strange! I thought it might be a problem with the 'base' package installation, because the same thing's worked for me too before but won't do now. I tried to reinstall it (base), but R says it's there already, which I expected it to be anyway. I don't quite know where the issue is. Very odd. --On 16 November 2009 04:59 -0800 Sundar Dorai-Raj sdorai...@gmail.com wrote: Works for me: x <- read.csv(url("http://dc170.4shared.com/download/153147281/a5c78386/Testvcomp10.csv?tsid=20091116-075223-c3093ab0")) names(x) x[2:13] <- lapply(x[2:13], factor) levels(x$P1L55) [1] "0" "1" is.factor(x$P1L96) [1] TRUE sessionInfo() R version 2.10.0 (2009-10-26) i386-apple-darwin9.8.0 locale: [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] lattice_0.17-26 loaded via a namespace (and not attached): [1] grid_2.10.0 tools_2.10.0 On Mon, Nov 16, 2009 at 4:50 AM, A Singh aditi.si...@bristol.ac.uk wrote: Sorry, my file is at: http://www.4shared.com/file/153147281/a5c78386/Testvcomp10.html -- A Singh aditi.si...@bristol.ac.uk School of Biological Sciences University of Bristol
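The idiom Sundar used, restated on a small made-up data frame (the column names here are illustrative, not the poster's file):

```r
df <- data.frame(id = 1:4,
                 g1 = c(0, 1, 0, 1),
                 g2 = c("a", "b", "a", "a"),
                 stringsAsFactors = FALSE)

# Convert a block of columns to factors in one step:
df[2:3] <- lapply(df[2:3], factor)

# The check must name the data frame, not a bare column name:
is.factor(df$g1)   # TRUE
levels(df$g1)      # "0" "1"
```

Checking a bare name like is.factor(g1) would instead look for an object g1 in the workspace, which is exactly the confusion David points out above.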
[R] Error on reading an excel file
Hello everybody, here is the code I use to read an Excel file containing two columns, one of dates, the other of prices: library(RODBC); z <- odbcConnectExcel("SPX_HistoricalData.xls"); datas <- sqlFetch(z, "Sheet1"); close(z) It works pretty well, but the only thing is that the data stop at row 7530 and I don't know why; datas is a data frame that contains 7531 rows, with the last two ones = NA... - Anna Lippel new in R so be careful, I should be asking a lot of questions! -- View this message in context: http://old.nabble.com/Error-on-reading-an-excel-file-tp26371750p26371750.html Sent from the R help mailing list archive at Nabble.com.
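Trailing all-NA rows like these can be dropped after the fetch. A sketch with a made-up data frame standing in for the sqlFetch() result (the column names and values are invented; the underlying cause may also be formatted-but-empty rows in the spreadsheet itself):

```r
# Stand-in for what sqlFetch() returned: real rows plus trailing all-NA rows.
datas <- data.frame(date  = c("2009-11-13", "2009-11-16", NA, NA),
                    price = c(1093.48, 1105.50, NA, NA))

# Keep only rows with at least one non-NA field:
clean <- datas[rowSums(!is.na(datas)) > 0, ]
nrow(clean)   # 2
```

complete.cases(datas) is a stricter alternative that drops any row containing even one NA.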
Re: [R] Odp: ^ operator
Hi, You forgot to put the parentheses in the way Petr told you: (-6.108576e-05)^(1/3), and the result is NaN. What do you want to preserve? Alain carol white wrote: but with complex, I get complex numbers for the first and last elements: (as.complex(tmp))^(1/3) [1] 0.01969170+0.03410703i 0.03478442+0.00000000i 0.03285672+0.00000000i [4] 0.08950802+0.00000000i 0.05848363+0.10129661i whereas for the first element, we get the following. Moreover, -6.108576e-05^(1/3) [1] -0.03938341 and -(6.108576e-05^(1/3)) [1] -0.03938341 and -((6.108576e-05)^(1/3)) [1] -0.03938341 give the same results. So using () doesn't preserve anything --- On Mon, 11/16/09, Petr PIKAL petr.pi...@precheza.cz wrote: From: Petr PIKAL petr.pi...@precheza.cz Subject: Odp: [R] ^ operator To: carol white wht_...@yahoo.com Cc: r-h...@stat.math.ethz.ch Date: Monday, November 16, 2009, 3:40 AM Hi AFAIK, this is an issue of the precedence of operators. r-help-boun...@r-project.org napsal dne 16.11.2009 11:24:59: Hi, I want to apply the ^ operator to a vector, but it is applied to some of the elements correctly and to some others it generates NaN. Why is it not able to calculate -6.108576e-05^(1/3) even though it exists? tmp [1] -6.108576e-05 4.208762e-05 3.547092e-05 7.171101e-04 -1.600269e-03 tmp^(1/3) [1] NaN 0.03478442 0.03285672 0.08950802 NaN This computes (-a)^(1/3) which is not possible in real numbers. You have to use as.complex(tmp)^(1/3) to get a result. -6.108576e-05^(1/3) [1] -0.03938341 this is actually -(6.108576e-05^(1/3)) Regards Petr
-- Alain Guillet Statistician and Computer Scientist SMCS - Institut de statistique - Université catholique de Louvain Bureau c.316 Voie du Roman Pays, 20 B-1348 Louvain-la-Neuve Belgium tel: +32 10 47 30 50
[R] on gsub (simple, but not to me!) syntax
Dear R users, my problem today deals with my ignorance of regular expressions, a matter I recently discovered. Consider the following foo <- c("V_7_101110_V", "V_7_101110_V", "V_9_101110_V", "V_9_101110_V", "V_9_s101110_V", "V_9_101110_V", "V_9_101110_V", "V_11_101110_V", "V_11_101110_V", "V_11_101110_V", "V_11_101110_V", "V_11_101110_V", "V_17_101110_V", "V_17_101110_V") what I'm trying to obtain is to add a zero in front of numbers below 10, as in c("V_07_101110_V", "V_07_101110_V", "V_09_101110_V", "V_09_101110_V", "V_09_101110_V", "V_09_101110_V", "V_09_101110_V", "V_11_101110_V", "V_11_101110_V", "V_11_101110_V", "V_11_101110_V", "V_11_101110_V", "V_17_101110_V", "V_17_101110_V") I'm able to do this in the emacs buffer through query-replace-regexp C-M-% search for V_\(.\)_ and substitute with V_0\1_ but I have no idea how to do it with gsub within R, and the help is quite complicated to understand (at least to me, at this moment in time). I can search the vector through grep('V_._', foo) but I always get errors either on gsub('V_\(.\)_', 'V_0\1_', foo) or I get not what I'm looking for on gsub('V_._', 'V_0._', foo) gsub('V_._', 'V_0\1_', foo) Thanks in advance -- Ottorino-Luca Pantani, Università di Firenze Dip. Scienza del Suolo e Nutrizione della Pianta P.zle Cascine 28 50144 Firenze Italia Ubuntu 8.04.3 LTS -- GNU Emacs 23.0.60.1 (x86_64-pc-linux-gnu, GTK+ Version 2.12.9) ESS version 5.5 -- R 2.10.0
Re: [R] Odp: ^ operator
Hi r-help-boun...@r-project.org napsal dne 16.11.2009 13:27:03: but with complex, I get complex numbers for the first and last elements: (as.complex(tmp))^(1/3) [1] 0.01969170+0.03410703i 0.03478442+0.00000000i 0.03285672+0.00000000i [4] 0.08950802+0.00000000i 0.05848363+0.10129661i And that is the right answer whereas for the first element, we get the following. Moreover, -6.108576e-05^(1/3) [1] -0.03938341 and -(6.108576e-05^(1/3)) [1] -0.03938341 and -((6.108576e-05)^(1/3)) [1] -0.03938341 No. With all the constructions above you compute a cube root of a ***positive*** number and then you put a *-* sign before the result, hence the same result. Try instead to take a cube root of a negative number. (-6.108576e-05)^(1/3) [1] NaN This is exactly what you do by the first call. Beware also that 1/3 is not exactly representable in binary arithmetic, so you do not actually compute a cube root but some root which is quite near to the cube root. (1000^(1/3))-10 [1] -1.776357e-15 If you want a cube root and have negative numbers, you probably need something like sign(tmp) * abs(tmp)^(1/3) give the same results. so using () doesn't preserve anything You need to use parentheses in the correct places. To see the precedence of operators, see ?Syntax Regards Petr --- On Mon, 11/16/09, Petr PIKAL petr.pi...@precheza.cz wrote: From: Petr PIKAL petr.pi...@precheza.cz Subject: Odp: [R] ^ operator To: carol white wht_...@yahoo.com Cc: r-h...@stat.math.ethz.ch Date: Monday, November 16, 2009, 3:40 AM Hi AFAIK, this is an issue of the precedence of operators. r-help-boun...@r-project.org napsal dne 16.11.2009 11:24:59: Hi, I want to apply the ^ operator to a vector, but it is applied to some of the elements correctly and to some others it generates NaN. Why is it not able to calculate -6.108576e-05^(1/3) even though it exists? tmp [1] -6.108576e-05 4.208762e-05 3.547092e-05 7.171101e-04 -1.600269e-03 tmp^(1/3) [1] NaN 0.03478442 0.03285672 0.08950802 NaN This computes (-a)^(1/3) which is not possible in real numbers.
You have to use as.complex(tmp)^(1/3) to get a result. -6.108576e-05^(1/3) [1] -0.03938341 this is actually -(6.108576e-05^(1/3)) Regards Petr
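Petr's sign(tmp) * abs(tmp)^(1/3) suggestion, wrapped as a small helper and applied to the thread's vector:

```r
tmp <- c(-6.108576e-05, 4.208762e-05, 3.547092e-05, 7.171101e-04, -1.600269e-03)

# Real-valued cube root: plain ^ returns NaN for a negative base,
# so take the root of the absolute value and restore the sign.
cuberoot <- function(x) sign(x) * abs(x)^(1/3)

cuberoot(tmp)   # first element is about -0.03938341; no NaNs
```

This stays entirely in the reals, unlike as.complex(tmp)^(1/3), which returns the principal complex root (e.g. 0.5+0.866i for -1 rather than -1).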
Re: [R] Odp: ^ operator
Hi r-help-boun...@r-project.org napsal dne 16.11.2009 13:55:30: On 16-Nov-09 11:40:29, Petr PIKAL wrote: Hi AFAIK, this is an issue of the precedence of operators. r-help-boun...@r-project.org napsal dne 16.11.2009 11:24:59: Not in this case (see below), though of course in general - takes precedence over ^, so, for example, in the expression -2^(1/3) the - is applied first, giving (-2); and then ^ is applied next, giving (-2)^(1/3). There is a work-round (see below). Are you sure? I get -2^(1/3) [1] -1.259921 (-2)^(1/3) [1] NaN 2^(1/3) [1] 1.259921 So ^ is applied first and then the result is negated. See ?Syntax. I agree with what you write below, though. Regards Petr Hi, I want to apply the ^ operator to a vector, but it is applied to some of the elements correctly and to some others it generates NaN. Why is it not able to calculate -6.108576e-05^(1/3) even though it exists? It only exists (in the real domain) if ^ takes precedence over - which (in R) it does not! tmp [1] -6.108576e-05 4.208762e-05 3.547092e-05 7.171101e-04 -1.600269e-03 tmp^(1/3) [1] NaN 0.03478442 0.03285672 0.08950802 NaN This computes (-a)^(1/3) which is not possible in real numbers. In this example, that is not accurate. tmp has already been defined, and contains numbers which are already stored as negative numbers, so - is no longer on the scene as an operator before ^ is applied; the issue of precedence of - over ^ is no longer present. The NaN arises from x^(1/3) where x is negative. You have to use as.complex(tmp)^(1/3) to get a result.
-6.108576e-05^(1/3) [1] -0.03938341 This is not the result I get: as.complex(tmp)^(1/3) # [1] 0.01969171+0.03410703i 0.03478442+0.00000000i # [3] 0.03285672+0.00000000i 0.08950802+0.00000000i # [5] 0.05848363+0.10129662i this is actually -(6.108576e-05^(1/3)) Regards Petr It is possible to work round the problem without using as.complex, which can introduce complications -- see above, and also: x <- (-1) x^(1/3) # [1] NaN as.complex(x)^(1/3) # [1] 0.5+0.8660254i as.complex(-1)^(1/2) # [1] 0+1i which you would not want if you are working throughout in real numbers (you would want the result -1 instead). Although, in the mathematics of complex numbers, (-1)^(1/3) has three values, one of which is -1, R only returns a single value. However, you would have to define a new operator, called say %^%: "%^%" <- function(X,x){sign(X)*(abs(X)^x)} tmp <- c(-6.108576e-05, 4.208762e-05, 3.547092e-05, 7.171101e-04, -1.600269e-03) tmp%^%(1/3) # [1] -0.03938341 0.03478442 0.03285672 0.08950802 -0.11696726 The definition of %^% forces ^ to take precedence over -, by in effect removing - from the scene until ^ has done its work. But, if you hope to rely on this, note that if you apply to 'tmp' any function in which the ordinary ^ will be used on a negative number, you will still have the same problem. Note: Trying to redefine ^ will not work, since invoking the result initiates an infinite recursion: "^" <- function(X,x){sign(X)*(abs(X)^x)} ## (This definition will be accepted by R) tmp%^%(1/3) # Error: evaluation nested too deeply: infinite recursion / # options(expressions=)? It's not a clean situation, but I hope the above helps! Ted. E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk Fax-to-email: +44 (0)870 094 0861 Date: 16-Nov-09 Time: 12:55:25 -- XFMail --
Re: [R] Odp: ^ operator
On 16-Nov-09 13:13:27, Liviu Andronic wrote: On 11/16/09, Ted Harding ted.hard...@manchester.ac.uk wrote: Not in this case (see below), though of course in general - takes precedence over ^, so, for example, in the expression -2^(1/3) the - is applied first, giving (-2); and then ^ is applied next, giving (-2)^(1/3). There is a work-round (see below). Hmm.. I may be doing something wrong, but from here it looks to be the opposite. -2^(1/3); -(2)^(1/3); -(2^(1/3)); [1] -1.2599 [1] -1.2599 [1] -1.2599 (-2)^(1/3) [1] NaN The results don't change when switching from the unary minus. 0-2^(1/3); 0-(2)^(1/3); 0-(2^(1/3)); [1] -1.2599 [1] -1.2599 [1] -1.2599 Correct!!! I was inadvertently put on the wrong foot by Petr Pikal's comment about precedence, and as a result what I wrote about the precedence of ^ relative to - was on the wrong foot throughout, and should be ignored. My apologies for any confusion this may have caused to anybody. In any case, this is not relevant to Carol White's query about taking the cube root (or indeed any fractional power) of a negative number. This can only be done (as Carol intended it) by using the form sign(x)*(abs(x)^power). As I tried to point out, there is a distinction between an expression which the user may enter as x <- -1.234, and then x^(1/3), expecting -(1.234^(1/3)), and the cube root of the negative number x. Ted. It seems to me that in this example ^ is applied first, and - second. There is also this fortune entry. fortune("unary") Thomas Lumley: The precedence of ^ is higher than that of unary minus. It may be surprising, [...] Hervé Pagès: No, it's not surprising. At least to me... In the country where I grew up, I've been teached that -x^2 means -(x^2) not (-x)^2.
-- Thomas Lumley and Hervé Pagès (both explaining that operator precedence is working perfectly well) R-devel (January 2006) Liviu E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk Fax-to-email: +44 (0)870 094 0861 Date: 16-Nov-09 Time: 13:40:06 -- XFMail --
Re: [R] lapply() not converting columns to factors (no error message)
Oh yes. Did notice that now. Thanks for pointing that out. I was a bit concerned because that's a crucial step for running further lmer models, and that isn't working either, based on this factoring of columns. Will hopefully be able to weed it out. Am sorry if this wasted a bit of time. I realized that summary(prmdf) gives me what I need. Thanks a lot, Aditi --On 16 November 2009 08:17 -0500 David Winsemius dwinsem...@comcast.net wrote: Didn't you notice the difference between Sundar's code and yours? Sundar put the data.frame name before the column name, while you did not do so in your check step. -- DW On Nov 16, 2009, at 8:07 AM, A Singh wrote: Oh, strange! I thought it might be a problem with the 'base' package installation, because the same thing's worked for me too before but won't do now. I tried to reinstall it (base), but R says it's there already, which I expected it to be anyway. I don't quite know where the issue is. Very odd. --On 16 November 2009 04:59 -0800 Sundar Dorai-Raj sdorai...@gmail.com wrote: Works for me: x <- read.csv(url("http://dc170.4shared.com/download/153147281/a5c78386/Testvcomp10.csv?tsid=20091116-075223-c3093ab0")) names(x) x[2:13] <- lapply(x[2:13], factor) levels(x$P1L55) [1] "0" "1" is.factor(x$P1L96) [1] TRUE sessionInfo() R version 2.10.0 (2009-10-26) i386-apple-darwin9.8.0 locale: [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] lattice_0.17-26 loaded via a namespace (and not attached): [1] grid_2.10.0 tools_2.10.0 On Mon, Nov 16, 2009 at 4:50 AM, A Singh aditi.si...@bristol.ac.uk wrote: Sorry, my file is at: http://www.4shared.com/file/153147281/a5c78386/Testvcomp10.html -- A Singh aditi.si...@bristol.ac.uk School of Biological Sciences University of Bristol David Winsemius, MD Heritage Laboratories West Hartford, CT
Re: [R] on gsub (simple, but not to me!) syntax
On 11/16/2009 8:21 AM, Ottorino-Luca Pantani wrote: Dear R users, my problem today deals with my ignorance of regular expressions, a matter I recently discovered. You were close. First, gsub by default doesn't need escapes before the parens. (There are lots of different conventions for regular expressions, unfortunately.) So the Emacs regular expression V_\(.\)_ is entered as V_(.)_ in the default version of gsub(). Second, to enter a backslash into a string, you need to escape it. So the replacement pattern V_0\1_ is entered as V_0\\1_. So gsub("V_(.)_", "V_0\\1_", foo) should give you what you want. Duncan Murdoch Consider the following foo <- c("V_7_101110_V", "V_7_101110_V", "V_9_101110_V", "V_9_101110_V", "V_9_s101110_V", "V_9_101110_V", "V_9_101110_V", "V_11_101110_V", "V_11_101110_V", "V_11_101110_V", "V_11_101110_V", "V_11_101110_V", "V_17_101110_V", "V_17_101110_V") what I'm trying to obtain is to add a zero in front of numbers below 10, as in c("V_07_101110_V", "V_07_101110_V", "V_09_101110_V", "V_09_101110_V", "V_09_101110_V", "V_09_101110_V", "V_09_101110_V", "V_11_101110_V", "V_11_101110_V", "V_11_101110_V", "V_11_101110_V", "V_11_101110_V", "V_17_101110_V", "V_17_101110_V") I'm able to do this in the emacs buffer through query-replace-regexp C-M-% search for V_\(.\)_ and substitute with V_0\1_ but I have no idea how to do it with gsub within R, and the help is quite complicated to understand (at least to me, at this moment in time). I can search the vector through grep('V_._', foo) but I always get errors either on gsub('V_\(.\)_', 'V_0\1_', foo) or I get not what I'm looking for on gsub('V_._', 'V_0._', foo) gsub('V_._', 'V_0\1_', foo) Thanks in advance
Re: [R] on gsub (simple, but not to me!) syntax
On Nov 16, 2009, at 8:21 AM, Ottorino-Luca Pantani wrote: Dear R users, my problem today deals with my ignorance of regular expressions, a matter I recently discovered. Consider the following foo <- c("V_7_101110_V", "V_7_101110_V", "V_9_101110_V", "V_9_101110_V", "V_9_s101110_V", "V_9_101110_V", "V_9_101110_V", "V_11_101110_V", "V_11_101110_V", "V_11_101110_V", "V_11_101110_V", "V_11_101110_V", "V_17_101110_V", "V_17_101110_V") what I'm trying to obtain is to add a zero in front of numbers below 10, as in c("V_07_101110_V", "V_07_101110_V", "V_09_101110_V", "V_09_101110_V", "V_09_101110_V", "V_09_101110_V", "V_09_101110_V", "V_11_101110_V", "V_11_101110_V", "V_11_101110_V", "V_11_101110_V", "V_11_101110_V", "V_17_101110_V", "V_17_101110_V") Any of these (the need for doubling of the \\ for the back-reference seems to be the main issue): gsub("_([[:digit:]])_.", "_0\\1_", foo) [1] V_07_01110_V V_07_01110_V V_09_01110_V V_09_01110_V V_09_101110_V [6] V_09_01110_V V_09_01110_V V_11_101110_V V_11_101110_V V_11_101110_V [11] V_11_101110_V V_11_101110_V V_17_101110_V V_17_101110_V gsub("_(\\d)_.", "_0\\1_", foo) [1] V_07_01110_V V_07_01110_V V_09_01110_V V_09_01110_V V_09_101110_V [6] V_09_01110_V V_09_01110_V V_11_101110_V V_11_101110_V V_11_101110_V [11] V_11_101110_V V_11_101110_V V_17_101110_V V_17_101110_V gsub("V_(.)_", "V_0\\1_", foo) [1] V_07_101110_V V_07_101110_V V_09_101110_V V_09_101110_V V_09_s101110_V [6] V_09_101110_V V_09_101110_V V_11_101110_V V_11_101110_V V_11_101110_V [11] V_11_101110_V V_11_101110_V V_17_101110_V V_17_101110_V I'm able to do this in the emacs buffer through query-replace-regexp C-M-% search for V_\(.\)_ and substitute with V_0\1_ but I have no idea how to do it with gsub within R, and the help is quite complicated to understand (at least to me, at this moment in time). I can search the vector through grep('V_._', foo) but I always get errors either on gsub('V_\(.\)_', 'V_0\1_', foo) or I get not what I'm looking for on gsub('V_._', 'V_0._', foo) gsub('V_._', 'V_0\1_', foo) Thanks in advance --
Ottorino-Luca Pantani, Università di Firenze Dip. Scienza del Suolo e Nutrizione della Pianta P.zle Cascine 28 50144 Firenze Italia Ubuntu 8.04.3 LTS -- GNU Emacs 23.0.60.1 (x86_64-pc-linux-gnu, GTK+ Version 2.12.9) ESS version 5.5 -- R 2.10.0 David Winsemius, MD Heritage Laboratories West Hartford, CT
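An alternative to back-reference replacement is to pull the number out, zero-pad it with sprintf(), and rebuild the string. A sketch on a shortened version of foo (the two-step rebuild is my own illustration, not from the thread):

```r
foo <- c("V_7_101110_V", "V_9_101110_V", "V_11_101110_V", "V_17_101110_V")

# Extract the first numeric field, pad it to width 2, and reassemble:
num    <- as.integer(sub("^V_([0-9]+)_.*$", "\\1", foo))
rest   <- sub("^V_[0-9]+_", "", foo)
result <- sprintf("V_%02d_%s", num, rest)
result   # "V_07_101110_V" "V_09_101110_V" "V_11_101110_V" "V_17_101110_V"
```

Unlike a single-character pattern like V_(.)_, this handles one- and two-digit fields uniformly: %02d pads 7 to 07 but leaves 11 and 17 unchanged.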
Re: [R] Replace positive with log and zero or negative with 0
David Winsemius wrote: On Nov 15, 2009, at 10:18 AM, rkevinbur...@charter.net wrote: This is a very simple question, but I couldn't form a site search query that would return a reasonable result set. Say I have a vector:

x <- c(0, 2, 3, 4, 5, -1, -2)

I want to replace all of the values in 'x' with the log of x. Naturally this runs into problems since some of the values are negative or zero. So how can I replace all of the positive elements of x with log(x) and the rest with zero?

x <- c(0, 2, 3, 4, 5, -1, -2)
x <- ifelse(x > 0, log(x), 0)
Warning message:
In log(x) : NaNs produced
x
[1] 0.0000000 0.6931472 1.0986123 1.3862944 1.6094379 0.0000000 0.0000000

The warning is harmless, as you can see, but if you wanted to avoid it, then:

x[x <= 0] <- 0; x[x > 0] <- log(x[x > 0])

In the second command, you need to have the logical test on both sides to avoid replacement out of synchrony. Here is one more way, somewhat less transparent, motivated by the examples on the ?ifelse page:

x <- log(ifelse(x > 0, x, 1))

-Peter Ehlers -- David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Replace positive with log and zero or negative with 0
On Nov 16, 2009, at 8:55 AM, Peter Ehlers wrote: David Winsemius wrote: On Nov 15, 2009, at 10:18 AM, rkevinbur...@charter.net wrote: This is a very simple question, but I couldn't form a site search query that would return a reasonable result set. Say I have a vector:

x <- c(0, 2, 3, 4, 5, -1, -2)

I want to replace all of the values in 'x' with the log of x. Naturally this runs into problems since some of the values are negative or zero. So how can I replace all of the positive elements of x with log(x) and the rest with zero?

x <- c(0, 2, 3, 4, 5, -1, -2)
x <- ifelse(x > 0, log(x), 0)
Warning message:
In log(x) : NaNs produced
x
[1] 0.0000000 0.6931472 1.0986123 1.3862944 1.6094379 0.0000000 0.0000000

The warning is harmless, as you can see, but if you wanted to avoid it, then:

x[x <= 0] <- 0; x[x > 0] <- log(x[x > 0])

In the second command, you need to have the logical test on both sides to avoid replacement out of synchrony. Here is one more way, somewhat less transparent, motivated by the examples on the ?ifelse page:

x <- log(ifelse(x > 0, x, 1))

Here's yet another, motivated by the above:

log((x <= 0) + (x > 0) * x)
[1] 0.0000000 0.6931472 1.0986123 1.3862944 1.6094379 0.0000000 0.0000000

-Peter Ehlers -- David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
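The four approaches in this thread can be cross-checked against each other; a small sketch (all should agree element-wise):

```r
x <- c(0, 2, 3, 4, 5, -1, -2)
v1 <- suppressWarnings(ifelse(x > 0, log(x), 0))  # ifelse, warning silenced
v2 <- x
v2[v2 <= 0] <- 0                                  # zero out non-positives first
v2[v2 > 0] <- log(v2[v2 > 0])                     # then log the positives
v3 <- log(ifelse(x > 0, x, 1))                    # log(1) == 0 fills the rest
v4 <- log((x <= 0) + (x > 0) * x)                 # arithmetic on logicals
stopifnot(identical(v1, v2), identical(v1, v3), identical(v1, v4))
```

The indexed-replacement form (v2) is the only one that never evaluates log() on a non-positive value, which is why it is the one that avoids the warning.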
[R] R: phase determination
Hi again: Any thoughts on the following? I'm trying to determine the phase of irregularly sampled data. Is there any particular reason why both spec.pgram and spec.ls return phase = NULL for vectors? Thank you. Lisandro Lisandro Benedetti-Cecchi Associate Professor in Ecology Department of Biology - University of Pisa Via Derna 1, 56126 Pisa, Italy Office: +39 050 2211413 Fax: +39 050 2211410 e-mail: lbenede...@biologia.unipi.it http://www.discat.unipi.it/BiolMar/people/LBC/LBC.htm http://www.unipi.it [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] on gsub (simple, but not to me!) syntax
Duncan Murdoch wrote: On 11/16/2009 8:21 AM, Ottorino-Luca Pantani wrote: Dear R users, my problem today deals with my ignorance of regular expressions, a matter I only recently discovered. You were close. First, gsub() by default doesn't need escapes before the parens. (There are lots of different conventions for regular expressions, unfortunately.) So the Emacs regular expression V_\(.\)_ is entered as V_(.)_ in the default version of gsub(). Second, to enter a backslash into a string, you need to escape it. So the replacement pattern V_0\1_ is entered as "V_0\\1_". So

gsub("V_(.)_", "V_0\\1_", foo)

should give you what you want. Actually, guessing from the form of the input, sub() is more appropriate, though the performance gain seems inessential (~3%). vQ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
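A quick check of Duncan's suggestion on a shortened version of the vector (values taken from the original post):

```r
foo <- c("V_7_101110_V", "V_9_s101110_V", "V_11_101110_V", "V_17_101110_V")
gsub("V_(.)_", "V_0\\1_", foo)
# [1] "V_07_101110_V"  "V_09_s101110_V" "V_11_101110_V"  "V_17_101110_V"
```

Single-digit labels are padded; the two-digit labels contain no `V_<one character>_` match and are left untouched, which is exactly the desired behaviour.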
Re: [R] lapply() not converting columns to factors (no error message)
A Singh wrote: Dear List, I'm having a curious problem with lapply(). I've used it before to convert a subset of columns in my data frame to factors, and it worked. But now, on re-running the identical code, it just doesn't convert the columns into factors at all. As far as I can see I've done nothing different, and it's strange that it shouldn't do the action. Has anybody come across this before? Any input on this strange issue is much appreciated. Hope I haven't missed something obvious. Thanks a lot, Aditi (P.S. I've tried converting columns one by one to factors this time, and that works:

P1L55 <- factor(P1L55)
levels(P1L55)
[1] "0" "1"

Code:

prm <- read.table("P:\\. .csv", header = T, sep = ",", ...)
prmdf <- data.frame(prm)
prmdf[2:13] <- lapply(prmdf[2:13], factor)  ## action performed, no error message

## I tried to pick random columns and check
levels(P1L55)
NULL
is.factor(P1L96)
FALSE

Make sure that you are looking in the same object that you changed. E.g.

attach(prmdf)
prmdf[2:13] <- lapply(prmdf[2:13], factor)
levels(P1L55)

is not going to work;

levels(prmdf$P1L55)

should, or attaching _after_ the change. Also, make sure that you don't have P1L55 et al. sitting in the global environment. -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - (p.dalga...@biostat.ku.dk) FAX: (+45) 35327907 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
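Peter's point can be reproduced in a few lines; the data frame and column names below are made up for illustration:

```r
prmdf <- data.frame(id = 1:4,
                    P1L55 = c(0, 1, 0, 1),
                    P1L96 = c(1, 1, 0, 0))
attach(prmdf)                           # puts *copies* on the search path
prmdf[2:3] <- lapply(prmdf[2:3], factor)
attached_copy <- is.factor(P1L55)       # FALSE: the stale attached copy
real_object <- is.factor(prmdf$P1L55)   # TRUE: the object actually changed
detach(prmdf)
```

The conversion did happen; the bare name P1L55 simply resolves to the copy that was attached before the change, not to the column inside prmdf.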
[R] Cluster analysis: hclust manipulation possible?
I am doing cluster analysis [hclust(Dist, method = "average")] on data that potentially contains redundant objects. As expected, the inclusion of redundant objects affects the clustering result, i.e., the data a1 = a2 = a3, b, c, d, e1 = e2 is likely to cluster differently from the same data without the redundancy, i.e., a1, b, c, d, e1. This is apparent when the outcome is visualized as a dendrogram. Now, it seems that the clustering result for which the redundancy has been eliminated is more robust for the present assignment than that of the redundant data. Naturally, there is no problem in the elimination: just exclude the redundant objects from Dist. However, it would be very convenient to be able to include the redundant objects in the *dendrogram* by attaching them as 0-level branches to the subtrees, i.e.: 1.0--- 0.5___|___|_.. 0.0.._|_..|..|..|.._|_ |.|.|.|..|..|.|...|... ...a1a2a3.b..c..d.e1.e2... instead of 1.0--- 0.5___|___|_.. 0.0...|...|..|..|...|. ..a1..b..c..d..e1. The question: Can this be accomplished in the *dendrogram plot* by manipulating the resulting hclust data structure or by some other means, and if yes, how? Jopi Harri __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] violin - like plots for bivariate data
I'm attempting to produce something like a violin plot to display how y changes with x for members of different groups (My specific case is how floral area changes over time for several species of plants). I've looked at panel.violin (in lattice), which makes nice violin plots, but is really set up to work on a single variable - the area trace represents the frequency of each value of x for each group. I'm wondering if anyone is aware of a function to do this? I can imagine how to accomplish this using polygon, but I will admit I'm not sure what the best way would be to smooth the data. That said, I would prefer not to reinvent the wheel! Thanks in advance for any wisdom you can share! Eric -- View this message in context: http://old.nabble.com/violin---like-plots-for-bivariate-data-tp26373071p26373071.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] violin - like plots for bivariate data
Try: RSiteSearch("violin plot") On Mon, Nov 16, 2009 at 9:40 AM, Eric Nord ericn...@psu.edu wrote: I'm attempting to produce something like a violin plot to display how y changes with x for members of different groups (My specific case is how floral area changes over time for several species of plants). I've looked at panel.violin (in lattice), which makes nice violin plots, but is really set up to work on a single variable - the area trace represents the frequency of each value of x for each group. I'm wondering if anyone is aware of a function to do this? I can imagine how to accomplish this using polygon, but I will admit I'm not sure what the best way would be to smooth the data. That said, I would prefer not to reinvent the wheel! Thanks in advance for any wisdom you can share! Eric -- View this message in context: http://old.nabble.com/violin---like-plots-for-bivariate-data-tp26373071p26373071.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] on gsub (simple, but not to me!) syntax
Duncan Murdoch wrote: On 11/16/2009 8:21 AM, Ottorino-Luca Pantani wrote: Dear R users, my problem today deals with my ignorance of regular expressions, a matter I only recently discovered. You were close. First, gsub() by default doesn't need escapes before the parens. (There are lots of different conventions for regular expressions, unfortunately.) So the Emacs regular expression V_\(.\)_ is entered as V_(.)_ in the default version of gsub(). Second, to enter a backslash into a string, you need to escape it. So the replacement pattern V_0\1_ is entered as "V_0\\1_". So

gsub("V_(.)_", "V_0\\1_", foo)

should give you what you want. Duncan Murdoch Any of these (the need to double the \\ for the back-reference seems to be the main issue):

gsub("_([[:digit:]])_.", "_0\\1_", foo)
gsub("_(\\d)_.", "_0\\1_", foo)
gsub("V_(.)_", "V_0\\1_", foo)

David Winsemius, MD Heritage Laboratories West Hartford, CT I suspected something about the double escape. Thanks to you all. R is a wonderful software and R-help is always a great place to visit !!! 8rino __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Weighted descriptives by levels of another variables
Thanks! Using the plyr package and the approach you outlined seems to work well for relatively simple functions (like wtd.mean), but so far I haven't had much success in using it with more complex descriptive functions like describe {Hmisc}. I'll take a look later, though, and see if I can figure out why. At any rate, ddply() looks like it will simplify writing a function that will allow for weighting data and subdividing it, but still give comprehensive summary statistics (i.e. not just the mean or quantiles, but all in one). I'll post it to the list once I have the time to write it up. I also took a stab at using the svyby function in the survey package, but received the following error message when I input:

svyby(cbind(educ, age), female, svynlsy, svymean)
Error in `[.survey.design2`(design, byfactor %in% byfactor[i], ) :
  (subscript) logical subscript too long

__ In addition to using the survey package (and the svyby function), I've found that many of the 'weighted' functions, such as wtd.mean, work well with the plyr package. For example:

wtdmean <- function(df) wtd.mean(df$obese, df$sampwt)
ddply(mydata, ~ cut2(age, c(2, 6, 12, 16)), 'wtdmean')

hth, david freedman Andrew Miles-2 wrote: I've noticed that R has a number of very useful functions for obtaining descriptive statistics on groups of variables, including summary {stats}, describe {Hmisc}, and describe {psych}, but none that I have found is able to provide weighted descriptives of subsets of a data set (e.g. descriptives for both males and females for age, where accurate results require use of sampling weights). Does anybody know of a function that does this? What I've looked at already: I have looked at describe.by {psych}, which will give descriptives by levels of another variable (e.g. mean ages of males and females) but does not accept sample weights. I have also looked at describe {Hmisc}, which allows for weights but has no functionality for subdivision.
I tried using a by() function with describe {Hmisc}:

by(cbind(my, variables, here), division.variable, describe, weights = weight.variable)

but found that this returns an error message stating that the variables to be described and the weights variable are not the same length:

Error in describe.vector(xx, nam[i], exclude.missing = exclude.missing, :
  length of weights must equal length of x
In addition: Warning message:
In present & !is.na(weights) : longer object length is not a multiple of shorter object length

This comes about because the by() function passes a subset of the variables to be described down to describe(), but not a subset of the weights variable. describe() then searches whatever data set is attached in order to find the weights variable, but this is in its original (i.e. not subsetted) form. Here is an example using the ChickWeight dataset that comes in the datasets package.

data(ChickWeight)
attach(ChickWeight)
library(Hmisc)
# this gives descriptive data on the variables Time and Chick by levels of Diet
by(cbind(Time, Chick), Diet, describe)
# trying to add weights, however, does not work for reasons described above
wgt <- rnorm(length(Chick), 12, 1)
by(cbind(Time, Chick), Diet, describe, weights = wgt)

Again, my question is: does anybody know of a function that combines the ability to provide weighted descriptives with the ability to subdivide by the levels of some other variable? Andrew Miles Department of Sociology Duke University __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
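For the simple case of a single statistic, grouped weighted means need nothing beyond base R; a sketch with made-up data and column names:

```r
d <- data.frame(female = c(0, 0, 1, 1),
                age    = c(20, 30, 25, 35),
                wt     = c(1, 2, 1, 3))
# weighted mean of age within each level of female
wm <- sapply(split(d, d$female), function(g) weighted.mean(g$age, g$wt))
wm
#        0        1
# 26.66667 32.50000
```

split()/sapply() sidesteps the by()-plus-weights problem above because each group's data and its weights travel together in one sub-data-frame.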
Re: [R] package tm fails to remove the with remove stopwords
Thanks Ingo. Mark W. Kimpel MD ** Neuroinformatics ** Dept. of Psychiatry Indiana University School of Medicine 15032 Hunter Court, Westfield, IN 46074 (317) 490-5129 Work, Mobile VoiceMail (317) 399-1219 Skype No Voicemail please On Sun, Nov 15, 2009 at 11:05 AM, Ingo Feinerer feine...@logic.at wrote: On Thu, Nov 12, 2009 at 11:29:50AM -0500, Mark Kimpel wrote: I am using code that previously worked to remove stopwords using package tm. Thanks for reporting. This is a bug in the removeWords() function in tm version 0.5-1 available from CRAN:

require(tm)
myDocument <- c("the rain in Spain", "falls mainly on the plain",
                "jack and jill ran up the hill", "to fetch a pail of water")
text.corp <- Corpus(VectorSource(myDocument))
# text.corp <- tm_map(text.corp, stripWhitespace)
text.corp <- tm_map(text.corp, removeNumbers)
text.corp <- tm_map(text.corp, removePunctuation)
## text.corp <- tm_map(text.corp, stemDocument)
text.corp <- tm_map(text.corp, removeWords, c("the", stopwords("english")))
dtm <- DocumentTermMatrix(text.corp)
dtm
dtm.mat <- as.matrix(dtm)
dtm.mat

    Terms
Docs falls fetch hill jack jill mainly pail plain rain ran spain the water
   1     0     0    0    0    0      0    0     0    1   0     1   1     0
   2     1     0    0    0    0      1    0     1    0   0     0   0     0
   3     0     0    1    1    1      0    0     0    0   1     0   0     0
   4     0     1    0    0    0      0    1     0    0   0     0   0     1

The function removeWords() fails to remove patterns at the beginning or at the end of a line (note the surviving "the" in document 1). This bug is fixed in the latest development version on R-Forge, and the fix will be included in the next CRAN release. Please see https://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/inst/NEWS?root=tmview=markup for a list of all bug fixes and changes between each tm version.
Best regards, Ingo Feinerer -- Ingo Feinerer Vienna University of Technology http://www.dbai.tuwien.ac.at/staff/feinerer [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Data source name not found and no default driver specified
I'm stumped. When trying to connect to Oracle using the RODBC package I get an error: *[RODBC] Data source name not found and no default driver specified. ODBC connect failed.* I've read over all the posts and documentation manuals. The system is Windows Server 2003 with R 2.8.1 and the latest downloadable RODBC package. The Oracle SID/DSN is mfopdw. I made sure to add it to Control Panel -> Administrative Tools -> Microsoft ODBC system/user DSNs. I've also tried the following, in no particular order: 1.) Turned on all Oracle services in Control Panel -> Administrative Tools. 2.) Checked tnsnames.ora for the SID. 3.) Added the Microsoft ODBC service to Control Panel services for the SID. 4.) Used SQL Developer to test the connection another way besides R (it was successful). 5.)

channel <- odbcDriverConnect(connection =
  "Driver={Microsoft ODBC for Oracle};DSN=abc;UID=abc;PWD=abc;case=oracle")

received the error "driver's SQLAllocHandle on SQL_HANDLE_ENV failed" one time; another time I got the error that Oracle client and networking components 7.3 or greater were not found. 6.) tnsping mfopdw; lsnrctl start mfopdw; tried to add oracle/bin to the path. Nothing is working. Please advise. Thank you, [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] extracting estimated covariance parameters from lme fit
Dear all, Apologies in advance as this seems like a trivial question; nonetheless, it is a question I haven't been able to resolve myself! Within a single repetition of a simulation (to be repeated many times) I am fitting the following linear mixed model using lme...

Y_{gtr} = \mu + U_{g} + W_{gt} + Z_{gtr}
U_{g} ~ N(0,\gamma^{2}), W_{gt} ~ N(0,\kappa^{2}), Z_{gtr} ~ N(0,\tau^{2})
g = 1,...,G; t = 1,...,T; r = 1,...,R

...by doing

Model.fit <- lme(Y ~ 1, data = data, random = ~ 1 | gene/treatment)

I would like to be able to extract the estimated covariance parameters contained within the lme object. I know if I type...

Model.fit$sigma

...then I get the estimated residual standard deviation, i.e. within the context of the above model, the estimate for \tau. But I would also like to extract the estimates for \gamma and \kappa by doing Model.fit$something. I am aware that I can view the output using the extractor function summary, but within a single repetition of my simulation routine I want to be able to code something like

gamma <- Model.fit$...
kappa <- Model.fit$...

and then plug `gamma' and `kappa' into some formulae. This process of fitting and extracting will be repeated many times, which is why I wish to automate everything. Again, any help would be greatly appreciated. Best, Gerwyn Green School of Health and Medicine Lancaster University __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
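The usual extractor for these is VarCorr() in nlme. A sketch on a built-in data set (one grouping level for brevity; the poster's two-level model works the same way, with one block per grouping level in the VarCorr() output):

```r
library(nlme)
# Orthodont stands in for the poster's data; random intercept per Subject.
fit <- lme(distance ~ 1, data = Orthodont, random = ~ 1 | Subject)
tau <- fit$sigma                        # residual SD, as in the post
vc <- VarCorr(fit)                      # character matrix of variances / SDs
gamma <- as.numeric(vc["(Intercept)", "StdDev"])  # between-Subject SD
c(gamma = gamma, tau = tau)
```

VarCorr() returns formatted character strings, hence the as.numeric() step before plugging the values into formulae.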
Re: [R] violin - like plots for bivariate data
sounds like bivariate density contours may be what you're looking for. Andy From: Eric Nord I'm attempting to produce something like a violin plot to display how y changes with x for members of different groups (My specific case is how floral area changes over time for several species of plants). I've looked at panel.violin (in lattice), which makes nice violin plots, but is really set up to work on a single variable - the area trace represents the frequency of each value of x for each group. I'm wondering if anyone is aware of a function to do this? I can imagine how to accomplish this using polygon, but I will admit I'm not sure what the best way would be to smooth the data. That said, I would prefer not to reinvent the wheel! Thanks in advance for any wisdom you can share! Eric -- View this message in context: http://old.nabble.com/violin---like-plots-for-bivariate-data-t p26373071p26373071.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Notice: This e-mail message, together with any attachme...{{dropped:10}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] (Parallel) Random number seed question...
Hi All, I have k identical parallel pieces of code running, each using n.rand random numbers. I would like to use the same RNG (for now) and set the seeds so that I can guarantee that there are no overlaps in the random numbers sampled by the k pieces of code. Another side goal is reproducibility of my results. In the past I have used C with SPRNG for this task, but I'm hoping that there is an easy way to do this in R with poor man's parallelization (e.g. running multiple Rs on multiple processors without the overhead of setting up any MPI or using snow(fall)). It is not clear from the documentation whether set.seed arguments are sequential or not for a given RNG, e.g. whether set.seed(1) on processor 1, set.seed(1 + n.rand) on processor 2, set.seed(1 + 2*n.rand) on processor 3, etc. works for the default RNG Mersenne-Twister. An easy approach would be to simply write a script that generates n.rand numbers, records the .Random.seed, and proceeds in that manner: inelegant, but effective. My question here is: is there a better way? (mvtnorm part directed to Torsten Hothorn) To further clarify, it seems there is a different RNG for normal (rnorm) than for everything else (e.g. RNGkind(..., normal.kind = "Inversion")); further, does anybody know if mvtnorm uses this generator? Also, some RNGs seem to be based on the architecture (e.g. Knuth-TAOCP-2002): is the period really related to 2^32, or is it dependent on the architecture, 2^64 for 64-bit R and 2^32 for 32-bit R? I noticed there are several packages related to RNG; please direct me to a vignette/R News article/previous post if this has been covered ad nauseam. I have skimmed the vignettes/docs for the rsprng package, the RNG doc in base, the setRNG package, and the mvtnorm package vignette. (Or am I setting myself up to write a current RNG doc?) (directed to Gregory Warnes) I found a presentation by Gregory Warnes from 1999 addressing these same questions (it uses a Collings generator in some C code).
http://www.r-project.org/conferences/DSC-1999/slides/warnes.ps.gz Have you turned to the snowfall related parallel implementations, did your Collings generator work well, or have you discovered another trick you might like to share? Thank you all for your time and excellent contributions to the open source community, Blair __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
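One alternative to hand-spacing set.seed() values is the L'Ecuyer-CMRG generator, whose state can be advanced by a whole substream at a time, with substreams guaranteed not to overlap; the 'parallel' package exposes this. A sketch:

```r
library(parallel)
RNGkind("L'Ecuyer-CMRG")
set.seed(1)                  # reproducible master seed
s1 <- .Random.seed           # stream for worker 1
s2 <- nextRNGStream(s1)      # independent stream for worker 2
s3 <- nextRNGStream(s2)      # worker 3, and so on
# Each worker k restores its stream before sampling:
#   assign(".Random.seed", s_k, envir = globalenv())
```

Because every stream is derived deterministically from the master seed, the whole simulation is reproducible regardless of how the k pieces are scheduled.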
[R] extracting values from correlation matrix
Hi! All, I have 2 correlation matrices of 4000x4000, both with the same row names and column names, say cor1 and cor2. I have extracted some information from the 1st matrix cor1, which looks something like this:

rowname  colname  cor1_value
a        b        0.8
b        a        0.8
c        f        0.62
d        k        0.59
...      ...      ...

Now I wish to extract values from matrix cor2 for the same rowname and colname as above, so that it looks similar to something like this, with values in cor2_value:

rowname  colname  cor1_value  cor2_value
a        b        0.8         ---
b        a        0.8         ---
c        f        0.62        ---
d        k        0.59        ---
...      ...      ...         ...

I am running out of ideas, so I decided to post this on the mailing list. Please help! Best, Lee [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] extracting values from correlation matrix
Assuming that your data is in a dataframe 'cordata', then the following should work:

cordata$cor2_value <- sapply(1:nrow(cordata), function(.row) {
    cor2[cordata$rowname[.row], cordata$colname[.row]]
})

On Mon, Nov 16, 2009 at 11:44 AM, Lee William leeon...@gmail.com wrote: Hi! All, I have 2 correlation matrices of 4000x4000, both with the same row names and column names, say cor1 and cor2. I have extracted some information from the 1st matrix cor1 (columns rowname, colname, cor1_value). Now I wish to extract values from matrix cor2 for the same rowname and colname as above, filling a cor2_value column. I am running out of ideas, so I decided to post this on the mailing list. Please help! Best, Lee [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
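An equivalent, fully vectorized alternative to the sapply() loop is to index the matrix with a two-column character matrix of (row, column) names; the demo matrix and name pairs below are made up:

```r
cor2_demo <- matrix(runif(16), 4, 4,
                    dimnames = list(letters[1:4], letters[1:4]))
pairs_df <- data.frame(rowname = c("a", "b", "c"),
                       colname = c("b", "a", "d"),
                       stringsAsFactors = FALSE)
# cbind() of the two name columns gives a 2-column index matrix:
# each row picks out one (row, column) cell of the matrix
pairs_df$cor2_value <- cor2_demo[cbind(pairs_df$rowname, pairs_df$colname)]
```

On a 4000x4000 matrix with many lookups this avoids one subscript call per row, which is noticeably faster than the element-by-element loop.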
Re: [R] Cluster analysis: hclust manipulation possible?
On Mon, 16 Nov 2009, Jopi Harri wrote: I am doing cluster analysis [hclust(Dist, method = "average")] on data that potentially contains redundant objects. As expected, the inclusion of redundant objects affects the clustering result, i.e., the data a1 = a2 = a3, b, c, d, e1 = e2 is likely to cluster differently from the same data without the redundancy, i.e., a1, b, c, d, e1. This is apparent when the outcome is visualized as a dendrogram. Now, it seems that the clustering result for which the redundancy has been eliminated is more robust for the present assignment than that of the redundant data. Naturally, there is no problem in the elimination: just exclude the redundant objects from Dist. However, it would be very convenient to be able to include the redundant objects in the *dendrogram* by attaching them as 0-level branches to the subtrees. The question: Can this be accomplished in the *dendrogram plot* by manipulating the resulting hclust data structure or by some other means, and if yes, how? Yes, you need to study ?hclust, particularly the part about 'Value', from which you will see what needs modification. Here is a very simple example:

res <- hclust(dist(1 - diag(3) * rnorm(3)))
plot(res)
res2 <- res
res2$merge <- rbind(-cbind(1:3, 4:6),
                    matrix(ifelse(res2$merge < 0, -res2$merge,
                                  res2$merge + sum(res2$merge < 0)), 2))
res2$height <- c(rep(0, 3), res2$height)
res2$order <- as.vector(rbind(res2$order, (4:6)[res2$order]))
plot(res2)
str(res)
str(res2)

Alternatively, you could use as.dendrogram(res) as the point of departure and manipulate the value. HTH, Chuck Jopi Harri __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Charles C. Berry (858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] specifying group plots using panel.groups
Hi, I am trying to plot two types of data on the same graph: points and distributions. I am attempting to use the panel.groups function, but cannot seem to get it to work. I have a melted data set and put in a FLAG column to separate my data into the two groups that I would like to plot: point data (FLAG = 0) and the distribution (FLAG = 1). Here is the code I am using in R:

stripplot(variable ~ value, conf(RunlogBootCL), groups = FLAG,
  panel = panel.superpose,
  panel.groups = function(x, y, group.number, ...) {
    if (group.number == 1) panel.xyplot(x, y, group.number, ...)
    else if (group.number == 0) panel.covplot(x, y, group.number, ...)
  },
  ref = 1, cex = 0.5, col = "black",
  main = 'Covariate Effects on Clearance',
  xlab = 'relative clearance', fill = 'transparent'
)

For some reason I can only get one or the other to plot (points or distributions). Can you please direct me to my error? Thanks! -- View this message in context: http://old.nabble.com/specifying-group-plots-using-panel.groups-tp26374674p26374674.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R-help
I have been trying to write a function for the following problem. Suppose I have three vectors a, b, c of different lengths, e.g. a = c(a1, a2, a3, ...), where the a[i] form the basis of our function variables. If we define a table, for example, and define

fn <- function(x) { ... sum(argument) ... }

where x <- c(a, b, c), then we can maximise fn(x) with optim(..., fn, ...). So in other words the problem is to find optimal values for the vectors a, b, c. I'm not sure how to set this problem up as a function in R. Any help would be much appreciated. Regards, Lloyd Barcza - Forwarded by Lloyd Barcza/UK/RoyalSun on 16/11/2009 15:07 - Lloyd Barcza Pricing Analyst Affinity Pricing Royal SunAlliance Tel: 01403 234784 Email: lloyd.bar...@uk.rsagroup.com __ ** MORE THN ® is a trading style of Royal Sun Alliance Insurance plc (No. 93792). Registered in England and Wales at St. Mark's Court, Chart Way, Horsham, West Sussex RH12 1XL. Authorised and regulated by the Financial Services Authority. For your protection, telephone calls will be recorded and may be monitored. The information in this e-mail is confidential and ma...{{dropped:15}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
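A common way to set this up is to pack a, b, c into one parameter vector and unpack them inside fn(); the objective below is a made-up quadratic purely for illustration (note that optim() minimizes by default, so negate fn for maximisation):

```r
la <- 3; lb <- 2; lc <- 4              # assumed lengths of a, b, c
fn <- function(x) {
  a  <- x[seq_len(la)]                 # unpack the three blocks
  b  <- x[la + seq_len(lb)]
  cc <- x[la + lb + seq_len(lc)]
  # toy objective: optimum at a = 1, b = 2, cc = 3 element-wise
  sum((a - 1)^2) + sum((b - 2)^2) + sum((cc - 3)^2)
}
res <- optim(rep(0, la + lb + lc), fn, method = "BFGS")
res$par                                # near c(1,1,1, 2,2, 3,3,3,3)
```

optim() only ever sees one flat vector, so the unpacking inside fn() is what lets the three vectors of different lengths be optimised jointly.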
[R] test for causality
Hi useRs, I can't figure out how to test for causality using causality() in the vars package. I have two datasets (A, B) and I want to test whether A Granger-causes B. How do I write the script? I don't understand ?causality — how do I get x to contain A and B? Further, I don't understand how to use the command VAR() to specify x either. Kind regards Tobias -- View this message in context: http://old.nabble.com/test-for-causality-tp26373931p26373931.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
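A minimal sketch of the workflow (with simulated data, since none was posted): bind the two series into one multivariate object, fit it with VAR(), then hand the fitted model to causality() naming the would-be cause:

```r
library(vars)
set.seed(1)

A <- rnorm(200)
B <- 0.6 * c(0, head(A, -1)) + rnorm(200)  # B built to depend on lagged A

x   <- cbind(A, B)                   # x holds both series, one per column
fit <- VAR(x, p = 1, type = "const")
causality(fit, cause = "A")$Granger  # H0: A does not Granger-cause B
```

The lag order p = 1 is just for illustration; VARselect() can suggest one for real data.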
Re: [R] Step Function Freezing R
Can you think of any systemic changes that might interfere with R besides Symantec EndPoint and LiveUpdate? I have removed those programs and allocated more memory to R, but it is still way too slow. On Nov 13, 10:45 pm, J Dougherty j...@surewest.net wrote: On Friday 13 November 2009 07:17:28 am Jgabriel wrote: I can't fully answer all these questions, but I'll do my best - There have not been any updates of Windows, and I did not update R during the period, although I did reinstall it after the problem started. There have been no changes to Norton or any other software that uses system resources in that way. The one thing I can think of is that I installed a program called Digitizer (creates data tables/csvs from visually analyzing line charts) around the same time it started freezing. I have completely uninstalled it and deleted all related files. The problem should not be with the data, which is fine. I am running Windows XP Professional and Excel 2007. I have allowed the process to run overnight on two separate occasions. The first time it was completely frozen and there was no progress. Last night it actually made progress, but the fact remains that a process that used to take an hour at the most still has not even halfway completed after over 8 hours. I even installed extra RAM on the system so that there is more than when the process used to work. I agree that it is probably a change in software, but I can't figure out what has changed or what I can do about it. OK, from what you are saying, it seems clear the problem is a system problem rather than an R issue. MS issues patches every month, so there may not have been upgrades, but there could still be systemic changes. As regards R, did you update R? How big is the data table? Has it grown? Did you or someone else alter the available memory to R by an environment setting? Have you searched Memory in R help and manuals? Is this computer yours, or is it used by others who may have altered settings? 
These are all questions that may be pertinent. Good luck. JWDougherty __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] (Parallel) Random number seed question...
On Mon, 16 Nov 2009, Blair Christian wrote: Hi All, I have k identical parallel pieces of code running, each using n.rand random numbers. I would like to use the same RNG (for now), and set the seeds so that I can guarantee that there are no overlaps in the random numbers sampled by the k pieces of code. Another side goal is to have reproducibility of my results. In the past I have used C with SPRNG for this task, but I'm hoping that there is an easy way to do this in R with poor man's parallelization (eg running multiple Rs on multiple processors without the overhead of setting up any mpi or using snow(fall)). It is not clear from the documentation whether set.seed arguments are sequential or not for a given RNG, eg whether set.seed(1) on processor 1, set.seed(1+n.rand) on processor 2, set.seed(1+2*n.rand) on processor 3, etc would work for the default RNG Mersenne-Twister. An easy approach would be to simply write a script that generates n.rand numbers, records the .Random.seed and proceeds in that manner - inelegant, but effective. My question here is: is there a better way? (mvtnorm part directed to Torsten Hothorn) To further clarify, it seems there is a different RNG for normal (rnorm) than for everything else (eg RNGkind(..., normal.kind = "Inversion")); further, does anybody know if mvtnorm uses this generator? mvtnorm is based on FORTRAN code which uses unif_rand() from the C API:

void F77_SUB(rndstart)(void) { GetRNGstate(); }
void F77_SUB(rndend)(void) { PutRNGstate(); }
double F77_SUB(unifrnd)(void) { return unif_rand(); }

Torsten Further, some RNGs seem to be based on the architecture (eg Knuth-TAOCP-2002) - is the period really related to 2^32, or is it dependent on the architecture: 2^64 for 64-bit R and 2^32 for 32-bit R? I noticed there are several packages related to RNG - please direct me to a vignette/R news article/previous post if this has been covered ad nauseam. 
I have skimmed vignettes/docs for the rsprng package, the RNG doc in base, the setRNG package, and the mvtnorm package vignette. (Or am I setting myself up to write a current RNG doc?) (directed to Gregory Warnes) I found a presentation by Gregory Warnes from 1999 addressing these same questions (which uses a Collings generator in some C code): http://www.r-project.org/conferences/DSC-1999/slides/warnes.ps.gz Have you turned to the snowfall-related parallel implementations, did your Collings generator work well, or have you discovered another trick you might like to share? Thank you all for your time and excellent contributions to the open source community, Blair __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
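For what it's worth, later versions of R (2.14 and up) ship a base solution for exactly this in the parallel package: the L'Ecuyer-CMRG generator plus nextRNGStream() give reproducible streams that are guaranteed not to overlap, with no guessing at seed spacing. A sketch:

```r
library(parallel)

RNGkind("L'Ecuyer-CMRG")
set.seed(42)                 # reproducible master seed

s1 <- .Random.seed           # stream for worker 1
s2 <- nextRNGStream(s1)      # independent stream for worker 2
s3 <- nextRNGStream(s2)      # worker 3, and so on

## each worker restores its own stream before drawing its numbers
.Random.seed <- s1; x1 <- runif(5)
.Random.seed <- s2; x2 <- runif(5)
```

Successive streams are 2^127 steps apart within the generator's period, so k workers cannot collide for any realistic n.rand.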
[R] Discontinuous graph
Hi, I wanted to make a graph with the following table (2 rows, 3 columns):

  a b c
x 1 3 5
y 5 8 6

The first column represents the start coordinate, and the second column contains the end coordinate for the x-axis. The third column contains the y-axis coordinate. For example, the first row in the matrix above represents the points (1,5), (2,5), (3,5). How would I go about making a discontinuous graph? thanks! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Discontinuous graph
Hi Tim, On Nov 16, 2009, at 12:40 PM, Tim Smith wrote: Hi, I wanted to make a graph with the following table (2 rows, 3 columns): a b c x 1 3 5 y 5 8 6 The first column represents the start coordinate, and the second column contains the end coordinate for the x-axis. The third column contains the y-axis coordinate. For example, the first row in the matrix above represents the points (1,5), (2,5), (3,5). How would I go about making a discontinuous graph? What is it that you want to do with this graph? Or, how do you want to represent it? Do you just want to generate the sequence of points? I'm guessing not, but here's code to do that; it stores the result in the edge.pairs matrix (first row is the x-values, 2nd row is the y-value of the same point):

data.matrix <- matrix(c(1, 3, 5, 5, 8, 6), nrow = 2, byrow = TRUE)
points <- apply(data.matrix, 1, function(row) unlist(t(expand.grid(row[1]:row[2], row[3]))))
edge.pairs <- do.call(cbind, points)

It should be pretty straightforward to convert edge.pairs into an adjacency matrix, if you like. Also, if you're thinking about using R to work with graphs, I'd suggest checking out the igraph package. Hope that helps, -steve -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Discontinuous graph
On Nov 16, 2009, at 12:40 PM, Tim Smith wrote: Hi, I wanted to make a graph with the following table (2 rows, 3 columns): a b c x 1 3 5 y 5 8 6 The first column represents the start coordinate, and the second column contains the end coordinate for the x-axis. The third column contains the y-axis coordinate. For example, the first row in the matrix above represents the points (1,5), (2,5), (3,5). How would I go about making a discontinuous graph? thanks!

coords <- read.table(textConnection("a b c
x 1 3 5
y 5 8 6"), header = TRUE)

plot(NULL, NULL,
     xlim = c(min(coords$a) - .5, max(coords$b) + .5),
     ylim = c(min(coords$c) - .5, max(coords$c) + .5))

apply(coords, 1, function(x) segments(x0 = x[1], y0 = x[3], x1 = x[2], y1 = x[3]))

-- David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error on reading an excel file
anna_l wrote: Hello everybody, here is the code I use to read an excel file containing two rows, one of date, the other of prices:

library(RODBC)
z <- odbcConnectExcel("SPX_HistoricalData.xls")
datas <- sqlFetch(z, "Sheet1")
close(z)

It works pretty well but the only thing is that the datas stop at row 7530 and I don't know why; datas is a data frame that contains 7531 rows with the last two ones = NA... I find this occurs sometimes when I export an Excel worksheet to CSV. Excel will include one or more rows of blank cells after the data stops. I would imagine the behavior you are seeing with RODBC is due to the same issue. I don't know if there is anything that can be done about it other than to trim your dataset back to the appropriate length once it gets into R. Good luck! -Charlie - Charlie Sharpsteen Undergraduate Environmental Resources Engineering Humboldt State University -- View this message in context: http://old.nabble.com/Error-on-reading-an-excel-file-tp26371750p26376554.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error on reading an excel file
Thanks Charlie, well yes it included one row with two NA datas. I guess there is an explanation, let´s wait and see if someone knows more about it :) cls59 wrote: anna_l wrote: Hello everybody, here is the code I use to read an excel file containing two rows, one of date, the other of prices: library(RODBC) z - odbcConnectExcel(SPX_HistoricalData.xls) datas - sqlFetch(z,Sheet1) close(z) It works pretty well but the only thing is that the datas stop at row 7530 and I don´t know why datas is a data frame that contains 7531 rows with the last two ones = NA... I find this occurs sometimes when I export an Excel worksheet to CSV. Excel will include one or more rows of blank cells after the data stops. I would imagine the behavior you are seeing with RODBC is due to the same issue. I don't know if there is anything that can be done about it other than to trim your dataset back to the appropriate length once it gets into R. Good luck! -Charlie - Anna Lippel new in R so be careful I should be asking a lt of questions!:teeth: -- View this message in context: http://old.nabble.com/Error-on-reading-an-excel-file-tp26371750p26376656.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Discontinuous graph
On Nov 16, 2009, at 12:58 PM, David Winsemius wrote: On Nov 16, 2009, at 12:40 PM, Tim Smith wrote: Hi, I wanted to make a graph with the following table (2 rows, 3 columns): a b c x 1 3 5 y 5 8 6 The first column represents the start cordinate, and the second column contains the end cordinate for the x-axis. The third column contains the y-axis co-ordinate. For example, the first row in the matrix above represents the points (1,5),(2,5), (3,5). How would I go about making a discontinuous graph ? thanks! coords - read.table(textConnection(a b c x 1 3 5 y 5 8 6), header=TRUE) plot(NULL, NULL, xlim = c(min(coords$a)-.5, max(coords$b)+.5), ylim=c (min(coords$c)-.5, max(coords$c)+.5) ) apply(coords, 1, function(x) segments(x0=x[1],y0= x[3], x1= x[2], y1=x[3]) ) Oh, *that* kind of graph! ... my high-school English teacher once said that all communication is miscommunication because we each interpret things according to our own experiences, etc ... I guess that goes to show: (i) me that he was right (once again); (ii) you what I've been working on lately :-) Sorry for the line-noise, -steve -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error on reading an excel file
You could try one of the other methods of reading Excel files and see if they are affected: http://wiki.r-project.org/rwiki/doku.php?id=tips:data-io:ms_windows On Mon, Nov 16, 2009 at 8:19 AM, anna_l lippelann...@hotmail.com wrote: Hello everybody, here is the code I use to read an excel file containing two rows, one of date, the other of prices:

library(RODBC)
z <- odbcConnectExcel("SPX_HistoricalData.xls")
datas <- sqlFetch(z, "Sheet1")
close(z)

It works pretty well but the only thing is that the datas stop at row 7530 and I don't know why; datas is a data frame that contains 7531 rows with the last two ones = NA... - Anna Lippel new in R so be careful I should be asking a lot of questions! :teeth: -- View this message in context: http://old.nabble.com/Error-on-reading-an-excel-file-tp26371750p26371750.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error on reading an excel file
Gabor Grothendieck wrote: You could try one of the other methods of reading Excel files and see if they are affected: I would guess that since Excel includes the blank rows when exporting to CSV, then blank cells are being stored by Excel in the data files-- therefore any method of extracting data from those files will also pick up the empty cells. I think the crux of this issue lies with Excel and you will probably have to look for a fix there. -Charlie - Charlie Sharpsteen Undergraduate Environmental Resources Engineering Humboldt State University -- View this message in context: http://old.nabble.com/Error-on-reading-an-excel-file-tp26371750p26376915.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
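If the trailing blank rows do come through, they can be dropped on the R side after the import. A small sketch, with a toy stand-in for the data frame that sqlFetch() returned:

```r
## toy stand-in: last two rows are entirely NA, as in the original post
datas <- data.frame(date  = c("2009-01-01", "2009-01-02", NA, NA),
                    price = c(100.1, 101.3, NA, NA),
                    stringsAsFactors = FALSE)

datas <- datas[rowSums(!is.na(datas)) > 0, ]  # keep rows with at least one non-NA
nrow(datas)   # 2
```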
[R] Sum over indexed value
I am sure this is easy but I am not finding a function to do this. I have two columns in a matrix. The first column contains multiple entries of numbers from 1 to 100 (i.e. 10 ones, 8 twos etc.). The second column contains unique numbers. I want to sum the numbers in column two based on the indexed values in column one (e.g. sum of all values in column two associated with the value 1 in column one). I would like two columns in return - the indexed value in column one (i.e. this time no duplicates) and the sum in column two. How do I do this? -- View this message in context: http://old.nabble.com/Sum-over-indexed-value-tp26376359p26376359.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] printing a single row, but dont know which row to print
I have 20 columns of data, and in column 5 I have a value of 17600, but I don't know which row this value is in (I have over 300,000 rows). I'm trying to do 2 things: 1) I want to find out which row in column 5 has this number in it. 2) Then I want to print out that row with all the column headers so I can look at the other parameters in the row that are associated with this value. How do I do it? -- View this message in context: http://old.nabble.com/printing-a-single-row%2C-but-dont-know-which-row-to-print-tp26376647p26376647.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Discontinuous graph
Hi, An alternative with ggplot2:

library(ggplot2)
ggplot(data = coords) + geom_segment(aes(x = a, xend = b, y = c, yend = c))

HTH, baptiste 2009/11/16 David Winsemius dwinsem...@comcast.net: On Nov 16, 2009, at 12:40 PM, Tim Smith wrote: Hi, I wanted to make a graph with the following table (2 rows, 3 columns): a b c x 1 3 5 y 5 8 6 The first column represents the start coordinate, and the second column contains the end coordinate for the x-axis. The third column contains the y-axis coordinate. For example, the first row in the matrix above represents the points (1,5), (2,5), (3,5). How would I go about making a discontinuous graph? thanks!

coords <- read.table(textConnection("a b c
x 1 3 5
y 5 8 6"), header = TRUE)

plot(NULL, NULL,
     xlim = c(min(coords$a) - .5, max(coords$b) + .5),
     ylim = c(min(coords$c) - .5, max(coords$c) + .5))

apply(coords, 1, function(x) segments(x0 = x[1], y0 = x[3], x1 = x[2], y1 = x[3]))

-- David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Sum over indexed value
P=data.frame(x=c(1,1,2,3,2,1),y=rnorm(6)) tapply(P$y,P$x,sum) regards, stefan On Mon, Nov 16, 2009 at 09:49:17AM -0800, Gunadi wrote: I am sure this is easy but I am not finding a function to do this. I have two columns in a matrix. The first column contains multiple entries of numbers from 1 to 100 (i.e. 10 ones, 8 twos etc.). The second column contains unique numbers. I want to sum the numbers in column two based on the indexed values in column one (e.g. sum of all values in column two associated with the value 1 in column one). I would like two columns in return - the indexed value in column one (i.e. this time no duplicates) and the sum in column two. How do I do this? -- View this message in context: http://old.nabble.com/Sum-over-indexed-value-tp26376359p26376359.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] printing a single row, but dont know which row to print
Hi, Try this, set.seed(2) # reproducible d = matrix(sample(1:20,20), 4, 5) d d[ d[ ,2] == 18 , ] You may need to test with all.equal if your values are subject to rounding errors. HTH, baptiste 2009/11/16 frenchcr frenc...@btinternet.com: I have 20 columns of data, and in column 5 I have a value of 17600 but I dont know which row this value is in (i have over 300,000 rows). I'm trying to do 2 things: 1) I want to find out which row in column 5 has this number in it. 2) Then I want to print out that row with all the column headers so i can look at the other parameters in the row that are associated with this value. How do i do it? -- View this message in context: http://old.nabble.com/printing-a-single-row%2C-but-dont-know-which-row-to-print-tp26376647p26376647.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] No Visible Binding for global variable
While building a package, I see the following: * checking R code for possible problems ... NOTE cheat.fit: no visible binding for global variable 'Zobs' plot.jml: no visible binding for global variable 'Var1' I see the issue has come up before, but I'm having a hard time discerning how solutions applied elsewhere would apply here. The entire code for both functions is below, but the only place the variable Zobs appears in the function cheat.fit is:

cheaters <- cbind(data.frame(cheaters), exactMatch)
names(cheaters)[1] <- 'Zobs'
names(cheaters)[2] <- 'Nexact'
cheaters$Zcrit <- Zcrit
cheaters$Mean <- means
cheaters$Var <- vars
cheaters <- subset(cheaters, Zobs >= Zcrit)
result <- list(pairs = c(row.names(cheaters)), Ncheat = nrow(cheaters),
               TotalCompare = totalCompare, alpha = alpha,
               ExactMatch = cheaters$Nexact, Zobs = cheaters$Zobs,
               Zcrit = Zcrit, Mean = cheaters$Mean,
               Variance = cheaters$Var, Probs = stuProbs)
result

and the only place Var1 appears in the plot function is here:

prop.correct <- subset(data.frame(prop.table(table(tmp[, i+1], tmp$Estimate),
                                             margin = 2)), Var1 == 1)[, 2:3]

Many thanks, Harold

sessionInfo() R version 2.10.0 (2009-10-26) i386-pc-mingw32 locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base

cheat.fit <- function(dat, key, wrongChoice, alpha = .01, rfa = c('nr', 'uni', 'bsct'),
                      bonf = c('yes', 'no'), con = 1e-12, lower = 0, upper = 50){
  bonf <- tolower(bonf)
  bonf <- match.arg(bonf)
  rfa <- match.arg(rfa)
  rfa <- tolower(rfa)
  dat <- t(dat)
  correctStuMat <- numeric(ncol(dat))
  for(i in 1:ncol(dat)){
    correctStuMat[i] <- mean(key == dat[, i], na.rm = TRUE)
  }
  correctClsMat <- numeric(length(key))
  for(i in 1:length(key)){
    correctClsMat[i] <- mean(key[i] == dat[i, ], na.rm = TRUE)
  }
  ### this is here for cases if all students in a class
  ### did not answer the item
  correctClsMat[is.na(correctClsMat)] <- 0
  pCorr <- function(R, c, q){
    numer <- function(R, a, c, q){
      result <- sum((1 - (1 - R)^a)^(1/a), na.rm = TRUE) - c * q
      result
    }
    denom <- function(R, a, c, q){
      result <- sum(na.rm = TRUE,
                    -((1 - (1 - R)^a)^(1/a) * (log((1 - (1 - R)^a)) * (1/a^2)) +
                      (1 - (1 - R)^a)^((1/a) - 1) * ((1/a) * ((1 - R)^a * log((1 - R))))))
      result
    }
    aConst <- function(R, c, q, con){
      a <- .5  # starting value for a
      change <- 1
      while(abs(change) > con) {
        r1 <- numer(R, a, c, q)
        r2 <- denom(R, a, c, q)
        change <- r1/r2
        a <- a - change
      }
      a
    }
    bisect <- function(R, c, q, lower, upper, con){
      f <- function(a) sum((1 - (1 - R)^a)^(1/a)) - c * q
      if(f(lower) * f(upper) > 0) stop("endpoints must have opposite signs")
      while(abs(lower - upper) > con){
        x = .5 * (lower + upper)
        if(f(x) * f(lower) >= 0) lower = x else upper = x
      }
      .5 * (lower + upper)
    }
    if(rfa == 'nr'){
      if(any(correctClsMat == 1)) correctClsMat[correctClsMat == 1] <- . else correctClsMat
      if(any(correctClsMat == 0)) correctClsMat[correctClsMat == 0] <- .0001 else correctClsMat
      a <- aConst(R, c, q, con)
    } else if(rfa == 'uni'){
      f <- function(R, a, c, q) sum((1 - (1 - R)^a)^(1/a)) - c * q
      a <- uniroot(f, c(lower, upper), R = R, c = c, q = q)$root
    } else if(rfa == 'bsct'){
      a <- bisect(R, c, q, lower = lower, upper = upper, con)
    }
    result <- (1 - (1 - R)^a)^(1/a)
    result
  }  # end pCorr function
Re: [R] printing a single row, but dont know which row to print
On Nov 16, 2009, at 1:38 PM, baptiste auguie wrote: Hi, Try this, set.seed(2) # reproducible d = matrix(sample(1:20,20), 4, 5) d d[ d[ ,2] == 18 , ] You may need to test with all.equal if your values are subject to rounding errors. HTH, baptiste 2009/11/16 frenchcr frenc...@btinternet.com: I have 20 columns of data, and in column 5 I have a value of 17600 but I dont know which row this value is in (i have over 300,000 rows). I'm trying to do 2 things: 1) I want to find out which row in column 5 has this number in it. Using baptiste's setup: which(d[, 2]==18) [1] 4 2) Then I want to print out that row with all the column headers so i can look at the other parameters in the row that are associated with this value. How do i do it? -- View this message in context: http://old.nabble.com/printing-a-single-row%2C-but-dont-know-which-row-to-print-tp26376647p26376647.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] No Visible Binding for global variable
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Doran, Harold Sent: Monday, November 16, 2009 10:45 AM To: r-help@r-project.org Subject: [R] No Visible Binding for global variable While building a package, I see the following: * checking R code for possible problems ... NOTE cheat.fit: no visible binding for global variable 'Zobs' plot.jml: no visible binding for global variable 'Var1' I see the issue has come up before, but I'm having a hard time discerning how solutions applied elsewhere would apply here. The entire code for both functions is below, but the only place the variable Zobs appears in the function cheat.fit is:

cheaters <- cbind(data.frame(cheaters), exactMatch)
names(cheaters)[1] <- 'Zobs'
names(cheaters)[2] <- 'Nexact'
cheaters$Zcrit <- Zcrit
cheaters$Mean <- means
cheaters$Var <- vars
cheaters <- subset(cheaters, Zobs >= Zcrit)

The code in the codetools package does not know that subset() does not evaluate its second argument in the standard way. Hence it gives a false alarm here. 
Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com
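If the NOTE is a nuisance, one workaround (a sketch, not the package's actual code) is to express the row filter with ordinary indexing, which codetools can analyse, instead of subset()'s non-standard evaluation:

```r
## toy data standing in for the real 'cheaters' data frame
cheaters <- data.frame(Zobs = c(1.2, 3.4, 2.8), Nexact = c(5, 9, 7))
Zcrit <- 2.5

## equivalent to subset(cheaters, Zobs >= Zcrit), but with no unbound symbol
cheaters <- cheaters[cheaters$Zobs >= Zcrit, ]
nrow(cheaters)   # 2
```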
Re: [R] extracting estimated covariance parameters from lme fit
The VarCorr function will extract the components of the random effects covariance matrix, but note the quirk that it returns values as characters:

library(nlme)
f1 <- lme(distance ~ age, data = Orthodont, random = ~ 1 + age | Subject)
(vc <- VarCorr(f1))
# Subject = pdLogChol(1 + age)
#             Variance   StdDev    Corr
# (Intercept) 5.41508724 2.3270340 (Intr)
# age         0.05126955 0.2264278 -0.609
# Residual    1.71620401 1.3100397
str(vc)
# 'VarCorr.lme' chr [1:3, 1:3] "5.41508724" "0.05126955" "1.71620401" ...
# - attr(*, "dimnames")=List of 2
#  ..$ : chr [1:3] "(Intercept)" "age" "Residual"
#  ..$ : chr [1:3] "Variance" "StdDev" "Corr"
# - attr(*, "title")= chr "Subject = pdLogChol(1 + age)"
(sigma2.age <- as.numeric(vc[2, 1]))
# [1] 0.05126955

hth, Kingsford Jones On Mon, Nov 16, 2009 at 9:25 AM, Green, Gerwyn (greeng6) g.gre...@lancaster.ac.uk wrote: Dear all Apologies in advance as this seems like a trivial question. Nonetheless, a question I haven't been able to resolve myself! Within a single repetition of a simulation (to be repeated many times) I am fitting the following linear mixed model using lme...

Y_{gtr} = \mu + U_{g} + W_{gt} + Z_{gtr}
U_{g} ~ N(0,\gamma^{2}), W_{gt} ~ N(0,\kappa^{2}), Z_{gtr} ~ N(0,\tau^{2})
g = 1,...,G  t = 1,...,T  r = 1,...,R

...by doing

Model.fit <- lme(Y ~ 1, data = data, random = ~ 1 | gene/treatment)

I would like to be able to extract the estimated covariance parameters contained within the lme object. I know if I type...

Model.fit$sigma

...then I get the estimated residual variance, i.e. within the context of the above model, the estimate for \tau. But I would also like to extract the estimates for \gamma and \kappa by doing Model.fit$something. I am aware that I can view the output using the extractor function summary, but within a single repetition of my simulation routine I want to be able to code something like

gamma <- Model.fit$.
kappa <- Model.fit$.

and then plug 'gamma' and 'kappa' into some formulae. 
This process of fitting and extracting will be repeated many times, which is why I wish to automate everything. Again, any help would be greatly appreciated Best Gerwyn Green School of Health and Medicine Lancaster University __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
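Since VarCorr() returns its values as character, the \gamma and \kappa Gerwyn wants must be converted with as.numeric() before they can be plugged into formulae. A minimal sketch of that extraction step, using the built-in Orthodont data rather than the gene/treatment model (so the model and row names here are illustrative; with nested grouping factors VarCorr() prints one block per level and the relevant rows must be picked out by position):

```r
library(nlme)

# Illustrative single-level model; replace with the lme() call from the post
fit <- lme(distance ~ age, data = Orthodont, random = ~ 1 | Subject)

vc <- VarCorr(fit)  # character matrix with columns "Variance" and "StdDev"
gamma2 <- as.numeric(vc["(Intercept)", "Variance"])  # random-intercept variance
tau    <- fit$sigma                                  # residual SD, as in the question

c(gamma2 = gamma2, tau = tau)
```

Inside a simulation loop this gives plain numerics that can be stored in a results matrix on each repetition.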
Re: [R] Sum over indexed value
Try this:

with(DF, rowsum(Col2, Col1))

On Mon, Nov 16, 2009 at 3:49 PM, Gunadi boydkra...@gmail.com wrote: I am sure this is easy but I am not finding a function to do this. I have two columns in a matrix. The first column contains multiple entries of numbers from 1 to 100 (i.e. 10 ones, 8 twos etc.). The second column contains unique numbers. I want to sum the numbers in column two based on the indexed values in column one (e.g. sum of all values in column two associated with the value 1 in column one). I would like two columns in return - the indexed value in column one (i.e. this time no duplicates) and the sum in column two. How do I do this? -- View this message in context: http://old.nabble.com/Sum-over-indexed-value-tp26376359p26376359.html Sent from the R help mailing list archive at Nabble.com. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
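For anyone wanting to check the rowsum() suggestion, here is a tiny made-up version of the two-column data (the column names Col1/Col2 are assumed from the answer above):

```r
DF <- data.frame(Col1 = c(1, 2, 1, 1, 3, 2),   # index column, with repeats
                 Col2 = c(5, 4, 2, 1, 7, 4))   # unique values to be summed

# rowsum() sums Col2 within each level of Col1; the group labels
# come back as the row names of the result matrix
res <- with(DF, rowsum(Col2, Col1))
res
#   [,1]
# 1    8
# 2    8
# 3    7
```

To get the two-column form asked for, cbind(index = as.numeric(rownames(res)), sum = res[, 1]) does it.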
[R] in excel i can sort my dataset, what do i use in R
In Excel a handy tool is the sort data by column... i.e. I can highlight the whole dataset and sort it according to a particular column - like sort the data in a column in ascending or descending order where all the other columns change as well. I need to do this in R now but don't know how. ...here's an example... Say I have the dataset...

Header 1   Header 2   Header 3
1          3          Working 12
2          4          Off 1
3          5          Works 2
4          2          Works 13
5          4          Off 5

...and I want to sort the data by putting the values in the third column in ascending order, like this...

Header 1   Header 2   Header 3
1          4          Off 1
2          5          Works 2
3          4          Off 5
4          3          Working 12
5          2          Works 13

...although I'm sorting column three in ascending order, all the rows shuffle so that the parameters in each row stay aligned. How do I do this in R? -- View this message in context: http://old.nabble.com/in-excel-i-can-sort-my-dataset%2C-what-do-i-use-in-R-tp26377540p26377540.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] in excel i can sort my dataset, what do i use in R
?order
?sort

2009/11/16 frenchcr frenc...@btinternet.com: In excel a handy tool is the sort data by column... i.e. i can highlight the whole dataset and sort it according to a particular column, where all the other columns change as well. How do i do this in R? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
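Spelling out the ?order hint on a small stand-in for the dataset above (the column names h1/h2/h3 are invented):

```r
d <- data.frame(h1 = 1:5,
                h2 = c(3, 4, 5, 2, 4),
                h3 = c(12, 1, 2, 13, 5))

# order(d$h3) gives the row permutation that puts h3 in ascending order;
# using it as a row index shuffles whole rows, so columns stay aligned
d.asc  <- d[order(d$h3), ]
d.desc <- d[order(d$h3, decreasing = TRUE), ]
d.asc$h3
# [1]  1  2  5 12 13
```

sort() alone sorts a single vector; for whole-data-frame sorting it is the order()-as-row-index idiom that matches the Excel behaviour described.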
Re: [R] Data source name not found and no default driver specified
I forgot to mention that it's running Windows Server 2003 x64 OS version On Mon, Nov 16, 2009 at 11:22 AM, helpme myrquesti...@gmail.com wrote: I'm stumped. When trying to connect to Oracle using the RODBC package I get an error: [RODBC] Data source name not found and no default driver specified. ODBC connect failed. I've read over all the posts and documentation manuals. The system is Windows Server 2003 with R 2.8.1 and the latest downloadable RODBC package. The Oracle SID/DSN is mfopdw. I made sure to add it to Control Panel - Administrative Tools - Microsoft ODBC system/user DSN. I've also tried the following in no particular order: 1.) Turn on all Oracle services in Control Panel - Administrative Tools. 2.) Checked tnsnames.ora for the SID. 3.) Add Microsoft ODBC service to Control Panel services for the SID. 4.) Use SQL Developer to test the connection another way besides R (it was successful). 5.) channel <- odbcDriverConnect(connection="Driver={Microsoft ODBC for Oracle};DSN=abc;UID=abc;PWD=abc;case=oracle") received error drivers SQLAllocHandle on SQL_HANDLE_ENV failed one time; another time I got the error that Oracle client and networking components 7.3 or greater is not found. 6.) tnsping mfopdw; lsnrctl start mfopdw; tried to add oracle/bin to path. Nothing is working. Please advise. Thank you, __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] fitting a logistic regression with mixed type of variables
Hi, I am trying to fit a logistic regression using glm, but my explanatory variables are of mixed type: some are numeric, some are ordinal, some are categorical. Say x1 is numeric, x2 is ordinal, x3 is categorical - is the following formula OK?

model <- glm(y ~ x1 + x2 + x3, family = binomial(link = "logit"), na.action = na.pass)

Thanks, -Jack __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] object not found inside step() function
Hi, there, My apologies if someone asked the same question before. I searched the mailing list and found one similar post, but not what I want. The problem for me is, I use step(glm()) to do naive forward selection for logistic regression. My code is functional in the open environment. But if I wrap it up as a function, then R keeps saying object 'a' not found. Actually, data frame a is inside the function. I did some search online. I guess the reason may be that R did not keep the data in the glm() output after building the model, but I'm not sure. Can anyone please tell me how to work around this problem? Thanks a lot in advance. I am using R 2.9.0. Here is the sample code:

naivelr <- function(x, y){
  :
  :
  :
  a <- data.frame(x)
  form <- paste("y~1+", paste(grep("X.*", names(a), value=TRUE), collapse="+"), sep="")
  if(is.null(force.in) != TRUE){
    lowmo <- paste("y~1+", paste(grep("X.*", names(a)[force.in], value=TRUE), collapse="+"), sep="")
  } else {lowmo <- "y~1"}
  lower1 <- glm(lowmo, family=binomial, data=data.frame(a, y))
  upper1 <- glm(form, family=binomial, data=data.frame(a, y))
  stepout <- step(lower1, scope=list(lower=lower1, upper=upper1), direction="forward", k=0, trace=100)
  # here is the error:
  # Start:  AIC=689.62
  # y ~ 1
  # Error in data.frame(a, y) : object 'a' not found --- but a is there!
  :
  :
  :
}

Sincerely samer yuan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
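One workaround that has worked for this kind of error (offered as a sketch, since the full naivelr() isn't shown): step() refits models via update(), which re-evaluates the stored glm() call, and at that point `data.frame(a, y)` is looked up outside the function's frame, where `a` no longer exists. Building the fits with do.call() puts the already-evaluated data frame into the stored call, so refitting no longer needs to find `a`. The data below are invented for illustration:

```r
naivelr <- function(x, y) {
  a  <- data.frame(x)
  dd <- data.frame(a, y)                     # evaluate the data once, up front
  form <- paste("y ~ 1 +", paste(grep("X.*", names(a), value = TRUE), collapse = "+"))

  # do.call() embeds the actual data frame object in the recorded call,
  # so step()/update() can refit without looking up 'a' or 'dd' later
  lower1 <- do.call("glm", list(formula = y ~ 1, family = binomial, data = dd))
  upper1 <- do.call("glm", list(formula = as.formula(form), family = binomial, data = dd))

  step(lower1, scope = list(lower = lower1, upper = upper1),
       direction = "forward", trace = 0)
}

set.seed(42)
X <- matrix(rnorm(200), ncol = 2, dimnames = list(NULL, c("X1", "X2")))
fit <- naivelr(X, rbinom(100, 1, 0.5))
```

The cost is that the printed call of the fitted object contains the whole data frame; an alternative with the same effect is to pass the data via an environment the refit can see.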
Re: [R] fitting a logistic regression with mixed type of variables
On Nov 16, 2009, at 2:22 PM, Jack Luo wrote: Hi, I am trying to fit a logistic regression using glm, but my explanatory variables are of mixed type: some are numeric, some are ordinal, some are categorical. Say x1 is numeric, x2 is ordinal, x3 is categorical - is the following formula OK?

The formula's certainly OK. What may be non-OK will be your understanding of the output. The default handling of ordinal factors is a common source of questions to R-help, so read up first.

model <- glm(y ~ x1 + x2 + x3, family = binomial(link = "logit"), na.action = na.pass)

Why have you chosen that na.action option? -- David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
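As a concrete check of the classes involved (simulated data, nothing from the original post): an ordered factor enters the model with polynomial contrasts by default, an unordered factor with treatment contrasts:

```r
set.seed(1)
n  <- 100
x1 <- rnorm(n)                                                  # numeric
x2 <- factor(sample(c("low", "mid", "high"), n, TRUE),
             levels = c("low", "mid", "high"), ordered = TRUE)  # ordinal
x3 <- factor(sample(c("a", "b", "c"), n, TRUE))                 # nominal
y  <- rbinom(n, 1, 0.5)

fit <- glm(y ~ x1 + x2 + x3, family = binomial(link = "logit"))
names(coef(fit))
# ordered x2 contributes linear/quadratic terms "x2.L", "x2.Q";
# nominal x3 contributes dummy terms "x3b", "x3c"
```

This is what David's warning is about: the .L/.Q coefficients for the ordinal variable are orthogonal polynomial contrasts, not level-vs-baseline effects.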
Re: [R] pairs
I forgot to say that there are no ties in each row. So any number can occur only once in each row. Also as I mentioned earlier, actually I only need the top 50 most frequent pairs, is there a more efficient way to do it? Because I have 15000 numbers, output of all the pairs would be too long. Thank you, Cindy

On Mon, Nov 16, 2009 at 7:02 AM, David Winsemius dwinsem...@comcast.net wrote: I stuck in another 7 in one of the lines with a 2 and reasoned that we could deal with the desire for non-ordered pair counting by pasting min(x,y) to max(x,y):

dput(prmtx)
# structure(c(2, 1, 3, 9, 5, 7, 7, 8, 1, 7, 6, 5, 6, 2, 2, 7), .Dim = c(4L, 4L))
prmtx
#      [,1] [,2] [,3] [,4]
# [1,]    2    5    1    6
# [2,]    1    7    7    2
# [3,]    3    7    6    2
# [4,]    9    8    5    7

pair.str <- sapply(1:nrow(prmtx), function(z)
  apply(combn(prmtx[z,], 2), 2,
        function(x) paste(min(x[2],x[1]), max(x[2],x[1]), sep=".")))

The logic: sapply(1:nrow(prmtx), ... just loops over the rows of the matrix. combn(prmtx[z,], 2) ... returns a two-row matrix of the combinations in a single row. apply(combn(prmtx[z,], 2), 2, ... since combn( , 2) returns a matrix that has two _rows_ I needed to loop over the columns. paste(min(x[2],x[1]), max(x[2],x[1]), sep=".") ... sticks the minimum of a pair in front of the max and separates them with a period to prevent two+ digit numbers from being non-unique. Then using table() and logical tests in an index for the desired multiple pairs:

tpair <- table(pair.str)
tpair
# pair.str
# 1.2 1.5 1.6 1.7 2.3 2.5 2.6 2.7 3.6 3.7 5.6 5.7 5.8 5.9 6.7 7.7 7.8 7.9 8.9
#   2   1   1   2   1   1   2   3   1   1   1   1   1   1   1   1   1   1   1
tpair[tpair > 1]
# pair.str
# 1.2 1.7 2.6 2.7
#   2   2   2   3

-- David.

On Nov 16, 2009, at 7:02 AM, David Winsemius wrote: I'm not convinced it's right. In fact, I'm pretty sure the last step taking only the first half of the list is wrong. I also do not know if you have considered how you want to count situations like: 3 2 7 4 5 7 ... 7 3 8 6 1 2 9 2 .. How many pairs of 2-7/7-2 would that represent? 
-- David

On Nov 15, 2009, at 11:06 PM, cindy Guo wrote: Hi, David, The matrix has 20 columns. Thank you very much for your help. I think it's right, but it seems I need some time to figure it out. I am a green hand. There are so many functions here I never used before. :) Cindy

On Sun, Nov 15, 2009 at 5:19 PM, David Winsemius dwinsem...@comcast.net wrote: Assuming that the number of columns is 4, then consider this approach:

prs <- scan()
# 1: 2 5 1 6
# 5: 1 7 8 2
# 9: 3 7 6 2
# 13: 9 8 5 7
# 17:
# Read 16 items
prmtx <- matrix(prs, 4, 4, byrow=TRUE)

# Now make copies of x.y and y.x
pair.str <- sapply(1:nrow(prmtx), function(z)
  c(apply(combn(prmtx[z,], 2), 2, function(x) paste(x[1], x[2], sep=".")),
    apply(combn(prmtx[z,], 2), 2, function(x) paste(x[2], x[1], sep="."))))

tpair <- table(pair.str)
# This then gives you a duplicated list
tpair[tpair > 1]
# pair.str
# 1.2 2.1 2.6 2.7 6.2 7.2 7.8 8.7
#   2   2   2   2   2   2   2   2
# So only take the first half of the pairs:
head(tpair[tpair > 1], sum(tpair > 1)/2)
# pair.str
# 1.2 2.1 2.6 2.7
#   2   2   2   2

-- David.

On Nov 15, 2009, at 8:06 PM, David Winsemius wrote: I could of course be wrong but have you yet specified the number of columns for this pairing exercise? On Nov 15, 2009, at 5:26 PM, cindy Guo wrote: Hi, All, I have an n by m matrix with each entry between 1 and 15000. I want to know the frequency of each pair in 1:15000 that occur together in rows. So for example, if the matrix is

2 5 1 6
1 7 8 2
3 7 6 2
9 8 5 7

Pair (2,6) (un-ordered) occurs together in rows 1 and 3. I want to return the value 2 for this pair as well as that for all pairs. Is there a fast way to do this avoiding loops? Loops take too long. 
David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
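Putting the thread's pieces together, a self-contained version of the min.max pair-counting idea, run on Cindy's original 4 x 4 example (this is a transcription of David's method, not new functionality):

```r
prmtx <- matrix(c(2, 5, 1, 6,
                  1, 7, 8, 2,
                  3, 7, 6, 2,
                  9, 8, 5, 7), nrow = 4, byrow = TRUE)

# Each row contributes choose(4, 2) = 6 unordered pairs, encoded as
# "min.max" strings so that (2,6) and (6,2) count as the same pair
pair.str <- sapply(seq_len(nrow(prmtx)), function(z)
  apply(combn(prmtx[z, ], 2), 2,
        function(x) paste(min(x), max(x), sep = ".")))

tpair <- table(pair.str)
tpair[tpair > 1]
# 1.2 2.6 2.7 7.8
#   2   2   2   2
```

Pair (2,6) indeed comes out with count 2, from rows 1 and 3, as the question asked.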
Re: [R] No Visible Binding for global variable
On 11/16/2009 1:54 PM, William Dunlap wrote: -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Doran, Harold Sent: Monday, November 16, 2009 10:45 AM To: r-help@r-project.org Subject: [R] No Visible Binding for global variable While building a package, I see the following: * checking R code for possible problems ... NOTE cheat.fit: no visible binding for global variable 'Zobs' plot.jml: no visible binding for global variable 'Var1' I see the issue has come up before, but I'm having a hard time discerning how solutions applied elsewhere would apply here. The entire code for both functions is below, but the only place the variable Zobs appears in the function cheat.fit is:

cheaters <- cbind(data.frame(cheaters), exactMatch)
names(cheaters)[1] <- 'Zobs'
names(cheaters)[2] <- 'Nexact'
cheaters$Zcrit <- Zcrit
cheaters$Mean <- means
cheaters$Var <- vars
cheaters <- subset(cheaters, Zobs >= Zcrit)

The code in the codetools package does not know that subset() does not evaluate its second argument in the standard way. Hence it gives a false alarm here. Right. And if you want to keep it quiet, something like

Zobs <- NULL # to satisfy codetools

near the start of the function should work. 
Duncan Murdoch

Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com

result <- list(pairs = c(row.names(cheaters)), Ncheat = nrow(cheaters), TotalCompare = totalCompare, alpha = alpha, ExactMatch = cheaters$Nexact, Zobs = cheaters$Zobs, Zcrit = Zcrit, Mean = cheaters$Mean, Variance = cheaters$Var, Probs = stuProbs)
result

and the only place Var1 appears in the plot function is here

prop.correct <- subset(data.frame(prop.table(table(tmp[, i+1], tmp$Estimate), margin=2)), Var1 == 1)[, 2:3]

Many thanks, Harold

sessionInfo() R version 2.10.0 (2009-10-26) i386-pc-mingw32 locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base

cheat.fit <- function(dat, key, wrongChoice, alpha = .01, rfa = c('nr', 'uni', 'bsct'), bonf = c('yes','no'), con = 1e-12, lower = 0, upper = 50){
  bonf <- tolower(bonf)
  bonf <- match.arg(bonf)
  rfa <- match.arg(rfa)
  rfa <- tolower(rfa)
  dat <- t(dat)
  correctStuMat <- numeric(ncol(dat))
  for(i in 1:ncol(dat)){
    correctStuMat[i] <- mean(key==dat[,i], na.rm=TRUE)
  }
  correctClsMat <- numeric(length(key))
  for(i in 1:length(key)){
    correctClsMat[i] <- mean(key[i]==dat[i,], na.rm=TRUE)
  }
  ### this is here for cases if all students in a class
  ### did not answer the item
  correctClsMat[is.na(correctClsMat)] <- 0
  pCorr <- function(R,c,q){
    numer <- function(R,a,c,q){
      result <- sum((1-(1-R)^a)^(1/a), na.rm=TRUE) - c*q
      result
    }
    denom <- function(R,a,c,q){
      result <- sum(na.rm=TRUE, -((1 - (1 - R)^a)^(1/a) * (log((1 - (1 - R)^a)) * (1/a^2)) + (1 - (1 - R)^a)^((1/a) - 1) * ((1/a) * ((1 - R)^a * log((1 - R))))))
      result
    }
    aConst <- function(R, c, q, con){
      a <- .5 # starting value for a
      change <- 1
      while(abs(change) > con) {
        r1 <- numer(R,a,c,q)
        r2 <- denom(R,a,c,q)
        change <- r1/r2
        a <- a - change
      }
      a
    }
    bisect <- function(R, c, q, lower, upper, con){
      f <- function(a) sum((1 - (1-R)^a)^(1/a)) - c * q
      if(f(lower) * f(upper) > 0) stop("endpoints must have opposite signs")
      while(abs(lower-upper) > con){
        x = .5 * (lower+upper)
        if(f(x) * f(lower) >= 0) lower = x else upper = x
      }
      .5 * (lower+upper)
    }
    if(rfa == 'nr'){
      if(any(correctClsMat==1))
Re: [R] pairs
On Nov 16, 2009, at 2:32 PM, cindy Guo wrote: I forgot to say that there are no ties in each row. So any number can occur only once in each row. Also as I mentioned earlier, actually I only need the top 50 most frequent pairs, is there a more efficient way to do it? Because I have 15000 numbers, output of all the pairs would be too long.

?order

David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Sum over indexed value
Gunadi wrote: I am sure this is easy but I am not finding a function to do this. I have two columns in a matrix. The first column contains multiple entries of numbers from 1 to 100 (i.e. 10 ones, 8 twos etc.). The second column contains unique numbers. I want to sum the numbers in column two based on the indexed values in column one (e.g. sum of all values in column two associated with the value 1 in column one). I would like two columns in return - the indexed value in column one (i.e. this time no duplicates) and the sum in column two. How do I do this?

Supposing you had the data:

tstData <- data.frame( index = c(1,2,1,1,3,2), value = c( 0, 4, 0, 0, 7, 4 ) )

You could use the by() function to divide the data.frame and sum the value column:

sums <- by( tstData, tstData[['index']], function( slice ){ return( sum( slice[['value']] ) ) })

However, by() tends to do a poor job of cleanly expressing which values of 'index' generated the sums. I would recommend the __ply() functions in Hadley Wickham's plyr package. Specifically ddply():

require( plyr )
sums <- ddply( tstData, 'index', function( slice ){ return( data.frame( sum = sum( slice[['value']] ) ) ) })
sums
#   index sum
# 1     1   0
# 2     2   8
# 3     3   7

Hope this helps! -Charlie - Charlie Sharpsteen Undergraduate Environmental Resources Engineering Humboldt State University -- View this message in context: http://old.nabble.com/Sum-over-indexed-value-tp26376359p26378112.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
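If installing plyr is not an option, base R's aggregate() also keeps the grouping value and the sum together in one data frame, using the same made-up tstData as in Charlie's reply:

```r
tstData <- data.frame(index = c(1, 2, 1, 1, 3, 2),
                      value = c(0, 4, 0, 0, 7, 4))

# aggregate() applies sum() to 'value' within each level of 'index'
# and returns a data frame with the grouping column intact
sums <- aggregate(value ~ index, data = tstData, FUN = sum)
sums
#   index value
# 1     1     0
# 2     2     8
# 3     3     7
```

This gives exactly the two-column result the question asked for, with no duplicated index values.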
Re: [R] pairs
Do you mean if the numbers in each row are ordered? They are not, but if it's needed, we can order them. The matrix only has 5000 rows.

On Mon, Nov 16, 2009 at 1:34 PM, David Winsemius dwinsem...@comcast.net wrote: ?order __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] extracting the last row of each group in a data frame
Hi, I would like to extract the last row of each group in a data frame. The data frame is as follows

Name Value
A    1
A    2
A    3
B    4
B    8
C    2
D    3

I would like to get a data frame as

Name Value
A    3
B    8
C    2
D    3

Thank you for your suggestions in advance Jeff __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] pairs
David Winsemius wrote: ?order cindy Guo wrote: Do you mean if the numbers in each row are ordered? They are not, but if it's needed, we can order them. The matrix only has 5000 rows. No, he's suggesting you check out the order() function by calling its help page: ?order order() will sort your results into ascending or descending order. You could then pick off the top 50 by using head(). Hope that helps! -Charlie - Charlie Sharpsteen Undergraduate Environmental Resources Engineering Humboldt State University -- View this message in context: http://old.nabble.com/pairs-tp26364801p26378236.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
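Combining the two hints: once the pair counts are in a table, sorting it in decreasing order and taking head() gives the most frequent pairs. A toy sketch with invented pair labels and the top 3 standing in for the top 50:

```r
# Invented "min.max" pair labels standing in for the real pair.str vector
pair.str <- c("1.2", "1.2", "2.7", "2.7", "2.7", "5.6", "7.8", "7.8")

tpair <- table(pair.str)
top <- head(sort(tpair, decreasing = TRUE), 3)  # top 3 most frequent pairs
top
```

With the full 5000 x 20 matrix, only the table needs to be held in memory before sorting, so the complete list of pairs never has to be printed.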
Re: [R] pairs
On Nov 16, 2009, at 2:41 PM, cindy Guo wrote: Do you mean if the numbers in each row are ordered? They are not, but if it's needed, we can order them. The matrix only has 5000 rows. No, I mean type ?order at the R command line and read the help page. On Mon, Nov 16, 2009 at 1:34 PM, David Winsemius dwinsem...@comcast.net wrote: On Nov 16, 2009, at 2:32 PM, cindy Guo wrote: I forgot to say that there are no ties in each row. So any number can occur only once in each row. Also as I mentioned earlier, actually I only need the top 50 most frequent pairs, is there a more efficient way to do it? Because I have 15000 numbers, output of all the pairs would be too long. ?order Thank you, Cindy On Mon, Nov 16, 2009 at 7:02 AM, David Winsemius dwinsem...@comcast.net wrote: I stuck in another 7 in one of the lines with a 2 and reasoned that we could deal with the desire for non-ordered pair counting by pasting min(x,y) to max(x,y); dput(prmtx) structure(c(2, 1, 3, 9, 5, 7, 7, 8, 1, 7, 6, 5, 6, 2, 2, 7), .Dim = c(4L, 4L)) prmtx [,1] [,2] [,3] [,4] [1,]2516 [2,]1772 [3,]3762 [4,]9857 pair.str - sapply(1:nrow(prmtx), function(z) apply(combn(prmtx[z,], 2), 2,function(x) paste(min(x[2],x[1]), max(x[2],x[1]), sep=.))) The logic: sapply(1:nrow(prmtx), ... just loops over the rows of the matrix. combn(prmtx[z,], 2) ... returns a two row matrix of combination in a single row. apply(combn(prmtx[z,], 2), 2 ... since combn( , 2) returns a matrix that has two _rows_ I needed to loop over the columns. paste(min(x[2],x[1]), max(x[2],x[1]), sep=.) ... stick the minimum of a pair in front of the max and separates them with a period to prevent two+ digits from being non-unique Then using table() and logical tests in an index for the desired multiple pairs: tpair -table(pair.str) tpair pair.str 1.2 1.5 1.6 1.7 2.3 2.5 2.6 2.7 3.6 3.7 5.6 5.7 5.8 5.9 6.7 7.7 7.8 7.9 8.9 2 1 1 2 1 1 2 3 1 1 1 1 1 1 1 1 1 1 1 tpair[tpair1] pair.str 1.2 1.7 2.6 2.7 2 2 2 3 -- David. 
On Nov 16, 2009, at 7:02 AM, David Winsemius wrote: I'm not convinced it's right. In fact, I'm pretty sure the last step taking only the first half of the list is wrong. I also do not know if you have considered how you want to count situations like: 3 2 7 4 5 7 ... 7 3 8 6 1 2 9 2 .. How many pairs of 2-7/7-2 would that represent? -- David On Nov 15, 2009, at 11:06 PM, cindy Guo wrote: Hi, David, The matrix has 20 columns. Thank you very much for your help. I think it's right, but it seems I need some time to figure it out. I am a green hand. There are so many functions here I never used before. :) Cindy On Sun, Nov 15, 2009 at 5:19 PM, David Winsemius dwinsem...@comcast.net wrote: Assuming that the number of columns is 4, then consider this approach: prs -scan() 1: 2 5 1 6 5: 1 7 8 2 9: 3 7 6 2 13: 9 8 5 7 17: Read 16 items prmtx - matrix(prs, 4,4, byrow=T) #Now make copus of x.y and y.x pair.str - sapply(1:nrow(prmtx), function(z) c(apply(combn(prmtx[z,], 2), 2,function(x) paste(x[1],x[2], sep=.)) , apply(combn(prmtx[z,], 2), 2,function(x) paste(x[2],x[1], sep=.))) ) tpair -table(pair.str) # This then gives you a duplicated list tpair[tpair1] pair.str 1.2 2.1 2.6 2.7 6.2 7.2 7.8 8.7 2 2 2 2 2 2 2 2 # So only take the first half of the pairs: head(tpair[tpair1], sum(tpair1)/2) pair.str 1.2 2.1 2.6 2.7 2 2 2 2 -- David. On Nov 15, 2009, at 8:06 PM, David Winsemius wrote: I could of course be wrong but have you yet specified the number of columns for this pairing exercise? On Nov 15, 2009, at 5:26 PM, cindy Guo wrote: Hi, All, I have an n by m matrix with each entry between 1 and 15000. I want to know the frequency of each pair in 1:15000 that occur together in rows. So for example, if the matrix is 2 5 1 6 1 7 8 2 3 7 6 2 9 8 5 7 Pair (2,6) (un-ordered) occurs together in rows 1 and 3. I want to return the value 2 for this pair as well as that for all pairs. Is there a fast way to do this avoiding loops? Loops take too long. 
David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] Writing a data frame in an excel file
Hello, I am having trouble using the write.table function to write a data frame of 4 columns and 7530 rows. I don't know if I should just use sep = "\n" and change the .xls file into a .csv file. Thanks in advance.

- Anna Lippel
(new to R, so I will be asking a lot of questions!)
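A minimal sketch of the usual answer (the data and file name below are made up): write.csv() writes a comma-separated file that Excel opens directly, so there is no need to fiddle with sep or to rename an .xls file.

```r
## Hypothetical data frame matching the question's shape: 4 columns, 7530 rows.
df <- data.frame(id    = seq_len(7530),
                 value = rnorm(7530),
                 group = "x",
                 flag  = TRUE)

## write.csv() fixes sep = "," and quoting conventions for you;
## row.names = FALSE avoids a spurious first column in Excel.
write.csv(df, file = "mydata.csv", row.names = FALSE)

## Read it back to confirm the round trip.
chk <- read.csv("mydata.csv")
stopifnot(nrow(chk) == 7530, ncol(chk) == 4)
```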
Re: [R] extracting the last row of each group in a data frame
Hi,

You could try plyr:

  library(plyr)
  > ddply(d, .(Name), tail, 1)
    Name Value
  1    A     3
  2    B     8
  3    C     2
  4    D     3

HTH,

baptiste

2009/11/16 Hao Cen h...@andrew.cmu.edu:

  Hi, I would like to extract the last row of each group in a data frame. The data frame is as follows:

    Name Value
    A 1
    A 2
    A 3
    B 4
    B 8
    C 2
    D 3

  I would like to get a data frame as:

    Name Value
    A 3
    B 8
    C 2
    D 3

  Thank you for your suggestions in advance

  Jeff
Re: [R] fitting a logistic regression with mixed type of variables
David, Thanks for your reply. Since I am kind of new to this forum, could you please advise me on where to read those questions in R-help? In addition, I did not pay much attention to na.action; probably I should use na.action = na.omit instead of na.pass.

-Jack

On Mon, Nov 16, 2009 at 2:32 PM, David Winsemius dwinsem...@comcast.net wrote:

On Nov 16, 2009, at 2:22 PM, Jack Luo wrote:

  Hi, I am trying to fit a logistic regression using glm, but my explanatory variables are of mixed type: some are numeric, some are ordinal, some are categorical. Say x1 is numeric, x2 is ordinal, x3 is categorical; is the following formula OK?

The formula's certainly OK. What may be non-OK will be your understanding of the output. The default handling of ordinal factors is a common source of questions to R-help, so read up first.

  model <- glm(y ~ x1 + x2 + x3, family = binomial(link = "logit"), na.action = na.pass)

Why have you chosen that na.action option?

-- David Winsemius, MD
Heritage Laboratories
West Hartford, CT
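To make David's point concrete -- a minimal sketch with made-up data showing a logistic fit on mixed predictor types. The variable names match the thread; everything else (data, levels) is invented for illustration:

```r
## Hypothetical data: x1 numeric, x2 ordered factor, x3 unordered factor.
set.seed(1)
d <- data.frame(
  y  = rbinom(100, 1, 0.5),
  x1 = rnorm(100),
  x2 = factor(sample(c("low", "mid", "high"), 100, replace = TRUE),
              levels = c("low", "mid", "high"), ordered = TRUE),
  x3 = factor(sample(c("a", "b"), 100, replace = TRUE))
)

## Ordered factors get polynomial contrasts by default, so the summary
## shows terms like x2.L and x2.Q rather than one coefficient per level --
## the usual source of confusion David alludes to.
model <- glm(y ~ x1 + x2 + x3, family = binomial(link = "logit"),
             data = d, na.action = na.omit)
summary(model)
```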
Re: [R] pairs
Thank you. I will check that.

Cindy

On Mon, Nov 16, 2009 at 1:45 PM, cls59 ch...@sharpsteen.net wrote:

David Winsemius wrote:

  ?order

cindy Guo wrote:

  Do you mean if the numbers in each row are ordered? They are not, but if it's needed, we can order them. The matrix only has 5000 rows.

No, he's suggesting you check out the order() function by calling its help page:

  ?order

order() will sort your results into ascending or descending order. You could then pick off the top 50 by using head().

Hope that helps!

-Charlie

-----
Charlie Sharpsteen
Undergraduate Environmental Resources Engineering
Humboldt State University
Re: [R] extracting the last row of each group in a data frame
On Nov 16, 2009, at 2:42 PM, Hao Cen wrote:

  Hi, I would like to extract the last row of each group in a data frame. The data frame is as follows:

    Name Value
    A 1
    A 2
    A 3
    B 4
    B 8
    C 2
    D 3

by(dfname$Value, dfname$Name, tail, 1)  # which gets you a list

Or:

aggregate(dfname$Value, list(dfname$Name), tail, 1)  # which returns a data.frame

    Group.1 x
  1       A 3
  2       B 8
  3       C 2
  4       D 3

  I would like to get a data frame as:

    Name Value
    A 3
    B 8
    C 2
    D 3

-- David Winsemius, MD
Heritage Laboratories
West Hartford, CT
Re: [R] extracting the last row of each group in a data frame
Dear Jeff,

Here is a suggestion using tapply:

  data.frame(last = with(x, tapply(Value, Name, function(x) x[length(x)])))

See ?tapply for more information.

HTH,

Jorge

On Mon, Nov 16, 2009 at 2:42 PM, Hao Cen wrote:

  Hi, I would like to extract the last row of each group in a data frame. The data frame is as follows:

    Name Value
    A 1
    A 2
    A 3
    B 4
    B 8
    C 2
    D 3

  I would like to get a data frame as:

    Name Value
    A 3
    B 8
    C 2
    D 3

  Thank you for your suggestions in advance

  Jeff
Re: [R] extracting the last row of each group in a data frame
jeffc wrote:

  Hi, I would like to extract the last row of each group in a data frame. The data frame is as follows:

    Name Value
    A 1
    A 2
    A 3
    B 4
    B 8
    C 2
    D 3

  I would like to get a data frame as:

    Name Value
    A 3
    B 8
    C 2
    D 3

  Thank you for your suggestions in advance

  Jeff

Try using the base function by() or ddply() from Hadley Wickham's plyr package:

  require(plyr)

  tstData <- structure(list(Name = structure(c(1L, 1L, 1L, 2L, 2L, 3L, 4L),
      .Label = c("A", "B", "C", "D"), class = "factor"),
      Value = c(1L, 2L, 3L, 4L, 8L, 2L, 3L)),
      .Names = c("Name", "Value"), class = "data.frame",
      row.names = c(NA, -7L))

  lastRows <- ddply(tstData, 'Name', function(group) {
      return(data.frame(Value = tail(group[['Value']], n = 1)))
  })

  > lastRows
    Name Value
  1    A     3
  2    B     8
  3    C     2
  4    D     3

Hope this helps!

-Charlie

-----
Charlie Sharpsteen
Undergraduate Environmental Resources Engineering
Humboldt State University
Re: [R] object not found inside step() function
On Nov 16, 2009, at 2:31 PM, shuai yuan wrote:

  Hi, there, My apologies if someone has asked the same question before. I searched the mailing list and found one similar post, but not what I want. The problem for me is that I use step(glm()) to do naive forward selection for logistic regression. My code works in the global environment, but if I wrap it up as a function, then R keeps saying object 'a' not found -- even though data frame a is created inside the function. I did some searching online; my guess is that R does not keep the data in the glm() output after building the model, but I am not sure. Can anyone please tell me how to work around this problem? Thanks a lot in advance. I am using R 2.9.0. Here is the sample code:

    naivelr <- function(x, y) {
      ...
      a <- data.frame(x)
      form <- paste("y~1+", paste(grep("X.*", names(a), value = TRUE), collapse = "+"), sep = "")
      if (is.null(force.in) != TRUE) {
        lowmo <- paste("y~1+", paste(grep("X.*", names(a)[force.in], value = TRUE), collapse = "+"), sep = "")
      } else {
        lowmo <- "y~1"
      }
      lower1 <- glm(lowmo, family = binomial, data = data.frame(a, y))
      upper1 <- glm(form, family = binomial, data = data.frame(a, y))

You are sticking data.frame a inside another data.frame.

      stepout <- step(lower1, scope = list(lower = lower1, upper = upper1),
                      direction = "forward", k = 0, trace = 100)

That's not the way I remember step-ping. I thought you made a fit and then stepped the formulas (using the same data), rather than putting the whole glm object into a lower and an upper. I could be wrong about that, since I try to avoid using stepwise methods.

    # Here is the error:
    # Start:  AIC=689.62
    # y ~ 1
    # Error in data.frame(a, y) : object 'a' not found  --- but a is there!

But it's probably not in a form that can be interpreted. Consider adding y as a column in a.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
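One commonly suggested workaround for this class of error, sketched below with made-up names and data: step() re-evaluates the stored glm call, so if that call refers to a data frame that only existed inside a function, the re-evaluation can fail. Building the fit with do.call() embeds the evaluated data frame itself in the call, so step() no longer needs to find it by name. This is a sketch of the technique, not a fix for the exact code above:

```r
## Hypothetical forward-selection wrapper; do.call() puts the actual data
## frame (not the name 'dat') into the glm call, so step() can re-fit
## models even though 'dat' is local to this function.
naive_fwd <- function(x, y) {
  dat  <- data.frame(x, y = y)
  form <- reformulate(setdiff(names(dat), "y"), response = "y")
  lower1 <- do.call("glm", list(formula = y ~ 1, family = binomial, data = dat))
  upper1 <- do.call("glm", list(formula = form,  family = binomial, data = dat))
  step(lower1, scope = list(lower = formula(lower1), upper = formula(upper1)),
       direction = "forward", trace = 0)
}

set.seed(2)
X  <- data.frame(X1 = rnorm(50), X2 = rnorm(50))
yv <- rbinom(50, 1, 0.5)
fit <- naive_fwd(X, yv)
```

One side effect of this approach: the whole data frame is stored in fit$call, so the printed call is large; that is the price of making the fit self-contained.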