[R] R-package question
Dear R-users, Suppose I want to modify and use internal functions of an R-package as my requirement. By any way is it possible to explore the internal coding structure of a package and get a list of internal functions? thanks. -- View this message in context: http://www.nabble.com/R-package-question-tp24554613p24554613.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Zinb for Non-interger data
Sorry bit of a Newbie question, and I promise I have searched the forum already, but I'm getting a bit desperate! I have over-dispersed, zero inflated data, with variance greater than the mean, suggesting Zero-Inflated Negative Binomial - which I attempted in R with the pscl package suggested on http://www.ats.ucla.edu/stat/R/dae/zinbreg.htm However my data is non-integer with some pesky decimals (i.e. 33.12) and zinb / pscl doesn't like that - not surprising as zinb is for count data, normally whole integers etc. Does anyone know of a different zinb package that will allow non-integers or and equivalent test/ model to zinb for non-integer data? Or should I try something else like a quasi-Poisson GLM? Apologies for the Newbie question! Any help much appreciated! Thanks! -- View this message in context: http://www.nabble.com/Zinb-for-Non-interger-data-tp24550044p24550044.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem With Repeated Use Of Load/Save/Close Commands
Duncan, Thank you very much for your help. Your first work-around solved my problem, even if the second one didn't. I'm not sure if it was the elimination of the file() or the close() commands, or both, that did it. You got me going again after a couple of weeks of casting about. My hat is off to you! Thanks, Rich - Original Message - From: Duncan Murdoch murd...@stats.uwo.ca To: Marilyn Rich Short rm.sh...@comcast.net Cc: r-help@r-project.org Sent: Saturday, July 18, 2009 8:21 AM Subject: Re: [R] Problem With Repeated Use Of Load/Save/Close Commands On 18/07/2009 11:08 AM, Duncan Murdoch wrote: On 17/07/2009 7:57 PM, Marilyn Rich Short wrote: Hello, I'm having a problem in R with repeated use of the load, save, and close commands. I'm getting an error message that reads, Too many open files. I'm opening files and closing them (and unlinking them), but when I go through that process 509 times, the program halts and I get this error message: cannot open the connection with warning messages: Too many open files. I've been working on this problem for a couple of weeks and have gleaned a bit of info from different internet threads, but no solutions yet. I'm using Windows XP, SP3 and R 2.9.1. Here is my session info: R version 2.9.1 (2009-06-26) i286-pc-mingw32 locale: LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States. 1252;LC_MONETARY=English_United States. 1252;LC_NUMERIC=C;LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base I'm using the vanilla load of R, with nothing added after booting up. The problem also occurs on my Vista machine as well. The program below will induce the problem. junk = 1 PathA= tempdir() conX = paste(PathA,junk,sep=\\) outJunk = file(conX, open=wb) save(junk, file=outJunk) close(outJunk) for(i in 1:4000){ outMIA = file(conX, open=rb) load(file=outMIA) close(outMIA) closeAllConnections() unlink(conX) rm(outMIA) rm(junk) cat( i = ,i,sep= ) gc() zzz = showConnections(all=FALSE) cat( zzz = ,zzz,\n,sep= ) } There is some talk on the internet that some windows systems have a limit of 512 files that can be open at one time. Even though I'm closing my files each time, something is keeping track of how many times I've opened and closed a file in a session. I've talked to Microsoft and run a test program in Visual Studio C#, and, at the moment, it looks like the problem does not lie in the Microsoft arena. The C# program performed a similar task 10,000 times without a problem. I'm not totally convinced, but the current evidence says to look elsewhere. I've attached a script that will induce the problem. However, be warned that, if you use it, you will have to get out of R after you run it. R will no longer be able to access files such as help or sessionInfo(). This can be solved by getting out of the R GUI and back in. R E-mails from as far back as 2006 ask for help on the issue. There have been several suggestions, but no confirmed solution that I can find. You will see my attempt at these suggestions in the script [ rm(outMIA); rm(junk); closeAllConnections(); and gc(); after close(outMIA); and unlink(conX);]. For me, this becomes important because it limits the total number if iterations in nested do-loops. This is where I ran into the problem. The program above will allow you to reproduce the problem. Any suggestion would be greatly appreciated. I've somewhat localized the problem. load(file=outMIA) passes the outMIA connection to gzcon(), which handles decompression of the data. gzcon() re-opens the file, and it looks as though the original file handle is lost, because closing either the result of gzcon() or the original connection results in only the second file handle being closed. So a second workaround besides the one I sent earlier is just not to open or close outMIA. That is, rewrite it as junk = 1 PathA= tempdir() conX = paste(PathA,junk,sep=\\) outJunk = file(conX, open=wb) save(junk, file=outJunk) close(outJunk) for(i in 1:4000){ outMIA = file(conX) load(file=outMIA) rm(outMIA) rm(junk) cat( i = ,i,sep= ) gc() zzz = showConnections(all=FALSE) cat( zzz = ,zzz,\n,sep= ) } so that load() does the open and close, and things are fine. Oops, not quite fine. There's a whole set of warnings... Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal,
[R] transform(_data,...) using strptime gives an error
I have timstamped data like this: sd[1:10,] Tstamp Density Mesh50 Mesh70 Mesh100 Mesh150 Mesh200 2 2009/02/27 07:0030.50.7 10.721.432.841.6 3 2009/02/27 08:0032.21.6 12.423.334.543.0 4 2009/02/27 09:0032.74.8 13.024.035.143.5 5 2009/02/27 10:0026.70.36.517.628.136.9 6 2009/02/27 11:0026.60.96.617.028.637.9 7 2009/02/27 12:0023.36.33.414.025.534.6 8 2009/02/27 13:0025.21.15.115.427.336.8 9 2009/02/27 14:0028.60.28.719.430.940.0 10 2009/02/27 15:0028.00.68.018.630.239.3 11 2009/02/27 16:0028.30.98.318.930.539.5 The timstamps are character vectors: str(sd) 'data.frame': 591 obs. of 7 variables: $ Tstamp : chr 2009/02/27 07:00 2009/02/27 08:00 2009/02/27 09:00 2009/02/27 10:00 ... $ Density: num 30.5 32.2 32.7 26.7 26.6 23.3 25.2 28.6 28 28.3 ... $ Mesh50 : num 0.7 1.6 4.8 0.3 0.9 6.3 1.1 0.2 0.6 0.9 ... $ Mesh70 : num 10.7 12.4 13 6.5 6.6 3.4 5.1 8.7 8 8.3 ... $ Mesh100: num 21.4 23.3 24 17.6 17 14 15.4 19.4 18.6 18.9 ... $ Mesh150: num 32.8 34.5 35.1 28.1 28.6 25.5 27.3 30.9 30.2 30.5 ... $ Mesh200: num 41.6 43 43.5 36.9 37.9 34.6 36.8 40 39.3 39.5 ... - attr(*, na.action)=Class 'exclude' Named int [1:58] 1 88 89 90 250 318 319 320 321 322 ... .. ..- attr(*, names)= chr [1:58] 1 88 89 90 ... Trying to transform the timestamped character vector 'in place' using transform gives this error message: sd-transform(sd,Tstamp=strptime(Tstamp,format='%Y/%m/%d %H:%M')) Error in `[-.data.frame`(`*tmp*`, inx[matched], value = list(Tstamp = list( : replacement element 1 has 9 rows, need 591 Why? It beats me... I do have a backup of course: td-strptime(sd$Tstamp,format='%Y/%m/%d %H:%M') sd-data.frame(Tstamp=td, sd[2:7]) this works fine but is one step more complicated. Something I miss about transform()? Thanks in advance, Alex van der Spek __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] transform(_data,...) using strptime gives an error
strptime produces a POSIXlt result and you likely intended POSIXct instead: as.POSIXct(x, format = %Y/%m/%d %H:%M) for storing in a data frame. Also to avoid time zone problems down the line you might want to use chron: library(chron) as.chron(x, format = %Y/%m/%d %H:%M) See R News 4/1. On Sun, Jul 19, 2009 at 6:26 AM, am...@xs4all.nl wrote: I have timstamped data like this: sd[1:10,] Tstamp Density Mesh50 Mesh70 Mesh100 Mesh150 Mesh200 2 2009/02/27 07:00 30.5 0.7 10.7 21.4 32.8 41.6 3 2009/02/27 08:00 32.2 1.6 12.4 23.3 34.5 43.0 4 2009/02/27 09:00 32.7 4.8 13.0 24.0 35.1 43.5 5 2009/02/27 10:00 26.7 0.3 6.5 17.6 28.1 36.9 6 2009/02/27 11:00 26.6 0.9 6.6 17.0 28.6 37.9 7 2009/02/27 12:00 23.3 6.3 3.4 14.0 25.5 34.6 8 2009/02/27 13:00 25.2 1.1 5.1 15.4 27.3 36.8 9 2009/02/27 14:00 28.6 0.2 8.7 19.4 30.9 40.0 10 2009/02/27 15:00 28.0 0.6 8.0 18.6 30.2 39.3 11 2009/02/27 16:00 28.3 0.9 8.3 18.9 30.5 39.5 The timstamps are character vectors: str(sd) 'data.frame': 591 obs. of 7 variables: $ Tstamp : chr 2009/02/27 07:00 2009/02/27 08:00 2009/02/27 09:00 2009/02/27 10:00 ... $ Density: num 30.5 32.2 32.7 26.7 26.6 23.3 25.2 28.6 28 28.3 ... $ Mesh50 : num 0.7 1.6 4.8 0.3 0.9 6.3 1.1 0.2 0.6 0.9 ... $ Mesh70 : num 10.7 12.4 13 6.5 6.6 3.4 5.1 8.7 8 8.3 ... $ Mesh100: num 21.4 23.3 24 17.6 17 14 15.4 19.4 18.6 18.9 ... $ Mesh150: num 32.8 34.5 35.1 28.1 28.6 25.5 27.3 30.9 30.2 30.5 ... $ Mesh200: num 41.6 43 43.5 36.9 37.9 34.6 36.8 40 39.3 39.5 ... - attr(*, na.action)=Class 'exclude' Named int [1:58] 1 88 89 90 250 318 319 320 321 322 ... .. ..- attr(*, names)= chr [1:58] 1 88 89 90 ... Trying to transform the timestamped character vector 'in place' using transform gives this error message: sd-transform(sd,Tstamp=strptime(Tstamp,format='%Y/%m/%d %H:%M')) Error in `[-.data.frame`(`*tmp*`, inx[matched], value = list(Tstamp = list( : replacement element 1 has 9 rows, need 591 Why? It beats me... I do have a backup of course: td-strptime(sd$Tstamp,format='%Y/%m/%d %H:%M') sd-data.frame(Tstamp=td, sd[2:7]) this works fine but is one step more complicated. Something I miss about transform()? Thanks in advance, Alex van der Spek __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R-package question
spime wrote: Dear R-users, Suppose I want to modify and use internal functions of an R-package as my requirement. By any way is it possible to explore the internal coding structure of a package and get a list of internal functions? The best way is to download the source to the package, which will be on CRAN if that's where you got the package. You can see a deparsed version of the R code just by asking to print the functions (possibly prefixed with pkgname::: if they aren't exported), but the source is always better. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Zinb for Non-interger data
On 18-Jul-09 17:26:36, JPS2009 wrote: Sorry bit of a Newbie question, and I promise I have searched the forum already, but I'm getting a bit desperate! I have over-dispersed, zero inflated data, with variance greater than the mean, suggesting Zero-Inflated Negative Binomial - which I attempted in R with the pscl package suggested on http://www.ats.ucla.edu/stat/R/dae/zinbreg.htm However my data is non-integer with some pesky decimals (i.e. 33.12) and zinb / pscl doesn't like that - not surprising as zinb is for count data, normally whole integers etc. Does anyone know of a different zinb package that will allow non-integers or and equivalent test/ model to zinb for non-integer data? Or should I try something else like a quasi-Poisson GLM? Apologies for the Newbie question! Any help much appreciated! Thanks! The presence of decimals suggests that those data values are records of quantities which ought to be modelled as continuous variables. For instance, in answer to a survey question How much did you spend on alcoholic drinks yesterday, the answer would be either a positive sum of money (with decimals), or zero, depending on whether the person spent anything at all on alcohol. So: With probability p, the amount spent was positive and, conditional on being positive, has a distribution which can be modelled by a particular continuous distribution (maybe Log-normal?). With probability (1-p), the amount spent was zero. So a correct approach first requires you to face the question of how to model the positive part of the distribution. Once you have settled that question, it is then possible to see whether that particular class of problem is covered by some package in R, or whether you need to develop an approach yourself. In any case, if I am barking up the right tree above, neither negative binomial nor Poisson would, in principle, be correct for such data since, as you observe, these are intended for count data, not for data which is essentially continuous. Hoping this helps, Ted. E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk Fax-to-email: +44 (0)870 094 0861 Date: 19-Jul-09 Time: 12:25:39 -- XFMail -- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Can I use mcnemar.test for 3*3 tables (or is there a bug in the command?)
There is a function mh_test in the coin package. library(coin) mh_test(tt) The documentation states, The null hypothesis of independence of row and column totals is tested. The corresponding test for binary factors x and y is known as McNemar test. For larger tables, Stuart’s W0 statistic (Stuart, 1955, Agresti, 2002, page 422, also known as Stuart-Maxwell test) is computed. hth, david freedman Tal Galili wrote: Hello all, I wish to perform a mcnemar test for a 3 by 3 matrix. By running the slandered R command I am getting a result but I am not sure I am getting the correct one. Here is an example code: (tt - as.table(t(matrix(c(1,4,1, 0,5,5, 3,1,5), ncol = 3 mcnemar.test(tt, correct=T) #And I get: McNemar's Chi-squared test data: tt McNemar's chi-squared = 7.6667, df = 3, p-value = *0.05343* Now I was wondering if the test I just performed is the correct one. From looking at the Wikipedia article on mcnemar ( http://en.wikipedia.org/wiki/McNemar's_test), it is said that: The Stuart-Maxwell testhttp://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm is different generalization of the McNemar test, used for testing marginal homogeneity in a square table with more than two rows/columns From searching for a Stuart-Maxwell testhttp://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm in google, I found an algorithm here: http://www.m-hikari.com/ams/ams-password-2009/ams-password9-12-2009/abbasiAMS9-12-2009.pdf From running this algorithm I am getting a different P value, here is the (somewhat ugly) code I produced for this: get.d - function(xx) { length1 - dim(xx)[1] ret1 - margin.table(xx,1) - margin.table(xx,2) return(ret1) } get.s - function(xx) { the.s - xx for( i in 1:dim(xx)[1]) { for(j in 1:dim(xx)[2]) { if(i == j) { the.s[i,j] - margin.table(xx,1)[i] + margin.table(xx,2)[i] - 2*xx[i,i] } else { the.s[i,j] - -(xx[i,j] + xx[j,i]) } } } return(the.s) } chi.statistic - t(get.d(tt)[-3]) %*% solve(get.s(tt)[-3,-3]) %*% get.d(tt)[-3] paste(the P value:, pchisq(chi.statistic, 2)) #and the result was: the P value: 0.268384371053358 So to summarize my questions: 1) can I use mcnemar.test for 3*3 (or more) tables ? 2) if so, what test is being performed ( Stuart-Maxwellhttp://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm) ? 3) Do you have a recommended link to an explanation of the algorithm employed? Thanks, Tal -- -- My contact information: Tal Galili Phone number: 972-50-3373767 FaceBook: Tal Galili My Blogs: http://www.r-statistics.com/ http://www.talgalili.com http://www.biostatistics.co.il [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/Can-I-use-%22mcnemar.test%22-for-3*3-tables-%28or-is-there-a-bug-in-the-command-%29-tp24556414p24556693.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Building a big.matrix using foreach
Michael, If you have a big.matrix, you just want to iterate over the rows. I'm not in R and am just making this up on the fly (from a bar in Beijing, if you believe that): foreach(i=1:nrow(x),.combine=c) %dopar% f(x[i,]) should work, essentially applying the functin f() to the rows of x? But perhaps I misunderstand you. Please feel free to email me or Mike ( michael.k...@yale.edu) directoy with questions about bigmemory, we are very interested in applications of it to real problems. Note that the package foreach uses package iterators, and is very flexible, in case you need more general iteration in parellel. Regards, Jay Original message: Hi there! I have become a big fan of the 'foreach' package allowing me to do a lot of stuff in parallel. For example, evaluating the function f on all elements in a vector x is easily accomplished: foreach(i=1:length(x),.combine=c) %dopar% f(x[i]) Here the .combine=c option tells foreach to combine output using the c()-function. That is, to return it as a vector. Today I discovered the 'bigmemory' package, and I would like to contruct a big.matrix in a parralel fashion row by row. To use foreach I see no other way than to come up with a substitute for c in the .combine option. I have checked out the big.matrix manual, but I can't find a function suitable for just that. Actually, I wouldn't even know how to do it for a usual matrix. Any clues? Thanks! -- Michael Knudsen micknud...@gmail.com http://lifeofknudsen.blogspot.com/ -- John W. Emerson (Jay) Associate Professor of Statistics Department of Statistics Yale University http://www.stat.yale.edu/~jay [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Can I use mcnemar.test for 3*3 tables (or is there a bug in the command?)
Hello David,Thank you for your answer. Do you know then what does the mcnemar.test do in the case of a 3*3 table ? Because the results for the simple example I gave are rather different (P value of 0.053 VS 0.73) In case the mcnemar can't really handle a 3*3 matrix (or more), shouldn't there be an error massage for this case? (if so, who should I turn to, in order to report this?) Thanks again, Tal On Sun, Jul 19, 2009 at 3:47 PM, David Freedman 3.14da...@gmail.com wrote: There is a function mh_test in the coin package. library(coin) mh_test(tt) The documentation states, The null hypothesis of independence of row and column totals is tested. The corresponding test for binary factors x and y is known as McNemar test. For larger tables, Stuarts W0 statistic (Stuart, 1955, Agresti, 2002, page 422, also known as Stuart-Maxwell test) is computed. hth, david freedman Tal Galili wrote: Hello all, I wish to perform a mcnemar test for a 3 by 3 matrix. By running the slandered R command I am getting a result but I am not sure I am getting the correct one. Here is an example code: (tt - as.table(t(matrix(c(1,4,1, 0,5,5, 3,1,5), ncol = 3 mcnemar.test(tt, correct=T) #And I get: McNemar's Chi-squared test data: tt McNemar's chi-squared = 7.6667, df = 3, p-value = *0.05343* Now I was wondering if the test I just performed is the correct one. From looking at the Wikipedia article on mcnemar ( http://en.wikipedia.org/wiki/McNemar's_test), it is said that: The Stuart-Maxwell testhttp://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm is different generalization of the McNemar test, used for testing marginal homogeneity in a square table with more than two rows/columns From searching for a Stuart-Maxwell testhttp://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm in google, I found an algorithm here: http://www.m-hikari.com/ams/ams-password-2009/ams-password9-12-2009/abbasiAMS9-12-2009.pdf From running this algorithm I am getting a different P value, here is the (somewhat ugly) code I produced for this: get.d - function(xx) { length1 - dim(xx)[1] ret1 - margin.table(xx,1) - margin.table(xx,2) return(ret1) } get.s - function(xx) { the.s - xx for( i in 1:dim(xx)[1]) { for(j in 1:dim(xx)[2]) { if(i == j) { the.s[i,j] - margin.table(xx,1)[i] + margin.table(xx,2)[i] - 2*xx[i,i] } else { the.s[i,j] - -(xx[i,j] + xx[j,i]) } } } return(the.s) } chi.statistic - t(get.d(tt)[-3]) %*% solve(get.s(tt)[-3,-3]) %*% get.d(tt)[-3] paste(the P value:, pchisq(chi.statistic, 2)) #and the result was: the P value: 0.268384371053358 So to summarize my questions: 1) can I use mcnemar.test for 3*3 (or more) tables ? 2) if so, what test is being performed ( Stuart-Maxwell http://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm) ? 3) Do you have a recommended link to an explanation of the algorithm employed? Thanks, Tal -- -- My contact information: Tal Galili Phone number: 972-50-3373767 FaceBook: Tal Galili My Blogs: http://www.r-statistics.com/ http://www.talgalili.com http://www.biostatistics.co.il [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/Can-I-use-%22mcnemar.test%22-for-3*3-tables-%28or-is-there-a-bug-in-the-command-%29-tp24556414p24556693.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- -- My contact information: Tal Galili Phone number: 972-50-3373767 FaceBook: Tal Galili My Blogs: http://www.r-statistics.com/ http://www.talgalili.com http://www.biostatistics.co.il [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Hmisc, Design, summary.Design plot- changing confidence intervals, adding color or decreasing font size
David Winsemius wrote: On Jul 18, 2009, at 11:14 AM, Elizabeth Stanny wrote: Frank, I had already tried q=c(0.7,0.8,0.9,0.95) as an argument in plot.summary.Design and it did not work (i.e., CIs were not plotted). Could you point me to an example using plot.summary.Design that uses q as argument and has output? I would greatly appreciate it. I am encountering an error when using any number of probabilities other than 5 and I think it relates to how the confbar col= defaults are set. plot(s, log=TRUE, at=c(.1,.5,1,1.5,2,4,8), q=c(0.7,0.8,0.9,0.95)) Error in confbar(nbar - (i - is + 1) + 1, effect[i], se[i], q = q, type = h, : q and col must have same length ?confbar Using the example on the help page with a modified q vector of length 5 gives output: plot(s, log=TRUE, at=c(.1,.5,1,1.5,2,4,8), q = c(0.6, 0.8, 0.95, 0.975, 0.99)) Experimentation shows that providing a col vector of the same length as q also succeeds: plot(s, log=TRUE, at=c(.1,.5,1,1.5,2,4,8), q=c(0.7,0.8,0.9,0.95), col=gray(c(0, 0.25, 0.75, 1))) Exactly, as per the documentation. I have just added a sentence to the help file to make this more clear. Thanks Frank -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] problem with as.POSIXct and daylight savings time
[was [R] end of daylight saving time] Hi, I got no reply with the previous subject line, probably a bad choice of subject on my part, so here it is again. I read from the help on DateTimeClasses and various posts on this list that, quite logically, one needs to specify if DST is active or not when time is between 1 and 2 AM on the first Sunday in November (for North America in recent years). This I can do for on date at a time: a - as.POSIXct(2008-11-02 01:30:00, tz=EST5EDT) # to get automatic use of DST b - as.POSIXct(2008-11-02 01:30:00, tz=EST) # to tell T this is the second occurrence of 1:30 that day, in ST difftime(b,a) Time difference of 1 hours But why can't I do the following, which appears to be a typical R way of doing things, to handle several date-times at once? c - rep(2008-11-02 01:30:00, 2) tzone = c(EST5EDT, EST) as.POSIXct(c, tz=tzone) Erreur dans strptime(xx, f - %Y-%m-%d %H:%M:%OS, tz = tz) : valeur 'tz' incorrecte ??? Thanks, Denis Chabot sessionInfo() R version 2.9.1 Patched (2009-07-09 r48929) x86_64-apple-darwin9.7.0 locale: fr_CA.UTF-8/fr_CA.UTF-8/C/C/fr_CA.UTF-8/fr_CA.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] tools_2.9.1 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Can I use mcnemar.test for 3*3 tables (or is there a bug in the command?)
Hello all, I wish to perform a mcnemar test for a 3 by 3 matrix. By running the slandered R command I am getting a result but I am not sure I am getting the correct one. Here is an example code: (tt - as.table(t(matrix(c(1,4,1, 0,5,5, 3,1,5), ncol = 3 mcnemar.test(tt, correct=T) #And I get: McNemar's Chi-squared test data: tt McNemar's chi-squared = 7.6667, df = 3, p-value = *0.05343* Now I was wondering if the test I just performed is the correct one. From looking at the Wikipedia article on mcnemar ( http://en.wikipedia.org/wiki/McNemar's_test), it is said that: The Stuart-Maxwell testhttp://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm is different generalization of the McNemar test, used for testing marginal homogeneity in a square table with more than two rows/columns From searching for a Stuart-Maxwell testhttp://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm in google, I found an algorithm here: http://www.m-hikari.com/ams/ams-password-2009/ams-password9-12-2009/abbasiAMS9-12-2009.pdf From running this algorithm I am getting a different P value, here is the (somewhat ugly) code I produced for this: get.d - function(xx) { length1 - dim(xx)[1] ret1 - margin.table(xx,1) - margin.table(xx,2) return(ret1) } get.s - function(xx) { the.s - xx for( i in 1:dim(xx)[1]) { for(j in 1:dim(xx)[2]) { if(i == j) { the.s[i,j] - margin.table(xx,1)[i] + margin.table(xx,2)[i] - 2*xx[i,i] } else { the.s[i,j] - -(xx[i,j] + xx[j,i]) } } } return(the.s) } chi.statistic - t(get.d(tt)[-3]) %*% solve(get.s(tt)[-3,-3]) %*% get.d(tt)[-3] paste(the P value:, pchisq(chi.statistic, 2)) #and the result was: the P value: 0.268384371053358 So to summarize my questions: 1) can I use mcnemar.test for 3*3 (or more) tables ? 2) if so, what test is being performed ( Stuart-Maxwellhttp://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm) ? 3) Do you have a recommended link to an explanation of the algorithm employed? Thanks, Tal -- -- My contact information: Tal Galili Phone number: 972-50-3373767 FaceBook: Tal Galili My Blogs: http://www.r-statistics.com/ http://www.talgalili.com http://www.biostatistics.co.il [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Building a big.matrix using foreach
Another thing to realize if you are doing this in parallel, f(x[i,]) is being executed on each of the worker R sessions. Now, a big.matrix object is essentially a pointer to an object created in C++ and, a pointer address space is specific to a process (in this case the master R session). As a result, when you call f(x[i,]) in each of the workers, you are trying to use a pointer that no longer points to the original C++ object. To get around this you need to either use a shared.big.matrix or filebacked.big.matrix object and you need to use the descriptor in the foreach function. desc = describe(x) foreach (i=1:nrow(x), .combine=c, .packages='bigmemory') %dopar% { x = attach.big.matrix(desc) f(x[i,]) } The descriptor is just a list of information that is not process specific. So, in the example, the descriptor is passed to each of the workers. Then the worker attaches to the big.matrix object which is shared across R sessions, and then function is called on the attached big.matrix object. I may have provided more detail than you were interested. The thing to take away is that if you are working with foreach and bigmemory in parallel, you need to use descriptors to the workers and attach to the big.matrix object. Thanks, Mike On Sun, Jul 19, 2009 at 8:29 AM, Jay Emerson jayemer...@gmail.com wrote: Michael, If you have a big.matrix, you just want to iterate over the rows. I'm not in R and am just making this up on the fly (from a bar in Beijing, if you believe that): foreach(i=1:nrow(x),.combine=c) %dopar% f(x[i,]) should work, essentially applying the functin f() to the rows of x? But perhaps I misunderstand you. Please feel free to email me or Mike ( michael.k...@yale.edu) directoy with questions about bigmemory, we are very interested in applications of it to real problems. Note that the package foreach uses package iterators, and is very flexible, in case you need more general iteration in parellel. Regards, Jay Original message: Hi there! I have become a big fan of the 'foreach' package allowing me to do a lot of stuff in parallel. For example, evaluating the function f on all elements in a vector x is easily accomplished: foreach(i=1:length(x),.combine=c) %dopar% f(x[i]) Here the .combine=c option tells foreach to combine output using the c()-function. That is, to return it as a vector. Today I discovered the 'bigmemory' package, and I would like to contruct a big.matrix in a parralel fashion row by row. To use foreach I see no other way than to come up with a substitute for c in the .combine option. I have checked out the big.matrix manual, but I can't find a function suitable for just that. Actually, I wouldn't even know how to do it for a usual matrix. Any clues? Thanks! -- Michael Knudsen micknud...@gmail.com http://lifeofknudsen.blogspot.com/ -- John W. Emerson (Jay) Associate Professor of Statistics Department of Statistics Yale University http://www.stat.yale.edu/~jay http://www.stat.yale.edu/%7Ejay [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Hmisc, Design, summary.Design plot- changing confidence intervals, adding color or decreasing font size
Elizabeth Stanny wrote: Hi, 1. I want 95% not 99% confidence intervals in my summary.Design plot using the Design package. Putting conf.int http://conf.int/=.95 as an argument in plot does not work. The default appears to be .99 not .95 as stated in the package Design manual (p. 164). (hope this works--I haven't posted from the digest before) It may not be immediately obvious, but the color (col) argument must always match q in length. When you change the length of q to display a number of CIs that is different from the default (5, I think), col is still set to correspond to that default number of CIs. So, you also must change col so that it matches q in length. For example, this will not work because there is no matching col argument: plot(summymodel,q=.95) You'll get this error msg: Error in confbar(nbar - (i - is + 1) + 1, effect[i], se[i], q = q, type = h, q and col must have same length But this will: plot(summymodel,q=.95,col=2) If you wanted two sets of CIs, say 95 and 99: plot(summymodel,q=c(.95,.99),col=c(2,5)) Mike Babyak Duke Medical Center [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] package X12
Ana Quiterio wrote: Dear All. I can't execute X12GUI() from package x12. I have problems in the step : Please choose the x12 binaries. Can anyone help me? Thanks in advance, AnaQ I know barely anything about the package. I gather from a bit of googling that X12 is a time-series analysis package from the US Census Bureau, and that the x12 package provides an R interface. Have you followed any available installation instructions that come with the package? Have you tried contacting the package author (see help(package=x12) ) ? Ben Bolker -- View this message in context: http://www.nabble.com/package-X12-tp24548691p24557843.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] problem with as.POSIXct and daylight savings time
[was [R] end of daylight saving time] Hi, I got no reply with the previous subject line, probably a bad choice of subject on my part, so here it is again. I read from the help on DateTimeClasses and various posts on this list that, quite logically, one needs to specify if DST is active or not when time is between 1 and 2 AM on the first Sunday in November (for North America in recent years). This I can do for on date at a time: a - as.POSIXct(2008-11-02 01:30:00, tz=EST5EDT) # to get automatic use of DST b - as.POSIXct(2008-11-02 01:30:00, tz=EST) # to tell T this is the second occurrence of 1:30 that day, in ST difftime(b,a) Time difference of 1 hours But why can't I do the following, which appears to be a typical R way of doing things, to handle several date-times at once? c - rep(2008-11-02 01:30:00, 2) tzone = c(EST5EDT, EST) as.POSIXct(c, tz=tzone) Erreur dans strptime(xx, f - %Y-%m-%d %H:%M:%OS, tz = tz) : valeur 'tz' incorrecte ??? Thanks, Denis Chabot sessionInfo() R version 2.9.1 Patched (2009-07-09 r48929) x86_64-apple-darwin9.7.0 locale: fr_CA.UTF-8/fr_CA.UTF-8/C/C/fr_CA.UTF-8/fr_CA.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] tools_2.9.1 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] problem with as.POSIXct and daylight savings time
On 19/07/2009 11:23 AM, Denis Chabot wrote: [was [R] end of daylight saving time] Hi, I got no reply with the previous subject line, probably a bad choice of subject on my part, so here it is again. I read from the help on DateTimeClasses and various posts on this list that, quite logically, one needs to specify if DST is active or not when time is between 1 and 2 AM on the first Sunday in November (for North America in recent years). This I can do for on date at a time: a - as.POSIXct(2008-11-02 01:30:00, tz=EST5EDT) # to get automatic use of DST b - as.POSIXct(2008-11-02 01:30:00, tz=EST) # to tell T this is the second occurrence of 1:30 that day, in ST difftime(b,a) Time difference of 1 hours But why can't I do the following, which appears to be a typical R way of doing things, to handle several date-times at once? c - rep(2008-11-02 01:30:00, 2) tzone = c(EST5EDT, EST) as.POSIXct(c, tz=tzone) Erreur dans strptime(xx, f - %Y-%m-%d %H:%M:%OS, tz = tz) : valeur 'tz' incorrecte ??? Objects of the POSIXlt and POSIXct classes don't support multiple time zones, so if you specified several time zones on input, how would the conversion functions decide which one to use for output? You'll need to write your own wrapper function to make this decision, and do the conversions separately for each input timezone. Why don't those classes support a separate time zone for each entry? Presumably because their designer never thought anyone would want to do that. Duncan Murdoch Thanks, Denis Chabot sessionInfo() R version 2.9.1 Patched (2009-07-09 r48929) x86_64-apple-darwin9.7.0 locale: fr_CA.UTF-8/fr_CA.UTF-8/C/C/fr_CA.UTF-8/fr_CA.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] tools_2.9.1 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] problem with as.POSIXct and daylight savings time
Thank you very much Duncan. I'll follow your suggestion. Why do I want to do what the designer did not think anyone would want to do? I have data acquisition equipment taking measurements every 15 min or so for days at a time, and I need to compile all such experiments in a master data set. The data acquisition equipment automatically switches to DST in spring and back to ST in autumn, which I did not disable because it is easier to work with while we are running the experiments. I could use chron to ignore time zones and daylight savings time, but this would not be of much help as whether or not I use as.POSIXct or chron, there is one day of the year that has 25 h and I need to deal with that 25th hour or I'll lose one hour of data! Denis Le 09-07-19 à 11:45, Duncan Murdoch a écrit : On 19/07/2009 11:23 AM, Denis Chabot wrote: [was [R] end of daylight saving time] Hi, I got no reply with the previous subject line, probably a bad choice of subject on my part, so here it is again. I read from the help on DateTimeClasses and various posts on this list that, quite logically, one needs to specify if DST is active or not when time is between 1 and 2 AM on the first Sunday in November (for North America in recent years). This I can do for on date at a time: a - as.POSIXct(2008-11-02 01:30:00, tz=EST5EDT) # to get automatic use of DST b - as.POSIXct(2008-11-02 01:30:00, tz=EST) # to tell T this is the second occurrence of 1:30 that day, in ST difftime(b,a) Time difference of 1 hours But why can't I do the following, which appears to be a typical R way of doing things, to handle several date-times at once? c - rep(2008-11-02 01:30:00, 2) tzone = c(EST5EDT, EST) as.POSIXct(c, tz=tzone) Erreur dans strptime(xx, f - %Y-%m-%d %H:%M:%OS, tz = tz) : valeur 'tz' incorrecte ??? Objects of the POSIXlt and POSIXct classes don't support multiple time zones, so if you specified several time zones on input, how would the conversion functions decide which one to use for output? You'll need to write your own wrapper function to make this decision, and do the conversions separately for each input timezone. Why don't those classes support a separate time zone for each entry? Presumably because their designer never thought anyone would want to do that. Duncan Murdoch Thanks, Denis Chabot sessionInfo() R version 2.9.1 Patched (2009-07-09 r48929) x86_64-apple-darwin9.7.0 locale: fr_CA.UTF-8/fr_CA.UTF-8/C/C/fr_CA.UTF-8/fr_CA.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] tools_2.9.1 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] problem with as.POSIXct and daylight savings time
Have you considered the timeDate package? Spencer Denis Chabot wrote: Thank you very much Duncan. I'll follow your suggestion. Why do I want to do what the designer did not think anyone would want to do? I have data acquisition equipment taking measurements every 15 min or so for days at a time, and I need to compile all such experiments in a master data set. The data acquisition equipment automatically switches to DST in spring and back to ST in autumn, which I did not disable because it is easier to work with while we are running the experiments. I could use chron to ignore time zones and daylight savings time, but this would not be of much help as whether or not I use as.POSIXct or chron, there is one day of the year that has 25 h and I need to deal with that 25th hour or I'll lose one hour of data! Denis Le 09-07-19 à 11:45, Duncan Murdoch a écrit : On 19/07/2009 11:23 AM, Denis Chabot wrote: [was [R] end of daylight saving time] Hi, I got no reply with the previous subject line, probably a bad choice of subject on my part, so here it is again. I read from the help on DateTimeClasses and various posts on this list that, quite logically, one needs to specify if DST is active or not when time is between 1 and 2 AM on the first Sunday in November (for North America in recent years). This I can do for on date at a time: a - as.POSIXct(2008-11-02 01:30:00, tz=EST5EDT) # to get automatic use of DST b - as.POSIXct(2008-11-02 01:30:00, tz=EST) # to tell T this is the second occurrence of 1:30 that day, in ST difftime(b,a) Time difference of 1 hours But why can't I do the following, which appears to be a typical R way of doing things, to handle several date-times at once? c - rep(2008-11-02 01:30:00, 2) tzone = c(EST5EDT, EST) as.POSIXct(c, tz=tzone) Erreur dans strptime(xx, f - %Y-%m-%d %H:%M:%OS, tz = tz) : valeur 'tz' incorrecte ??? Objects of the POSIXlt and POSIXct classes don't support multiple time zones, so if you specified several time zones on input, how would the conversion functions decide which one to use for output? You'll need to write your own wrapper function to make this decision, and do the conversions separately for each input timezone. Why don't those classes support a separate time zone for each entry? Presumably because their designer never thought anyone would want to do that. Duncan Murdoch Thanks, Denis Chabot sessionInfo() R version 2.9.1 Patched (2009-07-09 r48929) x86_64-apple-darwin9.7.0 locale: fr_CA.UTF-8/fr_CA.UTF-8/C/C/fr_CA.UTF-8/fr_CA.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] tools_2.9.1 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] problem with as.POSIXct and daylight savings time
Thanks for the suggestion, Spencer. I will take a look and will report to the list if I find this a better solution for my situation. Might take a couple of days though. Denis Le 09-07-19 à 12:42, spencerg a écrit : Have you considered the timeDate package? Spencer Denis Chabot wrote: Thank you very much Duncan. I'll follow your suggestion. Why do I want to do what the designer did not think anyone would want to do? I have data acquisition equipment taking measurements every 15 min or so for days at a time, and I need to compile all such experiments in a master data set. The data acquisition equipment automatically switches to DST in spring and back to ST in autumn, which I did not disable because it is easier to work with while we are running the experiments. I could use chron to ignore time zones and daylight savings time, but this would not be of much help as whether or not I use as.POSIXct or chron, there is one day of the year that has 25 h and I need to deal with that 25th hour or I'll lose one hour of data! Denis Le 09-07-19 à 11:45, Duncan Murdoch a écrit : On 19/07/2009 11:23 AM, Denis Chabot wrote: [was [R] end of daylight saving time] Hi, I got no reply with the previous subject line, probably a bad choice of subject on my part, so here it is again. I read from the help on DateTimeClasses and various posts on this list that, quite logically, one needs to specify if DST is active or not when time is between 1 and 2 AM on the first Sunday in November (for North America in recent years). This I can do for on date at a time: a - as.POSIXct(2008-11-02 01:30:00, tz=EST5EDT) # to get automatic use of DST b - as.POSIXct(2008-11-02 01:30:00, tz=EST) # to tell T this is the second occurrence of 1:30 that day, in ST difftime(b,a) Time difference of 1 hours But why can't I do the following, which appears to be a typical R way of doing things, to handle several date-times at once? c - rep(2008-11-02 01:30:00, 2) tzone = c(EST5EDT, EST) as.POSIXct(c, tz=tzone) Erreur dans strptime(xx, f - %Y-%m-%d %H:%M:%OS, tz = tz) : valeur 'tz' incorrecte ??? Objects of the POSIXlt and POSIXct classes don't support multiple time zones, so if you specified several time zones on input, how would the conversion functions decide which one to use for output? You'll need to write your own wrapper function to make this decision, and do the conversions separately for each input timezone. Why don't those classes support a separate time zone for each entry? Presumably because their designer never thought anyone would want to do that. Duncan Murdoch Thanks, Denis Chabot sessionInfo() R version 2.9.1 Patched (2009-07-09 r48929) x86_64-apple-darwin9.7.0 locale: fr_CA.UTF-8/fr_CA.UTF-8/C/C/fr_CA.UTF-8/fr_CA.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] tools_2.9.1 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Sampling of non-overlapping intervals of variable length
Hi, I hope I am not repeating a question which has been posed already. I am trying to do the following in the most efficient way: I would like to sample from a finite (large) set of integers n non-overlapping intervals, where each interval i has a different, set length L_i (which is the number of integers in the interval). I had the idea to sample recursively on a vector with the already chosen intervals discarded but that seems to be too complicated. Any suggestions on that? Thanks a lot. Hadassa -- Hadassa Brunschwig PhD Student Department of Statistics The Hebrew University of Jerusalem http://www.stat.huji.ac.il __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] problem with as.POSIXct and daylight savings time
as.chron.POSIXt has an offset= argument and it may be a vector: args(chron:::as.chron.POSIXt) function (x, offset = 0, tz = GMT, ...) On Sun, Jul 19, 2009 at 9:10 AM, Denis Chabotchab...@globetrotter.net wrote: [was [R] end of daylight saving time] Hi, I got no reply with the previous subject line, probably a bad choice of subject on my part, so here it is again. I read from the help on DateTimeClasses and various posts on this list that, quite logically, one needs to specify if DST is active or not when time is between 1 and 2 AM on the first Sunday in November (for North America in recent years). This I can do for on date at a time: a - as.POSIXct(2008-11-02 01:30:00, tz=EST5EDT) # to get automatic use of DST b - as.POSIXct(2008-11-02 01:30:00, tz=EST) # to tell T this is the second occurrence of 1:30 that day, in ST difftime(b,a) Time difference of 1 hours But why can't I do the following, which appears to be a typical R way of doing things, to handle several date-times at once? c - rep(2008-11-02 01:30:00, 2) tzone = c(EST5EDT, EST) as.POSIXct(c, tz=tzone) Erreur dans strptime(xx, f - %Y-%m-%d %H:%M:%OS, tz = tz) : valeur 'tz' incorrecte ??? Thanks, Denis Chabot sessionInfo() R version 2.9.1 Patched (2009-07-09 r48929) x86_64-apple-darwin9.7.0 locale: fr_CA.UTF-8/fr_CA.UTF-8/C/C/fr_CA.UTF-8/fr_CA.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] tools_2.9.1 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] (-8)^(1/3) == NaN?
It' true, but if you type -8^(1/3) returns -2, and if you type -8^1/3 it returns -2.6, maybe there are some rules about parenthesis... regards Víctor De: r-help-boun...@r-project.org en nombre de Dave DeBarr Enviado el: sáb 18/07/2009 05:04 Para: r-help@r-project.org Asunto: [R] (-8)^(1/3) == NaN? Why does the expression (-8)^(1/3) return NaN, instead of -2? This is not answered by http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-are-powers-of-negative-numbers-wrong_003f Thanks, Dave [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] (-8)^(1/3) == NaN?
It is the order of operator precedence. Look at what you typed: -8^1/3 is parsed as -(8^1)/3 = -2.6 On Sun, Jul 19, 2009 at 1:12 PM, Victor Manuel Garcia Guerrerovmgar...@colmex.mx wrote: It' true, but if you type -8^(1/3) returns -2, and if you type -8^1/3 it returns -2.6, maybe there are some rules about parenthesis... regards Víctor De: r-help-boun...@r-project.org en nombre de Dave DeBarr Enviado el: sáb 18/07/2009 05:04 Para: r-help@r-project.org Asunto: [R] (-8)^(1/3) == NaN? Why does the expression (-8)^(1/3) return NaN, instead of -2? This is not answered by http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-are-powers-of-negative-numbers-wrong_003f Thanks, Dave [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] space in column name
I read a table from Microsoft Access using RODBC. Some of the variables had a name with a space in it. R has no problem with it but I do. I cannot find out how to specify the space names(alltime) [1] IDLVL7 Ref Pv No Ref Pv Name DOS Pt Last Name Pt First Name MRN CPT CPT Desc DxCd1 DxCd2 DxCd3 DxCd4 [15] DOE But what do I do if I want to do something such as this alltime[grep(MIDDLE EAR EXPLORE,alltime$CPT Desc,] Error: unexpected symbol in alltime[grep(MIDDLE EAR EXPLORE,alltime$CPT Desc Farrel Buchinsky Sent from Pittsburgh, Pennsylvania, United States [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] trouble using optim for maximalisation of 2-parameter function
try: # first argument of llik.nor has to be the parameter llik.nor-function(theta,x){mu-theta[1];sig-theta[2];-length(x)*log(sqrt(2*pi))-length(x)*log(sig)-(1/(2*sig**2))*sum((x-mu)**2)} # optim by default does minimization # setting fnscale = -1 one obtains a maximization problem optim(c(1,5), llik.nor, x=x, control = list(fnscale = -1)) hth, Matthias Anjali Sing schrieb: Hello, I am having trouble using optim. I want to maximalise a function to its parameters [kind of like: univariate maximum likelihood estimation, but i wrote the likelihood function myself because of data issues ] When I try to optimize a function for only one parameter there is no problem: llik.expo-function(x,lam){(length(x)*log(lam))-(length(x)*log(1-exp(-1*lam* *cons*)))-lam*sum(x)} cons- data-c(.) expomx-optimize(llik.expo,c(0,100),maximum=TRUE,tol=0.0001,x=data) expomx To optimize to two parameters you can't use optimize, so I tried the following to test my input: llik.nor-function(x,theta){mu-theta[1];sig-theta[2];-length(x)*log(sqrt(2*pi))-length(x)*log(sig)-(1/(2*sig**2))*sum((x-mu)**2)} x-c(-1,4,6,4,2) normx-optim(c(1,20),llik.nor) the output should be close to parameters: mu=3 and sigma=2.366 [This I calculated by hand to compare with the output] but in stead of output I get an error: Error in fn(par, ...) : argument theta is missing, with no default I don't understand why this happened? I hope someone can help me with this for I am still a [R]ookie. Kind regards, Sonko Lady [Anjali] [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Dr. Matthias Kohl www.stamats.de __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] space in column name
Farrel Buchinsky-3 wrote: I read a table from Microsoft Access using RODBC. Some of the variables had a name with a space in it. R has no problem with it but I do. I cannot find out how to specify the space names(alltime) [1] IDLVL7 Ref Pv No Ref Pv Name DOS Pt Last Name Pt First Name MRN CPT CPT Desc DxCd1 DxCd2 DxCd3 DxCd4 [15] DOE But what do I do if I want to do something such as this alltime[grep(MIDDLE EAR EXPLORE,alltime$CPT Desc,] Error: unexpected symbol in alltime[grep(MIDDLE EAR EXPLORE,alltime$CPT Desc Farrel Buchinsky The following might work: alltime[grep(MIDDLE EAR EXPLORE,alltime[[ CPT Desc ]] ] -Charlie - Charlie Sharpsteen Undergraduate Environmental Resources Engineering Humboldt State University -- View this message in context: http://www.nabble.com/space-in-column-name-tp24559626p24559726.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] space in column name
cls59 wrote: The following might work: alltime[grep(MIDDLE EAR EXPLORE,alltime[[ CPT Desc ]] ] -Charlie ACK! Terribly sorry about the double post- but I forgot to close the quote. It should be: alltime[grep(MIDDLE EAR EXPLORE,alltime[[ CPT Desc ]] ] Maybe I should wait until AFTER my morning coffee to browse R-help... - Charlie Sharpsteen Undergraduate Environmental Resources Engineering Humboldt State University -- View this message in context: http://www.nabble.com/space-in-column-name-tp24559626p24559754.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] space in column name
I sifted some more and read about a workaround for the problem. I could simply rename the columns so that there were no more spaces names(alltime) -gsub( ,., names(alltime)) names(alltime) [1] IDLVL7 Ref.Pv.No Ref.Pv.Name DOS Pt.Last.Name Pt.First.Name MRN CPT CPT.Desc DxCd1 DxCd2 DxCd3 DxCd4 [15] DOE Farrel Buchinsky Google Voice Tel: (412) 567-7870 Sent from Pittsburgh, Pennsylvania, United States On Sun, Jul 19, 2009 at 14:32, Farrel Buchinsky fjb...@gmail.com wrote: I read a table from Microsoft Access using RODBC. Some of the variables had a name with a space in it. R has no problem with it but I do. I cannot find out how to specify the space names(alltime) [1] IDLVL7 Ref Pv No Ref Pv Name DOS Pt Last Name Pt First Name MRN CPT CPT Desc DxCd1 DxCd2 DxCd3 DxCd4 [15] DOE But what do I do if I want to do something such as this alltime[grep(MIDDLE EAR EXPLORE,alltime$CPT Desc,] Error: unexpected symbol in alltime[grep(MIDDLE EAR EXPLORE,alltime$CPT Desc Farrel Buchinsky Sent from Pittsburgh, Pennsylvania, United States [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Sampling of non-overlapping intervals of variable length
On Jul 19, 2009, at 1:05 PM, Hadassa Brunschwig wrote: Hi, I hope I am not repeating a question which has been posed already. I am trying to do the following in the most efficient way: I would like to sample from a finite (large) set of integers n non- overlapping intervals, where each interval i has a different, set length L_i (which is the number of integers in the interval). I had the idea to sample recursively on a vector with the already chosen intervals discarded but that seems to be too complicated. It might be ridiculously easy if you sampled on an index of a group of intervals. Why not pose the question in the form of example data.frames or other classes of R objects? Specification of the desired output would be essential. I think further specification of the sampling strategy would also help because I am unable to understand what sort of probability model you are hoping to apply. Any suggestions on that? Thanks a lot. Hadassa -- Hadassa Brunschwig PhD Student Department of Statistics David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] space in column name
use 'make.names' make.names(MIDDLE EAR EXPLORE) [1] MIDDLE.EAR.EXPLORE On Sun, Jul 19, 2009 at 2:32 PM, Farrel Buchinskyfjb...@gmail.com wrote: I read a table from Microsoft Access using RODBC. Some of the variables had a name with a space in it. R has no problem with it but I do. I cannot find out how to specify the space names(alltime) [1] ID LVL7 Ref Pv No Ref Pv Name DOS Pt Last Name Pt First Name MRN CPT CPT Desc DxCd1 DxCd2 DxCd3 DxCd4 [15] DOE But what do I do if I want to do something such as this alltime[grep(MIDDLE EAR EXPLORE,alltime$CPT Desc,] Error: unexpected symbol in alltime[grep(MIDDLE EAR EXPLORE,alltime$CPT Desc Farrel Buchinsky Sent from Pittsburgh, Pennsylvania, United States [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] trouble using optim for maximalisation of 2-parameter function
Hello, I am having trouble using optim. I want to maximalise a function to its parameters [kind of like: univariate maximum likelihood estimation, but i wrote the likelihood function myself because of data issues ] When I try to optimize a function for only one parameter there is no problem: llik.expo-function(x,lam){(length(x)*log(lam))-(length(x)*log(1-exp(-1*lam* *cons*)))-lam*sum(x)} cons- data-c(.) expomx-optimize(llik.expo,c(0,100),maximum=TRUE,tol=0.0001,x=data) expomx To optimize to two parameters you can't use optimize, so I tried the following to test my input: llik.nor-function(x,theta){mu-theta[1];sig-theta[2];-length(x)*log(sqrt(2*pi))-length(x)*log(sig)-(1/(2*sig**2))*sum((x-mu)**2)} x-c(-1,4,6,4,2) normx-optim(c(1,20),llik.nor) the output should be close to parameters: mu=3 and sigma=2.366 [This I calculated by hand to compare with the output] but in stead of output I get an error: Error in fn(par, ...) : argument theta is missing, with no default I don't understand why this happened? I hope someone can help me with this for I am still a [R]ookie. Kind regards, Sonko Lady [Anjali] [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] space in column name
Farrel Buchinsky-3 wrote: I sifted some more and read about a workaround for the problem. I could simply rename the columns so that there were no more spaces names(alltime) -gsub( ,., names(alltime)) That would certainly be a solution. The method I was trying to demonstrate is that in addition to using the $ sign syntax, lists and data.frames can be accessed using strings- much like a hash. In order to do so, you must use the [[]] or [] methods. For example, just as you can do the following for a vector: x - c(1,3,5) for( i in 1:length(x) ){ print( x[ i ] ) } You could do something similar for your data.frame: for( i in names(alltime) ){ print( alltime[[ i ]] ) } The important thing to note is that if you tried to use alltime$i in the above loop, you would get a bunch of NULLS as there is no component named 'i'. Hope that helps! -Charlie - Charlie Sharpsteen Undergraduate Environmental Resources Engineering Humboldt State University -- View this message in context: http://www.nabble.com/space-in-column-name-tp24559626p24559869.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Sampling of non-overlapping intervals of variable length
Hi I am not sure what you mean by sampling an index of a group of intervals. I will try to give an example: Let's assume I have a vector 1:100. Let's say I have 10 intervals of different but known length, say, c(4,6,11,2,8,14,7,2,18,32). For simulation purposes I have to sample those 10 intervals 1000 times. The requirement is, however, that they should be of those lengths and should not be overlapping. In short, I would like to obtain a 10x1000 matrix with sampled intervals. Thanks Hadassa On Sun, Jul 19, 2009 at 9:48 PM, David Winsemiusdwinsem...@comcast.net wrote: On Jul 19, 2009, at 1:05 PM, Hadassa Brunschwig wrote: Hi, I hope I am not repeating a question which has been posed already. I am trying to do the following in the most efficient way: I would like to sample from a finite (large) set of integers n non-overlapping intervals, where each interval i has a different, set length L_i (which is the number of integers in the interval). I had the idea to sample recursively on a vector with the already chosen intervals discarded but that seems to be too complicated. It might be ridiculously easy if you sampled on an index of a group of intervals. Why not pose the question in the form of example data.frames or other classes of R objects? Specification of the desired output would be essential. I think further specification of the sampling strategy would also help because I am unable to understand what sort of probability model you are hoping to apply. Any suggestions on that? Thanks a lot. Hadassa -- Hadassa Brunschwig PhD Student Department of Statistics David Winsemius, MD Heritage Laboratories West Hartford, CT -- Hadassa Brunschwig PhD Student Department of Statistics The Hebrew University of Jerusalem http://www.stat.huji.ac.il __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] abline(v= x) in plot with time formated xaxis not working
Sorry for the lack of the plot. It's is just a simple xy plot with xvalues formated as shown in my post (strptime()). Anyway, here a working example: rm(list=ls()) x-c(100*1500:1559, 100*1600:1659) x-strptime(x, format=%H%M%S) y-c(CO2$conc, CO2$conc[1:36]) plot(x, y, pch=18, type=p, ylab=yfoo, xlab=xfoo, yaxs=r) abline(h=100, v=0, col=gray60, lty=2) text.xvalue-151500 text.xvalue-strptime(text.xvalue, format=%H%M%S) text(text.xvalue, 80, detection threshold, col=gray60, cex=0.8) event.1-152833 event.1-strptime(event.1, format=%H%M%S) abline(v=event.1, col=red) The format for the xvalues works in the case of text positioning, but doesn't in the case of abline. JimHoltman replied this topic via email with a solution: -as.POSIXct(strptime(y, format=%H%M%S)) I still don't know why this second function is necessary since everything works without it, except for the abline-thing... Nevertheless, thanks for all the suggestions! Fritz -- View this message in context: http://www.nabble.com/abline%28v%3D-x%29-in-plot-with-time-formated-xaxis-not-working-tp24505884p24558044.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Can I use mcnemar.test for 3*3 tables (or is there a bug in the command?)
On Sun, 19 Jul 2009, Tal Galili wrote: Hello David,Thank you for your answer. Do you know then what does the mcnemar.test do in the case of a 3*3 table ? print(mcnemar.test) will show you what it does. Because the results for the simple example I gave are rather different (P value of 0.053 VS 0.73) The test mcnemar.test() constructs is one of symmetry, which is equivalent to marginal homogenity in hierarchical log-linear models as I recall from Bishop, Fienberg, and Holland's 1975 opus on count data. Stuart-Maxwell uses the dispersion matrix of marginal difference. These are two different tests. I suspect that Stuart-Maxwell is less susceptible to continuity issues in very sparse tables, which may account for the difference you see here. In case the mcnemar can't really handle a 3*3 matrix (or more), shouldn't there be an error massage for this case? (if so, who should I turn to, in order to report this?) Well, the code is pretty straightforward and mcnemar.test(matrix(1:16,4)) returns 11.5495 which is correct. It looks like there is nothing to report. 3,1,5), ncol = 3 Chuck Thanks again, Tal On Sun, Jul 19, 2009 at 3:47 PM, David Freedman 3.14da...@gmail.com wrote: There is a function mh_test in the coin package. library(coin) mh_test(tt) The documentation states, The null hypothesis of independence of row and column totals is tested. The corresponding test for binary factors x and y is known as McNemar test. For larger tables, Stuart?s W0 statistic (Stuart, 1955, Agresti, 2002, page 422, also known as Stuart-Maxwell test) is computed. hth, david freedman Tal Galili wrote: Hello all, I wish to perform a mcnemar test for a 3 by 3 matrix. By running the slandered R command I am getting a result but I am not sure I am getting the correct one. Here is an example code: (tt - as.table(t(matrix(c(1,4,1, 0,5,5, 3,1,5), ncol = 3 mcnemar.test(tt, correct=T) #And I get: McNemar's Chi-squared test data: tt McNemar's chi-squared = 7.6667, df = 3, p-value = *0.05343* Now I was wondering if the test I just performed is the correct one. From looking at the Wikipedia article on mcnemar ( http://en.wikipedia.org/wiki/McNemar's_test), it is said that: The Stuart-Maxwell testhttp://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm is different generalization of the McNemar test, used for testing marginal homogeneity in a square table with more than two rows/columns From searching for a Stuart-Maxwell testhttp://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm in google, I found an algorithm here: http://www.m-hikari.com/ams/ams-password-2009/ams-password9-12-2009/abbasiAMS9-12-2009.pdf From running this algorithm I am getting a different P value, here is the (somewhat ugly) code I produced for this: get.d - function(xx) { length1 - dim(xx)[1] ret1 - margin.table(xx,1) - margin.table(xx,2) return(ret1) } get.s - function(xx) { the.s - xx for( i in 1:dim(xx)[1]) { for(j in 1:dim(xx)[2]) { if(i == j) { the.s[i,j] - margin.table(xx,1)[i] + margin.table(xx,2)[i] - 2*xx[i,i] } else { the.s[i,j] - -(xx[i,j] + xx[j,i]) } } } return(the.s) } chi.statistic - t(get.d(tt)[-3]) %*% solve(get.s(tt)[-3,-3]) %*% get.d(tt)[-3] paste(the P value:, pchisq(chi.statistic, 2)) #and the result was: the P value: 0.268384371053358 So to summarize my questions: 1) can I use mcnemar.test for 3*3 (or more) tables ? 2) if so, what test is being performed ( Stuart-Maxwell http://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm) ? 3) Do you have a recommended link to an explanation of the algorithm employed? Thanks, Tal -- -- My contact information: Tal Galili Phone number: 972-50-3373767 FaceBook: Tal Galili My Blogs: http://www.r-statistics.com/ http://www.talgalili.com http://www.biostatistics.co.il [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/Can-I-use-%22mcnemar.test%22-for-3*3-tables-%28or-is-there-a-bug-in-the-command-%29-tp24556414p24556693.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- -- My contact information: Tal Galili Phone number: 972-50-3373767 FaceBook: Tal Galili My Blogs:
Re: [R] Sampling of non-overlapping intervals of variable length
On Jul 19, 2009, at 3:11 PM, Hadassa Brunschwig wrote: Hi I am not sure what you mean by sampling an index of a group of intervals. I will try to give an example: If you had a dataframe of the following sort: dfint start stop 3 7 12 20 40 45 60 72 And you wanted to generate a set of 100 samples with equal probability of occurring in any one of those intervals, you might sample first on the index of the intervals: idx=sample(1:4, 100, replace=TRUE) some sort of appropriate iterative construct ### and then sample within the intervals. sample((dfint[idx,1]):(dfint[idx,2]), 1) end loop Let's assume I have a vector 1:100. Let's say I have 10 intervals of different but known length, say, c(4,6,11,2,8,14,7,2,18,32). For simulation purposes I have to sample those 10 intervals 1000 times. The requirement is, however, that they should be of those lengths and should not be overlapping. In short, I would like to obtain a 10x1000 matrix with sampled intervals. I am trying to understand how the vector 1:100 relates to the intervals. What do you mean by sample the 10 intervals? For one thing what you have offered are not intervals at all. Are you saying (in part at least) that you want equal probabilities of a sampled element to come from each of the intervals, however, they might be defined? Or do you want the sampling probabilities to vary from interval to interval. I say again, ... a concrete example could do marvels in communicating your goals. Do Thanks Hadassa On Sun, Jul 19, 2009 at 9:48 PM, David Winsemiusdwinsem...@comcast.net wrote: On Jul 19, 2009, at 1:05 PM, Hadassa Brunschwig wrote: Hi, I hope I am not repeating a question which has been posed already. I am trying to do the following in the most efficient way: I would like to sample from a finite (large) set of integers n non-overlapping intervals, where each interval i has a different, set length L_i (which is the number of integers in the interval). I had the idea to sample recursively on a vector with the already chosen intervals discarded but that seems to be too complicated. It might be ridiculously easy if you sampled on an index of a group of intervals. Why not pose the question in the form of example data.frames or other classes of R objects? Specification of the desired output would be essential. I think further specification of the sampling strategy would also help because I am unable to understand what sort of probability model you are hoping to apply. Any suggestions on that? Thanks a lot. Hadassa -- Hadassa Brunschwig PhD Student Department of Statistics David Winsemius, MD Heritage Laboratories West Hartford, CT -- Hadassa Brunschwig PhD Student Department of Statistics The Hebrew University of Jerusalem http://www.stat.huji.ac.il David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Sampling of non-overlapping intervals of variable length
On Sun, 19 Jul 2009, Hadassa Brunschwig wrote: Hi I am not sure what you mean by sampling an index of a group of intervals. I will try to give an example: Let's assume I have a vector 1:100. Let's say I have 10 intervals of different but known length, say, c(4,6,11,2,8,14,7,2,18,32). For simulation purposes I have to sample those 10 intervals 1000 times. The requirement is, however, that they should be of those lengths and should not be overlapping. In short, I would like to obtain a 10x1000 matrix with sampled intervals. Something like this: lens - c(4,6,11,2,8,14,7,2,18,32) perm.lens - sample(lens) sort(sample(1e06-sum(lens)+length(lens),length(lens)))+cumsum(c(0,head(perm.lens,-1))) [1] 15424 261927 430276 445976 451069 546578 656123 890494 939714 969643 The vector above gives the starting points for the intervals whose lengths are perm.lens. I'd bet every introductory combinatorics book has a problem or example in which the expression for the number of ways in which K ordered objects can be assigned to I groups consisting of n_i adjacent objects each is constructed. The construction is along the lines of the calculation above. HTH, Chuck Thanks Hadassa On Sun, Jul 19, 2009 at 9:48 PM, David Winsemiusdwinsem...@comcast.net wrote: On Jul 19, 2009, at 1:05 PM, Hadassa Brunschwig wrote: Hi, I hope I am not repeating a question which has been posed already. I am trying to do the following in the most efficient way: I would like to sample from a finite (large) set of integers n non-overlapping intervals, where each interval i has a different, set length L_i (which is the number of integers in the interval). I had the idea to sample recursively on a vector with the already chosen intervals discarded but that seems to be too complicated. It might be ridiculously easy if you sampled on an index of a group of intervals. Why not pose the question in the form of example data.frames or other classes of R objects? Specification of the desired output would be essential. I think further specification of the sampling strategy would also help because I am unable to understand what sort of probability model you are hoping to apply. Any suggestions on that? Thanks a lot. Hadassa -- Hadassa Brunschwig PhD Student Department of Statistics David Winsemius, MD Heritage Laboratories West Hartford, CT -- Hadassa Brunschwig PhD Student Department of Statistics The Hebrew University of Jerusalem http://www.stat.huji.ac.il __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Charles C. Berry(858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] abline(v= x) in plot with time formated xaxis not working
The reason that you need the 'as.POSIXct' is that the 'abline' function does not recognize variables with POSIXlt class. methods(abline) no methods were found Warning message: In methods(abline) : function 'abline' appears not to be generic methods(plot) [1] plot.acf* plot.data.frame*plot.Date* plot.decomposed.ts* plot.default [6] plot.dendrogram*plot.densityplot.ecdf plot.factor*plot.formula* [11] plot.hclust*plot.histogram* plot.HoltWinters* plot.isoreg*plot.lm [16] plot.medpolish* plot.mlmplot.POSIXct* plot.POSIXlt* plot.ppr* [21] plot.prcomp*plot.princomp* plot.profile.nls* plot.sarplot.spec [26] plot.spec.coherency plot.spec.phase plot.srx.file plot.stepfunplot.stl* [31] plot.table* plot.ts plot.tskernel* plot.TukeyHSD 'plot' knows how to handle POSIXct/lt classes. You need to at least convert to POSIXct since the value will be within the range of what the x-axis is: str(event.1) POSIXlt[1:9], format: 2009-07-19 15:28:33 par('usr') [1] 1248015314.4 1248023025.6 58.8 1036.2 unclass(as.POSIXct(event.1)) [1] 1248017313 attr(,tzone) [1] GMT This shows the extent of the x-axis and then what the value of event.1 will be after conversion to POSIXct. HTH On Sun, Jul 19, 2009 at 11:33 AM, F!!p...@students.uni-mainz.de wrote: Sorry for the lack of the plot. It's is just a simple xy plot with xvalues formated as shown in my post (strptime()). Anyway, here a working example: rm(list=ls()) x-c(100*1500:1559, 100*1600:1659) x-strptime(x, format=%H%M%S) y-c(CO2$conc, CO2$conc[1:36]) plot(x, y, pch=18, type=p, ylab=yfoo, xlab=xfoo, yaxs=r) abline(h=100, v=0, col=gray60, lty=2) text.xvalue-151500 text.xvalue-strptime(text.xvalue, format=%H%M%S) text(text.xvalue, 80, detection threshold, col=gray60, cex=0.8) event.1-152833 event.1-strptime(event.1, format=%H%M%S) abline(v=event.1, col=red) The format for the xvalues works in the case of text positioning, but doesn't in the case of abline. JimHoltman replied this topic via email with a solution: -as.POSIXct(strptime(y, format=%H%M%S)) I still don't know why this second function is necessary since everything works without it, except for the abline-thing... Nevertheless, thanks for all the suggestions! Fritz -- View this message in context: http://www.nabble.com/abline%28v%3D-x%29-in-plot-with-time-formated-xaxis-not-working-tp24505884p24558044.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Can I use mcnemar.test for 3*3 tables (or is there a bug in the command?)
Charles C. Berry wrote: The test mcnemar.test() constructs is one of symmetry, which is equivalent to marginal homogenity in hierarchical log-linear models as I recall from Bishop, Fienberg, and Holland's 1975 opus on count data. Umm, er... Symmetry in the 3x3 table is a 3DF hypothesis, whereas marginal homogeneity has 2DF, so unless I'm missing a fine point in the requirement of hierarchical log-linear, I'd say that one implies the other, but not the other way around. E.g., you can easily check that the following two matrices have the same homogeneous margins, but only one is symmetric 3 2 1 2 3 2 1 2 3 3 1 2 3 3 1 0 3 3 -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - (p.dalga...@biostat.ku.dk) FAX: (+45) 35327907 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] (-8)^(1/3) == NaN?
On Sun, Jul 19, 2009 at 7:33 PM, jim holtmanjholt...@gmail.com wrote: -8^1/3 is parsed as -(8^1)/3 = -2.6 However the following is evaluated as one would expect: 8^(1/3) [1] 2 -8^(1/3) [1] -2 Perhaps it is parsed in this way: -(8^(1/3)) [1] -2 Liviu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] (-8)^(1/3) == NaN?
On Sun, Jul 19, 2009 at 12:28 AM, jim holtmanjholt...@gmail.com wrote: First of all, read FAQ 7.31 to understand that 1/3 is not representable in floating point. Also a^b is actually exp(log(a) * b) and log(-8) is not valid (NaN). If this is so, why would the following evaluate as expected? (-8)^(3) [1] -512 Liviu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] svm works but tune.svm give error
Noah Silverman wrote: Hello, I'm using the e1071 library for SVM functions. I can quickly train an SVM with: svm(formula = label ~ ., data = testdata) That works well. I want to tune the parameters, so I tried: tune.svm(label ~ ., data=testdata[1:2000, ], gamma=10^(-6:3), cost=10^(1:2)) THIS FAILS WITH AN ERROR: 'names' attribute [199] must be the same length as the vector [184] I don't understand why this is happening. We do not know, since you forgot to specify a reproducible example. Anyway, is label in your workspace and/or in the testdata? Uwe Ligges Presumably, if the data is correct for training an SVM, then it should also be correct for the tune.svm function. Can anyone help me solve this one? Thanks! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Comparing loadings (next to each other)
Adam D. I. Kramer wrote: Dear colleagues, I've been running some principal components analyses, which generate tables of loadings that I'm interested in looking at. print(f1$rot$load,cutoff=.4) is what I use, and it gives me what I want. However, I'm now interested in comparing these loadings across a few data sets. In other words, I would like R to match the loadings on rownames() and display them next to each other, e.g.: Loadings: PC 1PC 2PC 1PC 2PC 1PC 2 col10.400.450.90 col20.800.55 col30.770.700.42 ...I could certainly just cbind them together, but then I can't class them as loadings: class(x) - loadings What is x? What is loadings? Where is the reproducible code? Uwe Ligges Error in class(x) - loadings : cannot coerce type 'closure' to vector of type 'character' ...and I'm very interested in using the cutoff feature of print.loadings. Any suggestions? I could go in and alter print.loadings myself, but if there's an easier way let me know. Many thanks, -- Adam D. I. Kramer a...@uoregon.edu Ph.D. Candidate, Social and Personality Psycholgoy University of Oregon __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] (-8)^(1/3) == NaN?
If the power that a number is being raised to is integer, then is does evaluate honoring the unary minus. (-2) ^ 5 #integer power [1] -32 (-2) ^ 5.1 [1] NaN -8^(1/3) is parsed as -(8^(1/3)) according to operator precedence. On Sun, Jul 19, 2009 at 4:49 PM, Liviu Androniclandronim...@gmail.com wrote: On Sun, Jul 19, 2009 at 12:28 AM, jim holtmanjholt...@gmail.com wrote: First of all, read FAQ 7.31 to understand that 1/3 is not representable in floating point. Also a^b is actually exp(log(a) * b) and log(-8) is not valid (NaN). If this is so, why would the following evaluate as expected? (-8)^(3) [1] -512 Liviu -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] ifelse choices in a data.frame?
Hi, In my data.frame I wanted to essentially write If(Test) c*d else c+d but that doesn't work. I found I could do it mathematically, but it seems forced and won't scale well for nested logic. I have two examples below writing columns e f, but I don't think the code is self-documenting as it depends on knowing that Test is a TRUE/FALSE. Is there a better way to do the following? Thanks, Mark DF - data.frame(cbind(a=1:4, b=1:2, c=1:8, d=1:16, e=0, f=0)) DF$Test - with(DF, a == b) DF$e = (DF$c*DF$d) * DF$Test + (DF$c+DF$d) * !DF$Test DF$f = with(DF, (c*d)*Test + (c+d)*!Test) DF __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ifelse choices in a data.frame?
On 19/07/2009 5:17 PM, Mark Knecht wrote: Hi, In my data.frame I wanted to essentially write If(Test) c*d else c+d but that doesn't work. I found I could do it mathematically, but it seems forced and won't scale well for nested logic. I have two examples below writing columns e f, but I don't think the code is self-documenting as it depends on knowing that Test is a TRUE/FALSE. Is there a better way to do the following? Thanks, Mark DF - data.frame(cbind(a=1:4, b=1:2, c=1:8, d=1:16, e=0, f=0)) DF$Test - with(DF, a == b) DF$e = (DF$c*DF$d) * DF$Test + (DF$c+DF$d) * !DF$Test DF$f = with(DF, (c*d)*Test + (c+d)*!Test) Use ifelse(). DF$f - with(DF, ifelse(a == b, c*d, c+d)) Duncan Murdoch DF __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ifelse choices in a data.frame?
use ifelse: DF - data.frame(cbind(a=1:4, b=1:2, c=1:8, d=1:16, e=0, f=0)) DF$Test - with(DF, a == b) DF$e = (DF$c*DF$d) * DF$Test + (DF$c+DF$d) * !DF$Test DF$f = with(DF, (c*d)*Test + (c+d)*!Test) #or DF$f.1 - ifelse(DF$Test, DF$c * DF$d, DF$c + DF$d) head(DF) a b c d e f Test f.1 1 1 1 1 1 1 1 TRUE 1 2 2 2 2 2 4 4 TRUE 4 3 3 1 3 3 6 6 FALSE 6 4 4 2 4 4 8 8 FALSE 8 5 1 1 5 5 25 25 TRUE 25 6 2 2 6 6 36 36 TRUE 36 On Sun, Jul 19, 2009 at 5:17 PM, Mark Knechtmarkkne...@gmail.com wrote: Hi, In my data.frame I wanted to essentially write If(Test) c*d else c+d but that doesn't work. I found I could do it mathematically, but it seems forced and won't scale well for nested logic. I have two examples below writing columns e f, but I don't think the code is self-documenting as it depends on knowing that Test is a TRUE/FALSE. Is there a better way to do the following? Thanks, Mark DF - data.frame(cbind(a=1:4, b=1:2, c=1:8, d=1:16, e=0, f=0)) DF$Test - with(DF, a == b) DF$e = (DF$c*DF$d) * DF$Test + (DF$c+DF$d) * !DF$Test DF$f = with(DF, (c*d)*Test + (c+d)*!Test) DF __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ifelse choices in a data.frame?
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Mark Knecht Sent: Sunday, July 19, 2009 2:18 PM To: r-help Subject: [R] ifelse choices in a data.frame? Hi, In my data.frame I wanted to essentially write If(Test) c*d else c+d but that doesn't work. I found I could do it mathematically, but it seems forced and won't scale well for nested logic. I have two examples below writing columns e f, but I don't think the code is self-documenting as it depends on knowing that Test is a TRUE/FALSE. Is there a better way to do the following? Thanks, Mark DF - data.frame(cbind(a=1:4, b=1:2, c=1:8, d=1:16, e=0, f=0)) DF$Test - with(DF, a == b) DF$e = (DF$c*DF$d) * DF$Test + (DF$c+DF$d) * !DF$Test DF$f = with(DF, (c*d)*Test + (c+d)*!Test) DF Mark, Why can't you use ifelse() ? DF$g - with(DF, ifelse(Test,c*d,c+d)) Have I missed something in what you are doing? Dan Daniel Nordlund Bothell, WA USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ifelse choices in a data.frame?
That's much better. thanks! I was sure I'd looked at that but then for some reason stuck with just if. Again, as a total a newb I appreciate the help. - Mark On Sun, Jul 19, 2009 at 2:26 PM, Duncan Murdochmurd...@stats.uwo.ca wrote: On 19/07/2009 5:17 PM, Mark Knecht wrote: Hi, In my data.frame I wanted to essentially write If(Test) c*d else c+d but that doesn't work. I found I could do it mathematically, but it seems forced and won't scale well for nested logic. I have two examples below writing columns e f, but I don't think the code is self-documenting as it depends on knowing that Test is a TRUE/FALSE. Is there a better way to do the following? Thanks, Mark DF - data.frame(cbind(a=1:4, b=1:2, c=1:8, d=1:16, e=0, f=0)) DF$Test - with(DF, a == b) DF$e = (DF$c*DF$d) * DF$Test + (DF$c+DF$d) * !DF$Test DF$f = with(DF, (c*d)*Test + (c+d)*!Test) Use ifelse(). DF$f - with(DF, ifelse(a == b, c*d, c+d)) Duncan Murdoch DF __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] (-8)^(1/3) == NaN?
On 20/07/2009, at 9:13 AM, jim holtman wrote: If the power that a number is being raised to is integer, then is does evaluate honoring the unary minus. (-2) ^ 5 #integer power [1] -32 (-2) ^ 5.1 [1] NaN snip I was vaguely aware of this ... but it now triggers in my mind the question of how the ^ function decides when the exponent is an integer. A bit of experimentation seems to indicate that, e.g., (-2)^x ``works'' if (and only if?) round(x)==x returns TRUE. Note that (-2)^x may NOT ``work'' in some cases were all.equal(x,round (x)) returns TRUE. Young players should also be aware of the following trap. It can happen that n + epsilon ``is an integer'' according to my rule, but m + epsilon is NOT an integer according to this rule. Where m and n are both integers. E.g.: eps - 0.4e-15 x - 5+eps x==round(x) [1] TRUE y - 3+eps y==round(y) [1] FALSE This is of course due to the exigencies of how n and m are represented in floating point arithmetic. Not too deep once you're aware of the problem, but it can still be a ``gotcha'' if one is not alert. (Be alert. The world needs more lerts!) cheers, Rolf Turner P. S. Perhaps young players should be reminded at this point that is.integer() is no help here. This function tells you about the ***storage mode*** of its argument. Only. R. T. ## Attention:\ This e-mail message is privileged and confid...{{dropped:9}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] combining 4 vectors into one
Hi, How can I perform the following y1y2y3y4 1 0 3 2 0 1 2 1 3 2 0 1 0 5 1 0 into ... y 1 0 3 2 0 1 2 1 3 2 0 1 0 5 1 0 Please cc me on reply as I subscribe to the digest. Thanks! Chris __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] (-8)^(1/3) == NaN?
It also works for raising a number to a negative integer: (-3)^(-3) [1] -0.03703704 On Sun, Jul 19, 2009 at 6:23 PM, Rolf Turnerr.tur...@auckland.ac.nz wrote: On 20/07/2009, at 9:13 AM, jim holtman wrote: If the power that a number is being raised to is integer, then is does evaluate honoring the unary minus. (-2) ^ 5 #integer power [1] -32 (-2) ^ 5.1 [1] NaN snip I was vaguely aware of this ... but it now triggers in my mind the question of how the ^ function decides when the exponent is an integer. A bit of experimentation seems to indicate that, e.g., (-2)^x ``works'' if (and only if?) round(x)==x returns TRUE. Note that (-2)^x may NOT ``work'' in some cases were all.equal(x,round(x)) returns TRUE. Young players should also be aware of the following trap. It can happen that n + epsilon ``is an integer'' according to my rule, but m + epsilon is NOT an integer according to this rule. Where m and n are both integers. E.g.: eps - 0.4e-15 x - 5+eps x==round(x) [1] TRUE y - 3+eps y==round(y) [1] FALSE This is of course due to the exigencies of how n and m are represented in floating point arithmetic. Not too deep once you're aware of the problem, but it can still be a ``gotcha'' if one is not alert. (Be alert. The world needs more lerts!) cheers, Rolf Turner P. S. Perhaps young players should be reminded at this point that is.integer() is no help here. This function tells you about the ***storage mode*** of its argument. Only. R. T. ## Attention:This e-mail message is privileged and confidential. If you are not theintended recipient please delete the message and notify the sender.Any views or opinions presented are solely those of the author. This e-mail has been scanned and cleared by MailMarshalwww.marshalsoftware.com ## -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] combining 4 vectors into one
Try this: x y1 y2 y3 y4 1 1 0 3 2 2 0 1 2 1 3 3 2 0 1 4 0 5 1 0 as.vector(t(x)) [1] 1 0 3 2 0 1 2 1 3 2 0 1 0 5 1 0 On Sun, Jul 19, 2009 at 7:25 PM, Christopher Desjardinscddesjard...@gmail.com wrote: Hi, How can I perform the following y1 y2 y3 y4 1 0 3 2 0 1 2 1 3 2 0 1 0 5 1 0 into ... y 1 0 3 2 0 1 2 1 3 2 0 1 0 5 1 0 Please cc me on reply as I subscribe to the digest. Thanks! Chris __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] combining 4 vectors into one
Thanks. That worked. Chris On 7/19/09 6:41 PM, jim holtman wrote: Try this: x y1 y2 y3 y4 1 1 0 3 2 2 0 1 2 1 3 3 2 0 1 4 0 5 1 0 as.vector(t(x)) [1] 1 0 3 2 0 1 2 1 3 2 0 1 0 5 1 0 On Sun, Jul 19, 2009 at 7:25 PM, Christopher Desjardinscddesjard...@gmail.com wrote: Hi, How can I perform the following y1y2y3y4 1 0 3 2 0 1 2 1 3 2 0 1 0 5 1 0 into ... y 1 0 3 2 0 1 2 1 3 2 0 1 0 5 1 0 Please cc me on reply as I subscribe to the digest. Thanks! Chris __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Can I use mcnemar.test for 3*3 tables (or is there a bug in the command?)
On Sun, 19 Jul 2009, Peter Dalgaard wrote: Charles C. Berry wrote: The test mcnemar.test() constructs is one of symmetry, which is equivalent to marginal homogenity in hierarchical log-linear models as I recall from Bishop, Fienberg, and Holland's 1975 opus on count data. Umm, er... Symmetry in the 3x3 table is a 3DF hypothesis, whereas marginal homogeneity has 2DF, so unless I'm missing a fine point in the requirement of hierarchical log-linear, I'd say that one implies the other, but not the other way around. Right, symmetry equals marginal homogenity plus 'quasi-symmetry' - a condition on the odds-ratios of a two way table and here that condition uses one degree of freedom. But, representing marginal homogenity in log-linear models gets sticky without that quasi-symmetry condition. --- Taking m_{ij} to be the expected cell frequencies in a two way table, the log-linear model for the two way table is log m_{ij} = \mu + \mu_{1(i)} + \mu_{2(j)} + \mu_{12(ij)} with side conditions that any of the subscripted \mu terms sums to zero over any of its subscripts. In the notation here, \mu is an intercept, \mu_1 terms are row effects, \mu_2 terms are column effects, and \mu_{12} terms are interactions of the row and columns. The parenthical terms (i), (j), or (ij) index the row, column, or cell. In the case of the 3 x 3 table, there are 1, 2, 2, and 4 degrees of freedom respectively for each of the sets of terms in the saturated log-linear model. --- Marginal homogenity says m_{i+} = m_{+i}, all i, taking m_{ij} to be the expected cell frequencies and the {i+} notation to indicate summation over the missing subscript. --- Trying to set up a log-linear model for marginal homogeneity would lead you to equate the row and column effects: log m_{ij} = \mu + \mu_{1(i)} + \mu_{1(j)} + \mu_{12(ij)} but this does not imply marginal homogenity given the side conditions unless the \mu_{12(ij)} obey additional constraints which also implies symmetry. E.g., you can easily check that the following two matrices have the same homogeneous margins, but only one is symmetric 3 2 1 2 3 2 1 2 3 3 1 2 3 3 1 0 3 3 If you want to represent this last table as m_{ij} = \exp(\mu + \mu_{1(i)} + \mu_{1(j)} + \mu_{12(ij)}) you cannot get there with the side conditions that are imposed on \mu_{12}. You need additional terms. Chuck -- O__ Peter Dalgaard ??ster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ - (p.dalga...@biostat.ku.dk) FAX: (+45) ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ 35327907 Charles C. Berry(858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Arules questions. I need some help please
Question 2a) I am also working with arules package and I have the following problem let suppose the matrix b like: b-matrix(c(1,1,1,1,1,1,0,0,1,1,1,1,0,0,1,1,0,1,1,1,1,1,1,1),nrow=6) rownames(b)=c(T1, T2, T3, T4, T5, T6) colnames(b)=c(It1, It2, It3, It4) bt-as(b, transactions) rules-apriori(bt, parameter = list(maxlen=2)) k-inspect(rules) k ##we obtain lhs rhs support confidence lift 1 {}= {It4} 1.000 1 1.0 2 {}= {It1} 1.000 1 1.0 3 {It3} = {It2} 0.500 1 1.5 4 {It3} = {It4} 0.500 1 1.0 5 {It3} = {It1} 0.500 1 1.0 6 {It2} = {It4} 0.667 1 1.0 7 {It2} = {It1} 0.667 1 1.0 8 {It4} = {It1} 1.000 1 1.0 9 {It1} = {It4} 1.000 1 1.0 WRITE(rules) ##we obtain rules support confidence lift 1 {} = {It4} 1 1 1 2 {} = {It1} 1 1 1 3 {It3} = {It2} 0.5 1 1.5 4 {It3} = {It4} 0.5 1 1 5 {It3} = {It1} 0.5 1 1 6 {It2} = {It4} 0.667 1 1 7 {It2} = {It1} 0.667 1 1 8 {It4} = {It1} 1 1 1 9 {It1} = {It4} 1 1 1 I want to convert this in a matrix or data frame and the result should be something like this ##we obtain rules support confidence lift 1 {}={It4} 1.000 11.0 2 {}={It1} 1.000 11.0 3 {It3}={It2} 0.500 11.5 4 {It3}={It4} 0.500 11.0 5 {It3}={It1} 0.500 11.0 6 {It2}={It4} 0.667 11.0 7 {It2}={It1} 0.667 11.0 8 {It4}={It1} 1.000 11.0 9 {It1}={It4} 1.000 11.0 R data.frame(rules = labels(rules), quality(rules)) rules support confidence lift 1{} = {It4} 1.000 1 1.0 2{} = {It1} 1.000 1 1.0 3 {It3} = {It2} 0.500 1 1.5 4 {It3} = {It4} 0.500 1 1.0 5 {It3} = {It1} 0.500 1 1.0 6 {It2} = {It4} 0.667 1 1.0 7 {It2} = {It1} 0.667 1 1.0 8 {It4} = {It1} 1.000 1 1.0 9 {It1} = {It4} 1.000 1 1.0 In another hand is it possible to obtain all the rules where lHS=rhs?. In our last example that means to obtain lhs rhs support confidence lift 8 {It4} = {It1} 1.000 1 1.0 9 {It1} = {It4} 1.000 1 1.0 This is a little more tricky. We can create a new set of rules with rhs and lhs reversed and then see if these reversed rules match some of the original rules. rulesRev - new(rules, rhs=lhs(rules), lhs=rhs(rules)) m - match(rules, rulesRev, nomatch=0) inspect(rules[m,]) lhs rhs support confidence lift 1 {It1} = {It4} 1 11 2 {It4} = {It1} 1 11 Maybe there is an easier way. I have to think about it. -Michael Thanks again Alberto -- Michael Hahsler email: mich...@hahsler.net web: http://michael.hahsler.net __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Can I use mcnemar.test for 3*3 tables (or is there a bug in the command?)
On Jul 19, 2009, at 6:09 PM, Tal Galili wrote: Hello Charles, Thank you for the detail reply. I am still left with the leading question which is: which test should I use when analyzing the 3 by 3 matrix I have? The mcnemar.test or the mh_test? Is the one necessarily better then the other? Please define better. (for example for sparser matrices ?) That does not help. What about: mh_test(as.table(matrix(1:16,4))) It returns a very significant result: chi-squared = 11.4098, df = 3, p-value = 0.009704 Where as mcnemar.test(matrix(1:16,4)), didn't: McNemar's chi-squared = 11.5495, df = 6, p-value = 0.0728 So which one is right ? And now ... define right. (from the looks of it, the mh_test is doing much better) Perhaps from the perspective of a statistically naive reviewer. Should the strategy be to try and use both methods, and start digging when one doesn't sit well with the other? I am reminded of Jim Holtam's tag line: What problem are you trying to solve? Thanks, Tal On Sun, Jul 19, 2009 at 10:26 PM, Charles C. Berry cbe...@tajo.ucsd.edu wrote: On Sun, 19 Jul 2009, Tal Galili wrote: Hello David,Thank you for your answer. Do you know then what does the mcnemar.test do in the case of a 3*3 table ? print(mcnemar.test) will show you what it does. Because the results for the simple example I gave are rather different (P value of 0.053 VS 0.73) The test mcnemar.test() constructs is one of symmetry, which is equivalent to marginal homogenity in hierarchical log-linear models as I recall from Bishop, Fienberg, and Holland's 1975 opus on count data. Stuart-Maxwell uses the dispersion matrix of marginal difference. These are two different tests. I suspect that Stuart-Maxwell is less susceptible to continuity issues in very sparse tables, which may account for the difference you see here. In case the mcnemar can't really handle a 3*3 matrix (or more), shouldn't there be an error massage for this case? (if so, who should I turn to, in order to report this?) Well, the code is pretty straightforward and mcnemar.test(matrix(1:16,4)) returns 11.5495 which is correct. It looks like there is nothing to report. 3,1,5), ncol = 3 Chuck Thanks again, Tal On Sun, Jul 19, 2009 at 3:47 PM, David Freedman 3.14da...@gmail.com wrote: There is a function mh_test in the coin package. library(coin) mh_test(tt) The documentation states, The null hypothesis of independence of row and column totals is tested. The corresponding test for binary factors x and y is known as McNemar test. For larger tables, Stuarts W0 statistic (Stuart, 1955, Agresti, 2002, page 422, also known as Stuart-Maxwell test) is computed. hth, david freedman Tal Galili wrote: Hello all, I wish to perform a mcnemar test for a 3 by 3 matrix. By running the slandered R command I am getting a result but I am not sure I am getting the correct one. Here is an example code: (tt - as.table(t(matrix(c(1,4,1, 0,5,5, 3,1,5), ncol = 3 mcnemar.test(tt, correct=T) #And I get: McNemar's Chi-squared test data: tt McNemar's chi-squared = 7.6667, df = 3, p-value = *0.05343* Now I was wondering if the test I just performed is the correct one. From looking at the Wikipedia article on mcnemar ( http://en.wikipedia.org/wiki/McNemar's_test), it is said that: The Stuart-Maxwell testhttp://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm is different generalization of the McNemar test, used for testing marginal homogeneity in a square table with more than two rows/columns From searching for a Stuart-Maxwell testhttp://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm in google, I found an algorithm here: http://www.m-hikari.com/ams/ams-password-2009/ams-password9-12-2009/abbasiAMS9-12-2009.pdf From running this algorithm I am getting a different P value, here is the (somewhat ugly) code I produced for this: get.d - function(xx) { length1 - dim(xx)[1] ret1 - margin.table(xx,1) - margin.table(xx,2) return(ret1) } get.s - function(xx) { the.s - xx for( i in 1:dim(xx)[1]) { for(j in 1:dim(xx)[2]) { if(i == j) { the.s[i,j] - margin.table(xx,1)[i] + margin.table(xx,2) [i] - 2*xx[i,i] } else { the.s[i,j] - -(xx[i,j] + xx[j,i]) } } } return(the.s) } chi.statistic - t(get.d(tt)[-3]) %*% solve(get.s(tt)[-3,-3]) %*% get.d(tt)[-3] paste(the P value:, pchisq(chi.statistic, 2)) #and the result was: the P value: 0.268384371053358 So to summarize my questions: 1) can I use mcnemar.test for 3*3 (or more) tables ? 2) if so, what test is being performed ( Stuart-Maxwell http://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm) ? 3) Do you have a recommended link to an explanation of the algorithm employed? Thanks, Tal -- -- My contact
Re: [R] Solving two nonlinear equations with two knowns
I try to use integrate command to get theoretical mean and variance, and then solve for a and b. I have an error as follows, Iteration: 0 ||F(x0)||: 2.7096 Error in integrate(InverseF1, tau, 1, mu = mu2, sigma = sigma2, a = a, : non-finite function value Any suggestion? #R code: mu2=0.4 sigma2=4 tau=0.75 mean.diff=2 var.ratio=3 InverseF1-function(u,mu,sigma,a,b,tau) { qnorm(u,mean=mu,sd=sigma)+a*(abs(u-tau))^b*(utau) } InverseF2-function(u,mu,sigma) { qnorm(u,mean=mu,sd=sigma) } part1-function(u,mu,sigma,a,b,tau) { (InverseF2(u,mu,sigma)+a*(abs(u-tau))^b*(utau))^2 } part2-function(u,mu,sigma) { InverseF2(u,mu,sigma)^2 } parameter- function(cons) { a-cons[1] b-cons[2] f - rep(NA, 2) EX-integrate(InverseF2,0,tau,mu=mu2,sigma=sigma2)$value+integrate(InverseF1,tau,1,mu=mu2,sigma=sigma2,a=a,b=b,tau=tau)$value EY-integrate(InverseF2,0,1,mu=mu2,sigma=sigma2)$value VarX-integrate(part2,0,tau,mu=mu2,sigma=sigma2)$value+integrate(part1,tau,1,mu=mu2,sigma=sigma2,a=a,b=b,tau=tau)$value-EX^2 VarY-integrate(part2,0,1,mu=mu2,sigma=sigma2)$value-EY^2 f[1] - abs(EX - EY) - mean.diff f[2] - sqrt(VarX) - sqrt(var.ratio) * sqrt(VarY) f } library(BB) c0-c(3,1) ans1 - dfsane(par=c0, fn=parameter, method=3) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Index and dummy
Dear R-helpers I have 2 variables x1=rgamma(6000, 2, 1) and x2=rgamma(6000, 3,2). I have to sort (descending) each one and split it into groups. After this each two groups must be merged into one until all population becomes one group. A dummy vector must be created for each group (8, 4, 2, 1) being equal to 1 if the individual (i) belongs to the group and equal to 0, otherwise. What I have done was: id=(6000) x1sort=sort(x1, decreasing=TRUE) x1g8_1=x1sort[1:750] x1g8_2=x1sort[751:1500] x1g8_3=x1sort[1501:2250] x1g8_4=x1sort[2251:3000] x1g8_5=x1sort[3001:3750] x1g8_6=x1sort[3751:4500] x1g8_7=x1sort[4501:5250] x1g8_8=x1sort[5251:6000] x1g4_1=c(x1g8_1, x1g8_2) x1g4_2=c(x1g8_3, x1g8_4) x1g4_3=c(x1g8_5, x1g8_6) x1g4_4=c(x1g8_7, x1g8_8) x1g2_1=c(x1g4_1, x1g4_2) x1g2_2=c(x1g4_3, x1g4_4) x1ng=c(x1g2_1, x1g2_2) After this I did the dummy vector (the example is for group4) dum= replace(matrix(0, 4, 1), cbind(4, 1), 0) # matrix of zeros dummy=lapply(1:4, function(i) replace(dum, cbind(i), 1))# 4 dummy vectors s=split(dummy, 1:4) ss=rename.vars(s, c(1, 2, 3, 4), c(dx14_1, dx14_2, dx14_3, dx14_4)) The problem is when I split into groups each group only identifies 750 individuals(in the case of x1g8 for instance) only assumes i=1, ..., 750 and I need to keep i=1, , 6000. Also my option to dummy vectors don't seem to work because I get 4 vectors with the number one (1) in each different variable and not only one. So, I need some help on how should I make to keep i=1, ..., 6000 and how to create a dummy vector that assumes only the value one (1) when some I belongs to some group. Thank you so much. Ana __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Re gression for loop test HELP! URGENT!
Hi everyone! I'm new to R, and I'm stuck on a problem I don't know how to approach. I have calculated a regression in the form of M ~ D + O + S, and I would like to take this regression and test it with other samples, 5 at a time(5 meaning 5 set, each consisting M, D, O, and S of a specific date). I assume I'll need a for loop. Right now, My data of M, D, O, and S are all stored in separate txt files, but should I just put them into a table or something? And then how would I calculate the error of how well the test samples fit the original regression? This is for my internship, so it's very urgent. THANKS A LOT! RBeginner -- View this message in context: http://www.nabble.com/Regression-for-loop-test-HELP%21-URGENT%21-tp24562766p24562766.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] HELP: BRUGS/WinBUGS/RBUGS Response is a combination of random variables
Hi, Is there anyone know if BUGS language allows the combination of variables as response such as Y[i] - a*X1[i]+b*X2[i] Y[i] ~ dnorm(c,d) It seems doesn't work in my model. The problem is between two ##. The error message is modelCheck(BayesBioMarker.BUGS) model is syntactically correct modelData(paste(BUGS_data.txt,sep=)) data loaded modelCompile(numChains=1) multiple definitions of node bm[1] === model { for(iter in 1:numSubj){ bmd[iter] ~ dnorm(u[trt[iter]+1],sdbmd.precision); ### bm[iter] - inprod(biomarker[iter,1:4], bta[1:4]) bm[iter] ~ dnorm(bmu[iter], sdbm.precision); ### bmu[iter] - (1-delta[trt[iter]+1])*u[trt[iter]+1]+ delta[trt[iter]+1]*(u[1]+u[2])/2; } for(iter in 1:4){ bta[iter] ~ dnorm(0, 0.01); } for (iter in 1:2){ delta[iter] ~ dbeta(0.0001,0.0001); u[iter] ~ dnorm(0,01); } sdbmd.precision ~ dgamma(0.1,0.1); sdbm.precision ~ dgamma(0.1,0.1); } === Data list(biomarker= structure(.Data= c(-1.6332505600E-01, -9.9820335000E-02, 4.0692602900E-01, -8.265000E-03, -6.6739468400E-01, -1.9444291700E-01, -2.5035248500E-01, -1.592600E-02, 2.5951119500E-01, 3.1943077100E-01, 2.1915456300E-01, -7.06E-04, -1.1407237740E+00, -4.7155522000E-01, -1.4495155440E+00, 1.703600E-02, -1.9105523700E-01, -5.698021E-03, -8.4883829700E-01, 4.939000E-03), .Dim=c(5, 4)), bmd=c(-4.944000E-03, 3.520700E-02, 2.597500E-02, 2.739400E-02, 7.501000E-03), trt=c(1.00E+00, 0.00E+00, 0.00E+00, 1.00E+00, 1.00E+00), numSubj=5.00E+00) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] what is meaning of the bubbles in boxplots?
Hi ,everyone , I draw some boxplot figure with the command boxplot.But in the figure,there are some bubbles at the top part of the figure. Can anyone tell me what the correct meaning of these bubbles?and how to remove it? -- TANG Jie Email: totang...@gmail.com Tel: 0086-2154896104 Shanghai Typhoon Institute,China [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Re gression for loop test HELP! URGENT!
for a beginner, it's probably even easier to it by hand if it is just five datasets. bind the 5 datasets together in one dataset and create and index variable (1 to 5) for each of the observations according to the dataset the obersvation comes from then run five regressions using reg1=lm(M~D+O+S,subset=c(index==1)) . . . reg5=lm(M~D+O+S,subset=c(index==5)) and then predict from each regression predict(reg1,newdata=data.frame(D,O,S)) . . . predict(reg5,newdata=data.frame(D,O,S)) You can then assess how well the prediction from each of the datasets fits the respective other datasets... Daniel - cuncta stricte discussurus - -Ursprüngliche Nachricht- Von: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] Im Auftrag von Rbeginner Gesendet: Sunday, July 19, 2009 9:49 PM An: r-help@r-project.org Betreff: [R] Re gression for loop test HELP! URGENT! Hi everyone! I'm new to R, and I'm stuck on a problem I don't know how to approach. I have calculated a regression in the form of M ~ D + O + S, and I would like to take this regression and test it with other samples, 5 at a time(5 meaning 5 set, each consisting M, D, O, and S of a specific date). I assume I'll need a for loop. Right now, My data of M, D, O, and S are all stored in separate txt files, but should I just put them into a table or something? And then how would I calculate the error of how well the test samples fit the original regression? This is for my internship, so it's very urgent. THANKS A LOT! RBeginner -- View this message in context: http://www.nabble.com/Regression-for-loop-test-HELP%21-URGENT%21-tp24562766p 24562766.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problem with as.POSIXct on dates object
Dear R-helpers, I have a problem converting an object made with the 'chron' function to a POSIXct object: # Make date based on DOY dat - chron(dates=232, origin.=c(month=1, day=1, year=2008)) dat #[1] 08/20/08 # Converting to POSIXct uses current timezone (Sydney): as.POSIXct(dat) #[1] 2008-08-20 10:00:00 EST # Setting GMT timezone has no effect? as.POSIXct(dat, tz=GMT) #[1] 2008-08-20 10:00:00 EST # But to POSIXlt works fine: as.POSIXlt(dat, tz=GMT) #[1] 2008-08-20 GMT Is this behavior expected? If so, can you explain why? thanks for your help, Remko - Remko Duursma Post-Doctoral Fellow Centre for Plants and the Environment University of Western Sydney Hawkesbury Campus Richmond NSW 2753 Dept of Biological Science Macquarie University North Ryde NSW 2109 Australia Mobile: +61 (0)422 096908 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with as.POSIXct on dates object
as.POSIXct.dates does not make use of tz: as.POSIXct.dates function (x, ...) { if (inherits(x, dates)) { z - attr(x, origin) x - as.numeric(x) * 86400 if (length(z) == 3L is.numeric(z)) x - x + as.numeric(ISOdate(z[3L], z[1L], z[2L], 0)) return(structure(x, class = c(POSIXt, POSIXct))) } else stop(gettextf('%s' is not a \dates\ object, deparse(substitute(x } environment: namespace:base On Sun, Jul 19, 2009 at 11:30 PM, Remko Duursmaremkoduur...@gmail.com wrote: Dear R-helpers, I have a problem converting an object made with the 'chron' function to a POSIXct object: # Make date based on DOY dat - chron(dates=232, origin.=c(month=1, day=1, year=2008)) dat #[1] 08/20/08 # Converting to POSIXct uses current timezone (Sydney): as.POSIXct(dat) #[1] 2008-08-20 10:00:00 EST # Setting GMT timezone has no effect? as.POSIXct(dat, tz=GMT) #[1] 2008-08-20 10:00:00 EST # But to POSIXlt works fine: as.POSIXlt(dat, tz=GMT) #[1] 2008-08-20 GMT Is this behavior expected? If so, can you explain why? thanks for your help, Remko - Remko Duursma Post-Doctoral Fellow Centre for Plants and the Environment University of Western Sydney Hawkesbury Campus Richmond NSW 2753 Dept of Biological Science Macquarie University North Ryde NSW 2109 Australia Mobile: +61 (0)422 096908 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with as.POSIXct on dates object
as.POSIXct.dates does not make use of tz: Ok, but it is supposed to, right? Or maybe the documentation can be updated, because ?as.POSIXct does seem to imply the timezone is used (as it is for other methods of as.POSIXct). thanks, Remko - Remko Duursma Post-Doctoral Fellow Centre for Plants and the Environment University of Western Sydney Hawkesbury Campus Richmond NSW 2753 Dept of Biological Science Macquarie University North Ryde NSW 2109 Australia Mobile: +61 (0)422 096908 On Mon, Jul 20, 2009 at 1:41 PM, Gabor Grothendieckggrothendi...@gmail.com wrote: as.POSIXct.dates does not make use of tz: as.POSIXct.dates function (x, ...) { if (inherits(x, dates)) { z - attr(x, origin) x - as.numeric(x) * 86400 if (length(z) == 3L is.numeric(z)) x - x + as.numeric(ISOdate(z[3L], z[1L], z[2L], 0)) return(structure(x, class = c(POSIXt, POSIXct))) } else stop(gettextf('%s' is not a \dates\ object, deparse(substitute(x } environment: namespace:base On Sun, Jul 19, 2009 at 11:30 PM, Remko Duursmaremkoduur...@gmail.com wrote: Dear R-helpers, I have a problem converting an object made with the 'chron' function to a POSIXct object: # Make date based on DOY dat - chron(dates=232, origin.=c(month=1, day=1, year=2008)) dat #[1] 08/20/08 # Converting to POSIXct uses current timezone (Sydney): as.POSIXct(dat) #[1] 2008-08-20 10:00:00 EST # Setting GMT timezone has no effect? as.POSIXct(dat, tz=GMT) #[1] 2008-08-20 10:00:00 EST # But to POSIXlt works fine: as.POSIXlt(dat, tz=GMT) #[1] 2008-08-20 GMT Is this behavior expected? If so, can you explain why? thanks for your help, Remko - Remko Duursma Post-Doctoral Fellow Centre for Plants and the Environment University of Western Sydney Hawkesbury Campus Richmond NSW 2753 Dept of Biological Science Macquarie University North Ryde NSW 2109 Australia Mobile: +61 (0)422 096908 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Sampling of non-overlapping intervals of variable length
Thanks Chuck. Ups, did not think of the problem in that way. That did exactly what I needed. I have another complication to this problem: I do not only have one vector of 1:1e^6 but several vectors of different length, say 5. Initially, my intervals are distributed over those 5 vectors and the ranges of those 5 vectors in a specific way (and you might have guessed by now that I would like to do something like a permutation test). Because I have this additional level, I guess I could do something like: 1)Sample the 5 vectors with probabilities proportional to the frequencies of the intial intervals on these vectors. 2)For each sampled vector: apply Chucks solution. ? Thanks a lot. Hadassa On Sun, Jul 19, 2009 at 11:13 PM, Charles C. Berrycbe...@tajo.ucsd.edu wrote: On Sun, 19 Jul 2009, Hadassa Brunschwig wrote: Hi I am not sure what you mean by sampling an index of a group of intervals. I will try to give an example: Let's assume I have a vector 1:100. Let's say I have 10 intervals of different but known length, say, c(4,6,11,2,8,14,7,2,18,32). For simulation purposes I have to sample those 10 intervals 1000 times. The requirement is, however, that they should be of those lengths and should not be overlapping. In short, I would like to obtain a 10x1000 matrix with sampled intervals. Something like this: lens - c(4,6,11,2,8,14,7,2,18,32) perm.lens - sample(lens) sort(sample(1e06-sum(lens)+length(lens),length(lens)))+cumsum(c(0,head(perm.lens,-1))) [1] 15424 261927 430276 445976 451069 546578 656123 890494 939714 969643 The vector above gives the starting points for the intervals whose lengths are perm.lens. I'd bet every introductory combinatorics book has a problem or example in which the expression for the number of ways in which K ordered objects can be assigned to I groups consisting of n_i adjacent objects each is constructed. The construction is along the lines of the calculation above. HTH, Chuck Thanks Hadassa On Sun, Jul 19, 2009 at 9:48 PM, David Winsemiusdwinsem...@comcast.net wrote: On Jul 19, 2009, at 1:05 PM, Hadassa Brunschwig wrote: Hi, I hope I am not repeating a question which has been posed already. I am trying to do the following in the most efficient way: I would like to sample from a finite (large) set of integers n non-overlapping intervals, where each interval i has a different, set length L_i (which is the number of integers in the interval). I had the idea to sample recursively on a vector with the already chosen intervals discarded but that seems to be too complicated. It might be ridiculously easy if you sampled on an index of a group of intervals. Why not pose the question in the form of example data.frames or other classes of R objects? Specification of the desired output would be essential. I think further specification of the sampling strategy would also help because I am unable to understand what sort of probability model you are hoping to apply. Any suggestions on that? Thanks a lot. Hadassa -- Hadassa Brunschwig PhD Student Department of Statistics David Winsemius, MD Heritage Laboratories West Hartford, CT -- Hadassa Brunschwig PhD Student Department of Statistics The Hebrew University of Jerusalem http://www.stat.huji.ac.il __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Charles C. Berry (858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 -- Hadassa Brunschwig PhD Student Department of Statistics The Hebrew University of Jerusalem http://www.stat.huji.ac.il __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Re gression for loop test HELP! URGENT!
Thanks for the suggestion, but I think I might not have made myself very clear. I actually have about 2000 sets of M, D, O, and S, so it's probably not efficient to make each a separate set and then index. I've put everything into a data frame, so I would like to test how well the regression fit for each group, which consists of 5 sets of M, D, O, and S. Since I'll need to test it for about 400 groups, I thought a for loop might be necessary. Any suggestions? I'm just not especially sure how to do the for loop. Daniel Malter wrote: for a beginner, it's probably even easier to it by hand if it is just five datasets. bind the 5 datasets together in one dataset and create and index variable (1 to 5) for each of the observations according to the dataset the obersvation comes from then run five regressions using reg1=lm(M~D+O+S,subset=c(index==1)) . . . reg5=lm(M~D+O+S,subset=c(index==5)) and then predict from each regression predict(reg1,newdata=data.frame(D,O,S)) . . . predict(reg5,newdata=data.frame(D,O,S)) You can then assess how well the prediction from each of the datasets fits the respective other datasets... Daniel - cuncta stricte discussurus - -Ursprüngliche Nachricht- Von: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] Im Auftrag von Rbeginner Gesendet: Sunday, July 19, 2009 9:49 PM An: r-help@r-project.org Betreff: [R] Re gression for loop test HELP! URGENT! Hi everyone! I'm new to R, and I'm stuck on a problem I don't know how to approach. I have calculated a regression in the form of M ~ D + O + S, and I would like to take this regression and test it with other samples, 5 at a time(5 meaning 5 set, each consisting M, D, O, and S of a specific date). I assume I'll need a for loop. Right now, My data of M, D, O, and S are all stored in separate txt files, but should I just put them into a table or something? And then how would I calculate the error of how well the test samples fit the original regression? This is for my internship, so it's very urgent. THANKS A LOT! RBeginner -- View this message in context: http://www.nabble.com/Regression-for-loop-test-HELP%21-URGENT%21-tp24562766p 24562766.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/Regression-for-loop-test-HELP%21-URGENT%21-tp24562766p24563579.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Sampling of non-overlapping intervals of variable length
Another possibility, if the total length of your intervals is small in comparison to the big interval is to choose the starting points of all your intervals randomly and to dismiss the entire set if some of the intervals overlap. Most probably you will not have too many such cases (assuming, as stated above, that the total length of all the intervals is a small proportion of the length of the big interval). --- On Mon, 20/7/09, Hadassa Brunschwig hadassa.brunsch...@mail.huji.ac.il wrote: From: Hadassa Brunschwig hadassa.brunsch...@mail.huji.ac.il Subject: Re: [R] Sampling of non-overlapping intervals of variable length To: Charles C. Berry cbe...@tajo.ucsd.edu Cc: r-help@r-project.org Received: Monday, 20 July, 2009, 3:08 PM Thanks Chuck. Ups, did not think of the problem in that way. That did exactly what I needed. I have another complication to this problem: I do not only have one vector of 1:1e^6 but several vectors of different length, say 5. Initially, my intervals are distributed over those 5 vectors and the ranges of those 5 vectors in a specific way (and you might have guessed by now that I would like to do something like a permutation test). Because I have this additional level, I guess I could do something like: 1)Sample the 5 vectors with probabilities proportional to the frequencies of the intial intervals on these vectors. 2)For each sampled vector: apply Chucks solution. ? Thanks a lot. Hadassa On Sun, Jul 19, 2009 at 11:13 PM, Charles C. Berrycbe...@tajo.ucsd.edu wrote: On Sun, 19 Jul 2009, Hadassa Brunschwig wrote: Hi I am not sure what you mean by sampling an index of a group of intervals. I will try to give an example: Let's assume I have a vector 1:100. Let's say I have 10 intervals of different but known length, say, c(4,6,11,2,8,14,7,2,18,32). For simulation purposes I have to sample those 10 intervals 1000 times. The requirement is, however, that they should be of those lengths and should not be overlapping. In short, I would like to obtain a 10x1000 matrix with sampled intervals. Something like this: lens - c(4,6,11,2,8,14,7,2,18,32) perm.lens - sample(lens) sort(sample(1e06-sum(lens)+length(lens),length(lens)))+cumsum(c(0,head(perm.lens,-1))) [1] 15424 261927 430276 445976 451069 546578 656123 890494 939714 969643 The vector above gives the starting points for the intervals whose lengths are perm.lens. I'd bet every introductory combinatorics book has a problem or example in which the expression for the number of ways in which K ordered objects can be assigned to I groups consisting of n_i adjacent objects each is constructed. The construction is along the lines of the calculation above. HTH, Chuck Thanks Hadassa On Sun, Jul 19, 2009 at 9:48 PM, David Winsemiusdwinsem...@comcast.net wrote: On Jul 19, 2009, at 1:05 PM, Hadassa Brunschwig wrote: Hi, I hope I am not repeating a question which has been posed already. I am trying to do the following in the most efficient way: I would like to sample from a finite (large) set of integers n non-overlapping intervals, where each interval i has a different, set length L_i (which is the number of integers in the interval). I had the idea to sample recursively on a vector with the already chosen intervals discarded but that seems to be too complicated. It might be ridiculously easy if you sampled on an index of a group of intervals. Why not pose the question in the form of example data.frames or other classes of R objects? Specification of the desired output would be essential. I think further specification of the sampling strategy would also help because I am unable to understand what sort of probability model you are hoping to apply. Any suggestions on that? Thanks a lot. Hadassa -- Hadassa Brunschwig PhD Student Department of Statistics David Winsemius, MD Heritage Laboratories West Hartford, CT -- Hadassa Brunschwig PhD Student Department of Statistics The Hebrew University of Jerusalem http://www.stat.huji.ac.il __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Charles C. Berry (858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 -- Hadassa Brunschwig PhD Student Department of Statistics The Hebrew University of Jerusalem http://www.stat.huji.ac.il __
[R] Regression for loop test HELP! URGENT!
Hi everyone! I'm new to R, and I've sent this message as a non-member, but since it's pretty urgent, I'm sending it again now I'm on the mailing list (Thanks Daniel for your suggestion nevertheless). I have calculated a regression in the form of M ~ D + O + S, and I would like to take this regression and test it with other samples, 5 sets of M, D, O, and S at a time(I actually have 2000 sets, so it's probably not efficient to make each a separate set and then index). Since I'll need to test the regression for 400 groups, I thought a for loop might be necessary. I've put everything into a data frame already. Can anyone tell me how to write the code? I'm especially not sure about how to do the for loop. And then how would I calculate the error of how well the test samples fit the original regression? This is for my internship, so it's very urgent. THANKS A LOT! RBeginner [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.