Re: [R] two questions - function help and 32vs64 bit sessions
Hi Duncan,

I tried your suggestion, but no luck. The first error is no surprise; it just confirms the address is lost. The second line suggests it worked, but it didn't. The session still remembers the original address.

    tools::startDynamicHelp(FALSE) # shut it down
    Warning message:
    In file(out, "wt") :
      cannot open file 'C:\Users\fowlerm\AppData\Local\Temp\1\RtmpK09I4B\Rhttpd26c04742884': No such file or directory
    tools::startDynamicHelp(TRUE) # start it up
    starting httpd help server ... done

-----Original Message-----
From: Duncan Murdoch [mailto:murdoch.dun...@gmail.com]
Sent: July 16, 2014 10:47 AM
To: Fowler, Mark; r-h...@stat.math.ethz.ch
Subject: Re: [R] two questions - function help and 32vs64 bit sessions

> I never have sessions that last that long, so I haven't tried this, but I'd expect you could restart the help system in this way:
>
>     tools::startDynamicHelp(FALSE) # shut it down
>     tools::startDynamicHelp(TRUE) # start it up
>
> Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] two questions - function help and 32vs64 bit sessions
On 21/07/2014 9:40 AM, Fowler, Mark wrote:
> Hi Duncan,
>
> I tried your suggestion, but no luck. The first error is no surprise, it just confirms the address is lost. The second line suggests it worked, but it didn't. The session is still remembering the original address.
>
>     tools::startDynamicHelp(FALSE) # shut it down
>     Warning message:
>     In file(out, "wt") :
>       cannot open file 'C:\Users\fowlerm\AppData\Local\Temp\1\RtmpK09I4B\Rhttpd26c04742884': No such file or directory
>     tools::startDynamicHelp(TRUE) # start it up
>     starting httpd help server ... done

This is a different error than I thought you described. I thought something had shut down the server, but it looks like something has deleted your temporary directory. You might be able to tell your OS not to do that (if it was the OS that did), or you can put your temporary directory in a location that is less likely to get deleted, e.g. by setting the TMPDIR environment variable to somewhere else before you start R. For example, I typically run with TMPDIR set to C:/temp when I'm debugging things.

Duncan Murdoch
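If the per-session temporary directory really has been deleted, as the warning above suggests, one thing that may be worth trying before giving up on the session is to recreate that directory and then restart the help server. The following is only a sketch under that assumption: `ensure_tempdir()` is a hypothetical helper name, not part of R, and nobody in this thread has confirmed that the help server recovers afterwards.

```r
# Hypothetical helper (not part of R): recreate a directory if something
# outside R has deleted it. Defaults to the session's temporary directory.
ensure_tempdir <- function(path = tempdir()) {
  if (!file.exists(path)) {
    dir.create(path, recursive = TRUE)
  }
  # TRUE if the directory exists after the attempt
  file.exists(path)
}

# Only try to restart the help server from an interactive session.
if (interactive() && ensure_tempdir()) {
  # Either call may fail if the server state is stale, so tolerate errors.
  try(tools::startDynamicHelp(FALSE), silent = TRUE)
  try(tools::startDynamicHelp(TRUE))
}
```

Setting TMPDIR before R starts, as described above, avoids the problem at its source; the sketch only addresses a session that is already running.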
Re: [R] two questions - function help and 32vs64 bit sessions
Hi Jim,

Lost the address (Tuesday night is a major system scan for DFO, messes everybody up). Tried your suggestion, no luck. Also tried shutting down all Explorer browsers, still no luck. Maybe something related to configuration or environment differs between our systems.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jim Lemon
Sent: July 15, 2014 12:17 AM
To: r-help@r-project.org
Subject: Re: [R] two questions - function help and 32vs64 bit sessions

> Hi Mark,
> I simply shut down the help browser. It will restart with a new IP address.
>
> Jim
Re: [R] two questions - function help and 32vs64 bit sessions
On 14/07/2014, 11:42 AM, Fowler, Mark wrote:
> Anybody know how to restore the 'link' without ending and restarting the session?

I never have sessions that last that long, so I haven't tried this, but I'd expect you could restart the help system in this way:

    tools::startDynamicHelp(FALSE) # shut it down
    tools::startDynamicHelp(TRUE) # start it up

Duncan Murdoch
Re: [R] two questions - function help and 32vs64 bit sessions
Hi Jim,

Thought I tried that. Just have one session running currently, started yesterday, help still linked. I'll wait for the link to expire and confirm.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jim Lemon
Sent: July 15, 2014 12:17 AM
To: r-help@r-project.org
Subject: Re: [R] two questions - function help and 32vs64 bit sessions

> Hi Mark,
> I simply shut down the help browser. It will restart with a new IP address.
>
> Jim
[R] two questions - function help and 32vs64 bit sessions
Hello,

Two unrelated questions, and neither urgent. Windows 7, R 3.0.1. Using R Console, no fancy interface.

The function help ultimately becomes lost to a session kept running for extended periods (days). I.e. with a new session, if you invoke the Help menu 'R functions (txt)...' it activates the html help and goes to the named function page. This will work fine for at least a day, but typically the next day invoking the help menu in this fashion will fail, as R looks for a temporary address it creates on your computer. This gets lost, possibly due to network administration activity. So then I save and start another session with the same Rdata. Trivial enough but irritating. Anybody know how to restore the 'link' without ending and restarting the session?

I have a mix of 32-bit and 64-bit requirements, with 64 the default. I became used to starting R sessions directly from the appropriate Rdata workspaces. With the latest version I need to start from the generic icon and then load the workspace if I want 32-bit. Anybody know a way to make the Rdata files keep track of which bit version they work with, or some trick that accomplishes the same objective? The 32-bit requirement is usually just RODBC, and the need for it is scattered over lots of workspaces. Again, trivial but a nuisance. A more pragmatic motive is to not oblige users of applications to think about it. Any way to make a session switch 'bits' with a source file?

Mark Fowler
Population Ecology Division
Bedford Inst of Oceanography
Dept Fisheries Oceans
Dartmouth NS Canada B2Y 4A2
Tel. (902) 426-3529
Fax (902) 426-9710
Email mark.fow...@dfo-mpo.gc.ca
Re: [R] two questions - function help and 32vs64 bit sessions
I don't know any definitive answer for either of your questions, but I have a comment that may explain why I have not encountered these issues.

At one time I used to use RData files the way you are, but I discovered the value of re-running my analysis scripts from scratch regularly... as in dozens of times per day. Whenever the run time gets too long to be compatible with that, I modify the script to conditionally either regenerate any mostly-verified intermediate data objects that are slow to generate and save them with saveRDS, or simply reload them using readRDS. Then I run most of the time with a trigger variable set to simply reload the debugged data objects for subsequent analysis or output formatting. The goal here is to build reproducible analysis scripts, not mysterious RData files generated by an unknown sequence of commands.

With this approach in mind, R sessions need not stay open long, and the script can have a check (using perhaps .Machine$sizeof.pointer) at the beginning that verifies your architecture before proceeding.

You might also be able to put a shortcut or batch file in your working directory for projects that require specific versions/architectures of R, pointing to the correct one for that project. Then you just start R using that shortcut or script.

---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.
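The workflow described above (regenerate-or-reload with saveRDS/readRDS, plus an architecture check at the top of the script) could be sketched roughly as follows. The function names `session_bits()`, `require_bits()` and `cached()`, and the `regenerate` trigger argument, are hypothetical names chosen for illustration; they are not from the post or from any package.

```r
# Hypothetical helper: pointer size in bits for the running session (32 or 64).
session_bits <- function() {
  8L * .Machine$sizeof.pointer
}

# Hypothetical helper: stop early if the script needs a specific architecture.
require_bits <- function(bits) {
  if (session_bits() != bits) {
    stop("This script needs ", bits, "-bit R, but the session is ",
         session_bits(), "-bit.")
  }
  invisible(TRUE)
}

# Hypothetical helper: rebuild a slow object and save it, or reload the saved copy.
# 'build' is a function with no arguments that produces the object.
cached <- function(path, build, regenerate = FALSE) {
  if (regenerate || !file.exists(path)) {
    obj <- build()
    saveRDS(obj, path)
  } else {
    obj <- readRDS(path)
  }
  obj
}

# require_bits(32)  # e.g. at the top of a script that depends on 32-bit RODBC

# Example: a stand-in for a slow computation, cached in a temporary file.
rds_path <- file.path(tempdir(), "slow_result.rds")
slow_result <- cached(rds_path, function() cumsum(1:10), regenerate = FALSE)
```

Setting `regenerate = TRUE` plays the role of the trigger variable: it forces the slow step to run again and overwrites the saved copy.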
Re: [R] two questions - function help and 32vs64 bit sessions
On Mon, 14 Jul 2014 01:42:54 PM Fowler, Mark wrote:
> Anybody know how to restore the 'link' without ending and restarting the session?

Hi Mark,
I simply shut down the help browser. It will restart with a new IP address.

Jim
Re: [R] two questions about xyplot
Hi,

To answer your second question you can do something like this:

    p <- xyplot(dvy ~ sessidx | case, groups = numph, data = d66df,
                col = c(1:4), layout = c(1, 3), xlab = "Sessions",
                ylab = "Number of Seconds", type = "l")
    update(p, panel = function(...) {
      panel.xyplot(...)
      panel.abline(v = 6.5)
      panel.abline(v = 12.5)
      panel.abline(v = 18.5)
    })

By setting the groups argument in xyplot to numph, xyplot will plot a separate line for each group of numph you have in each case.

For your first question, did you mean you want a text to display the mean in each panel?

Richard

On Mon, Sep 23, 2013 at 10:41 AM, William Shadish wshad...@ucmerced.edu wrote:
> 2. The xyplot code I use creates a line connecting the data points, and that line is continuous over the entire graph. I would like to have the lines be discontinuous between phases.
[R] two questions about xyplot
Dear R helpers,

I am generating three artificial short interrupted time series datasets (single-case designs; call them Case 1, Case 2, Case 3) and then plotting them in xyplot. I will put the entire code below so you can reproduce. I have been unable to figure out how to do two things.

1. Each time series has 24 time points divided into four phases. Call them phases A, B, C, D for convenience. I have used running() to compute the means of the observations in each of these four parts; and I saved these as objects called mn1 (for Case 1), mn2 (Case 2) and mn3 (Case 3). So mn1 contains the four means for A, B, C, D phases for Case 1, etc. I want to insert these means into the xyplot in the appropriate place. For instance, insert the first mean from mn1 into phase A of Case 1, the second mean into phase B of Case 1, and so forth until insert the fourth mean from mn3 into phase D of Case 3. Ideally, it would insert something like M = 49.02 or Xbar = 49.02 into phase A for Case 1.

2. The xyplot code I use creates a line connecting the data points, and that line is continuous over the entire graph. I would like to have the lines be discontinuous between phases. Phase changes are indicated by panel.abline() in the code, and occur at time points 6.5, 12.5, and 18.5. So, for example, I would like a line connecting the datapoints from 1 to 6, then 7-12, then 13-18, then 19-24 (but not including 6-7, 12-13, and 18-19).

I appreciate any help you might be able to offer.

Will Shadish

Here is the code:

    library(gtools)   # provides running()
    library(lattice)
    ###g = .66
    z <- rnorm(24, mean = 0, sd = 10)
    w <- rnorm(24, mean = 0, sd = 10)
    ###change mean = to vary the effect size
    tm <- rnorm(6, mean = 10, sd = 10)
    b <- rep(0,6)
    c <- rep(1,6)
    tmt <- c(b,tm,b,tm)
    for (t in 2:24) z[t] <- 0.25 * z[t - 1] + w[t]
    dvy <- 50 + z + tmt
    jid <- rep(1,24)
    sid <- rep(1,24)
    pid <- rep(1,24)
    dvid <- rep(1,24)
    desvar <- rep(1,24)
    dvdir <- rep(0,24)
    sessidx <- c(1:24)
    m <- rep(1,6)
    n <- rep(2,6)
    o <- rep(3,6)
    p <- rep(4,6)
    numph <- c(m,n,o,p)
    phasebtm <- c(b,c,b,c)
    d1 <- cbind(jid,sid,pid,dvid,desvar,dvdir,dvy,sessidx,numph,phasebtm)
    mn1 <- running(dvy, width=6, by=6)

    #second dataset
    z <- rnorm(24, mean = 0, sd = 10)
    w <- rnorm(24, mean = 0, sd = 10)
    tm <- rnorm(6, mean = 10, sd = 10)
    b <- rep(0,6)
    c <- rep(1,6)
    tmt <- c(b,tm,b,tm)
    for (t in 2:24) z[t] <- 0.25 * z[t - 1] + w[t]
    dvy <- 50 + z + tmt
    jid <- rep(1,24)
    sid <- rep(1,24)
    pid <- rep(2,24)
    dvid <- rep(1,24)
    desvar <- rep(1,24)
    dvdir <- rep(0,24)
    sessidx <- c(1:24)
    m <- rep(1,6)
    n <- rep(2,6)
    o <- rep(3,6)
    p <- rep(4,6)
    numph <- c(m,n,o,p)
    phasebtm <- c(b,c,b,c)
    d2 <- cbind(jid,sid,pid,dvid,desvar,dvdir,dvy,sessidx,numph,phasebtm)
    mn2 <- running(dvy, width=6, by=6)

    #third dataset
    z <- rnorm(24, mean = 0, sd = 10)
    w <- rnorm(24, mean = 0, sd = 10)
    tm <- rnorm(6, mean = 10, sd = 10)
    b <- rep(0,6)
    c <- rep(1,6)
    tmt <- c(b,tm,b,tm)
    for (t in 2:24) z[t] <- 0.25 * z[t - 1] + w[t]
    dvy <- 50 + z + tmt
    jid <- rep(1,24)
    sid <- rep(1,24)
    pid <- rep(3,24)
    dvid <- rep(1,24)
    desvar <- rep(1,24)
    dvdir <- rep(0,24)
    sessidx <- c(1:24)
    m <- rep(1,6)
    n <- rep(2,6)
    o <- rep(3,6)
    p <- rep(4,6)
    numph <- c(m,n,o,p)
    phasebtm <- c(b,c,b,c)
    d3 <- cbind(jid,sid,pid,dvid,desvar,dvdir,dvy,sessidx,numph,phasebtm)
    mn3 <- running(dvy, width=6, by=6)

    #concatenate d1 d2 d3
    d66 <- rbind(d1, d2, d3)
    d66df <- as.data.frame(d66)
    d66df$case <- ordered(d66df$pid, levels = c(1,2,3), labels = c("Case 3", "Case 2", "Case 1"))
    p <- xyplot(dvy ~ sessidx | case, data=d66df, layout=c(1, 3), xlab= "Sessions", ylab = "Number of Seconds", type="l")
    update(p, panel=function(...){
      panel.xyplot(...)
      panel.abline(v=6.5)
      panel.abline(v=12.5)
      panel.abline(v=18.5)
    } )

--
William R. Shadish
Distinguished Professor
Founding Faculty

Mailing Address:
William R. Shadish
University of California
School of Social Sciences, Humanities and Arts
5200 North Lake Rd
Merced CA 95343

Physical/Delivery Address:
University of California Merced
ATTN: William Shadish
School of Social Sciences, Humanities and Arts
Facilities Services Building A
5200 North Lake Rd.
Merced, CA 95343

209-228-4372 voice
209-228-4007 fax (communal fax: be sure to include cover sheet)
wshad...@ucmerced.edu
http://faculty.ucmerced.edu/wshadish/index.htm
http://psychology.ucmerced.edu
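As a side note on question 1, the four phase means can also be computed in base R by grouping on the phase index, which should give the same four values as running(dvy, width=6, by=6) for a 24-point series with six points per phase. This is a minimal sketch with made-up data; the seed and the simplified `dvy` are assumptions for illustration, and `phase_means` and `mean_labels` are hypothetical names.

```r
set.seed(42)  # arbitrary seed so the sketch is repeatable

# Stand-in for one simulated case: 24 sessions, 4 phases of 6 sessions each.
dvy   <- 50 + rnorm(24, mean = 0, sd = 10)
numph <- rep(1:4, each = 6)   # phase index, built as in the posted code

# Mean of dvy within each phase (phases A, B, C, D correspond to 1, 2, 3, 4).
phase_means <- tapply(dvy, numph, mean)

# Labels in the "M = 49.02" style asked for in question 1.
mean_labels <- paste("M =", round(phase_means, 2))
```

Because the means are keyed by `numph`, the same grouping variable can later be used to place each label in the right phase of the plot.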
Re: [R] two questions about xyplot
Dear Richard,

Your solution to the second question worked like a charm. Thanks! So much to learn about this stuff, but at least it is fun.

On the first question: yes, I want text to display the mean in each of the 12 panels.

Will

On 9/23/2013 11:23 AM, Richard Kwock wrote:

Hi,

To answer your second question, you can do something like this:

p <- xyplot(dvy ~ sessidx | case, groups = numph, data = d66df,
            col = c(1:4), layout = c(1, 3),
            xlab = "Sessions", ylab = "Number of Seconds", type = "l")

update(p, panel = function(...) {
  panel.xyplot(...)
  panel.abline(v = 6.5)
  panel.abline(v = 12.5)
  panel.abline(v = 18.5)
})

By setting the groups argument in xyplot to numph, xyplot will plot a separate line for each level of numph within each case.

For your first question, did you mean you want text to display the mean in each panel?

Richard

On Mon, Sep 23, 2013 at 10:41 AM, William Shadish wshad...@ucmerced.edu wrote:

Dear R helpers,

I am generating three artificial short interrupted time series datasets (single-case designs; call them Case 1, Case 2, Case 3) and then plotting them in xyplot. I will put the entire code below so you can reproduce. I have been unable to figure out how to do two things.

1. Each time series has 24 time points divided into four phases. Call them phases A, B, C, D for convenience. I have used running() to compute the means of the observations in each of these four parts, and I saved these as objects called mn1 (for Case 1), mn2 (Case 2) and mn3 (Case 3). So mn1 contains the four means for the A, B, C, D phases of Case 1, etc. I want to insert these means into the xyplot in the appropriate place. For instance, insert the first mean from mn1 into phase A of Case 1, the second mean into phase B of Case 1, and so forth, until the fourth mean from mn3 is inserted into phase D of Case 3. Ideally, it would insert something like M = 49.02 or Xbar = 49.02 into phase A for Case 1.

2. The xyplot code I use creates a line connecting the data points, and that line is continuous over the entire graph. I would like the lines to be discontinuous between phases. Phase changes are indicated by panel.abline() in the code, and occur at time points 6.5, 12.5, and 18.5. So, for example, I would like a line connecting the data points from 1 to 6, then 7-12, then 13-18, then 19-24 (but not connecting 6-7, 12-13, and 18-19).

I appreciate any help you might be able to offer.

Will Shadish

Here is the code:

library(gtools)   # provides running()
library(lattice)

### g = .66
z <- rnorm(24, mean = 0, sd = 10)
w <- rnorm(24, mean = 0, sd = 10)
### change mean = to vary the effect size
tm <- rnorm(6, mean = 10, sd = 10)
b <- rep(0, 6)
c <- rep(1, 6)
tmt <- c(b, tm, b, tm)
for (t in 2:24) z[t] <- 0.25 * z[t - 1] + w[t]
dvy <- 50 + z + tmt
jid <- rep(1, 24)
sid <- rep(1, 24)
pid <- rep(1, 24)
dvid <- rep(1, 24)
desvar <- rep(1, 24)
dvdir <- rep(0, 24)
sessidx <- c(1:24)
m <- rep(1, 6)
n <- rep(2, 6)
o <- rep(3, 6)
p <- rep(4, 6)
numph <- c(m, n, o, p)
phasebtm <- c(b, c, b, c)
d1 <- cbind(jid, sid, pid, dvid, desvar, dvdir, dvy, sessidx, numph, phasebtm)
mn1 <- running(dvy, width = 6, by = 6)

# second dataset
z <- rnorm(24, mean = 0, sd = 10)
w <- rnorm(24, mean = 0, sd = 10)
tm <- rnorm(6, mean = 10, sd = 10)
b <- rep(0, 6)
c <- rep(1, 6)
tmt <- c(b, tm, b, tm)
for (t in 2:24) z[t] <- 0.25 * z[t - 1] + w[t]
dvy <- 50 + z + tmt
jid <- rep(1, 24)
sid <- rep(1, 24)
pid <- rep(2, 24)
dvid <- rep(1, 24)
desvar <- rep(1, 24)
dvdir <- rep(0, 24)
sessidx <- c(1:24)
m <- rep(1, 6)
n <- rep(2, 6)
o <- rep(3, 6)
p <- rep(4, 6)
numph <- c(m, n, o, p)
phasebtm <- c(b, c, b, c)
d2 <- cbind(jid, sid, pid, dvid, desvar, dvdir, dvy, sessidx, numph, phasebtm)
mn2 <- running(dvy, width = 6, by = 6)

# third dataset
z <- rnorm(24, mean = 0, sd = 10)
w <- rnorm(24, mean = 0, sd = 10)
tm <- rnorm(6, mean = 10, sd = 10)
b <- rep(0, 6)
c <- rep(1, 6)
tmt <- c(b, tm, b, tm)
for (t in 2:24) z[t] <- 0.25 * z[t - 1] + w[t]
dvy <- 50 + z + tmt
jid <- rep(1, 24)
sid <- rep(1, 24)
pid <- rep(3, 24)
dvid <- rep(1, 24)
desvar <- rep(1, 24)
dvdir <- rep(0, 24)
sessidx <- c(1:24)
m <- rep(1, 6)
n <- rep(2, 6)
o <- rep(3, 6)
p <- rep(4, 6)
numph <- c(m, n, o, p)
phasebtm <- c(b, c, b, c)
d3 <- cbind(jid, sid, pid, dvid, desvar, dvdir, dvy, sessidx, numph, phasebtm)
mn3 <- running(dvy, width = 6, by = 6)

# concatenate d1 d2 d3
d66 <- rbind(d1, d2, d3)
d66df <- as.data.frame(d66)
d66df$case <- ordered(d66df$pid, levels = c(1, 2, 3),
                      labels = c("Case 3", "Case 2", "Case 1"))

p <- xyplot(dvy ~ sessidx | case, data = d66df, layout = c(1, 3),
            xlab = "Sessions", ylab = "Number of Seconds", type = "l")

update(p, panel = function(...) {
  panel.xyplot(...)
  panel.abline(v = 6.5)
  panel.abline(v = 12.5)
  panel.abline(v = 18.5)
})

--
William R. Shadish
Distinguished Professor
Founding Faculty

Mailing Address:
William R. Shadish
University of California
School of Social Sciences, Humanities and Arts
5200 North Lake Rd
Merced CA 95343

Physical/Delivery Address:
University of California Merced
ATTN: William Shadish
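The per-phase means described in question 1 can also be cross-checked in base R, without running(). The following is only a sketch under assumptions: dvy here is simulated stand-in data (the seed and values are illustrative, not from the post), and the layout is the same 24 sessions in four phases of six.

```r
set.seed(1)                          # illustrative seed, not from the original post
dvy   <- 50 + rnorm(24, sd = 10)     # stand-in for one case's 24 observations
numph <- rep(1:4, each = 6)          # phase index: 1 = A, 2 = B, 3 = C, 4 = D

# One mean per phase, grouped by the phase index
phase_means <- tapply(dvy, numph, mean)

# The same four numbers via a 6 x 4 matrix whose columns are the phases
phase_means_alt <- colMeans(matrix(dvy, nrow = 6))

# Labels in the "M = 49.02" style asked for in question 1
phase_labels <- paste("M =", round(phase_means, 2))
print(phase_labels)
```

Because the phases are equal-length and contiguous, both approaches should agree with running(dvy, width = 6, by = 6); tapply() has the advantage of still working if phases have unequal lengths.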
Re: [R] two questions about xyplot
Hi,

Getting text to show on the panel plots is a bit trickier, but doable.

# append to the dataset the mean for each group and line
d66df_mns <- cbind(d66df, Means = c(rep(c(mn1, mn2, mn3), each = 6)))

# set ylim to extend a bit above the data to leave room for the means
p <- xyplot(dvy ~ sessidx | case, groups = numph, data = d66df_mns,
            col = c(1:4), layout = c(1, 3),
            xlab = "Sessions", ylab = "Number of Seconds",
            ylim = c(min(d66df_mns$dvy), 110), type = "l")

# pass in the means as an argument to the panel function
update(p, panel = function(x, y, means = d66df_mns$Means, ...) {
  # print(list(...))
  # store the group index of each point in this panel
  grps <- list(...)$groups[list(...)$subscripts]
  unique_indices <- !duplicated(grps)
  # get the mean for each line in this panel
  mean_1 <- means[list(...)$subscripts][unique_indices]
  print(mean_1)
  print(x[unique_indices])
  panel.xyplot(x, y, ...)
  panel.abline(v = 6.5)
  panel.abline(v = 12.5)
  panel.abline(v = 18.5)
  # print the mean values here
  panel.text(x[unique_indices], 100, paste("M = ", round(mean_1, 2)),
             adj = c(0, 0))
})

If you are working in lattice a lot, print(list(...)) is a handy trick: it shows which argument values are being passed in as ... to the panel function.

Hope that helps.

Richard
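The label placement in the panel function above relies on !duplicated() to pick out the first point of each phase. A minimal base-R sketch of that step follows; the vectors are stand-ins that mimic what lattice would hand to the second of three panels (rows 25 to 48 of a 72-row data set), which is an assumption made for illustration.

```r
# Stand-ins for what the panel function sees (assumed layout: 3 cases x 24 sessions)
groups     <- rep(rep(1:4, each = 6), times = 3)  # phase index for all 72 rows
subscripts <- 25:48                               # rows belonging to the second panel
x          <- 1:24                                # session index within the panel

# Phase index of each point in this panel
grps <- groups[subscripts]

# TRUE only at the first point of each phase
first_in_phase <- !duplicated(grps)

# x positions where the "M = ..." labels would be anchored
label_x <- x[first_in_phase]
print(label_x)
```

Anchoring at the first session of each phase, together with adj = c(0, 0), left-aligns each label just after the preceding phase boundary.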
Re: [R] two questions about xyplot
Richard,

This worked perfectly (adding # before update). Thank you so much for your help. I've bought a couple of books on R Graphics so I can learn this stuff better.

Will
[R] Two questions about R2BayesX package
Dear All,

I have two questions regarding the use of the R2BayesX package for Bayesian analysis.

First, is it possible to generate predictions based on the fitted model? According to Gelman and Hill (2007, pp. 361-363), there are at least two ways to do this in BUGS: (1) generate additional data points with the dependent variable coded as missing (and all the independent variables fixed at the desired levels) and let BUGS fill in the values; (2) use R to combine the estimated BUGS results and data values to get the new predictions. Can these be done with R2BayesX? Can someone offer some examples?

Second, I wonder whether R2BayesX can estimate grouped logistic regression models. One example is the Surgical example in the BUGS example collection (http://mathstat.helsinki.fi/openbugs/Examples/Surgical.html), where the dependent variable consists of (1) the number of deaths and (2) the total number of patients.

Many thanks for the help.

Best,
Shige

Reference

Gelman, A., and J. Hill. 2007. Data analysis using regression and multilevel/hierarchical models. Cambridge, England: Cambridge University Press.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] two questions about character manipulation
Dear all,

I want to manipulate a character string such as

ex <- "cbind(data$response1,data$response2)"

in R in two ways:

1) extracting the response1 portion of ex
2) replacing $ with .

Is it possible to do these efficiently in R?

Best
Ozgur

--
View this message in context: http://r.789695.n4.nabble.com/two-questions-about-character-manipulation-tp4643292.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] two questions about character manipulation
On Sun, Sep 16, 2012 at 3:35 PM, Özgür Asar oa...@metu.edu.tr wrote:

> I want to manipulate a character string such as
> ex <- "cbind(data$response1,data$response2)"
> in R in two ways:
> 1) extracting the response1 portion of ex

I'm not sure what you mean by portion -- if you just want "response1", why do you need to process ex? You will probably wind up wanting to use gsub() and putting in "" for the things which aren't response1, but again, it seems impractical...

> 2) replacing $ with .

gsub("$", ".", ex, fixed = TRUE)

Cheers,
Michael
Re: [R] two questions about character manipulation
Hello,

Try the following.

1)
pattern <- "response."
m <- regexpr(pattern, ex)   # use gregexpr to get all matches of "response."
regmatches(ex, m)

2)
gsub("\\$", "\\.", ex)

Hope this helps,

Rui Barradas
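The two answers to question 2 differ in how they treat "$", which is a regular-expression anchor unless it is escaped or fixed = TRUE is set. A short sketch of the difference, using the string from the original question (the variable names as_regex, as_fixed, and escaped are illustrative only):

```r
ex <- "cbind(data$response1,data$response2)"

# Unescaped "$" is the end-of-string anchor, so nothing inside the string is replaced
as_regex <- gsub("$", ".", ex)

# fixed = TRUE treats "$" as a literal character
as_fixed <- gsub("$", ".", ex, fixed = TRUE)

# Escaping the "$" has the same effect under regex matching
escaped <- gsub("\\$", ".", ex)

print(as_regex)
print(as_fixed)
print(escaped)
```

In the replacement string the dot needs no escaping; it is only special on the pattern side.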
Re: [R] two questions about character manipulation
Dear Rui Barradas and Michael Weylandt,

Many thanks for your replies. My second question is solved now.

But I think I did not express my first request clearly.

In

ex <- "cbind(data$response1,data$response2)"

I want to extract the variable name between "$" and "," (response1 in this example) and the one between "$" and ")" (response2).

These symbols ("$", ",", ")") are always the same, but the names (response1, response2) might change from one data set to another.

Best
Ozgur
Re: [R] two questions about character manipulation
Hello,

This should do it. You can collapse the first two instructions into one, but I've left them separate for clarity.

s <- unlist(strsplit(ex, "[,)[:blank:]]"))
s <- gsub("^.*\\$", "", s)
s[nchar(s) > 0]

Rui Barradas
Re: [R] two questions about character manipulation
Hi,

Try this:

ex <- "cbind(data$response1,data$response2)"
gsub(".*\\(.*\\$(.*)\\,.*\\$.*\\)", "\\1", ex)
#[1] "response1"
unlist(strsplit(gsub(".*\\(.*\\$(.*)\\,.*\\$(.*)\\)", "\\1 \\2", ex), " "))
#[1] "response1" "response2"

A.K.
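Another way to pull out every name that follows a "$", regardless of how many arguments cbind() has, is to match rather than substitute. This is only a sketch, not one of the posted answers: it assumes the names never contain "," or ")" and uses a Perl-style lookbehind with gregexpr() and regmatches(). The second input string is a made-up example.

```r
ex <- "cbind(data$response1,data$response2)"

# Match runs of characters that are preceded by "$" and contain no "," or ")"
m    <- gregexpr("(?<=\\$)[^,)]+", ex, perl = TRUE)
vars <- regmatches(ex, m)[[1]]
print(vars)

# The same pattern applied to a hypothetical call with three differently named columns
ex3   <- "cbind(d$height,d$weight,d$age)"
vars3 <- regmatches(ex3, gregexpr("(?<=\\$)[^,)]+", ex3, perl = TRUE))[[1]]
print(vars3)
```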
Re: [R] two questions re: the use of lattice (Q1 SOLVED, not Q2)
Dear friends,

Many thanks to Jim (Holtman) and David (Carlson) for their quick responses: Q1 is now solved. There are two almost equivalent ways of doing this. They follow:

library(lattice)

z <- cbind(rep(c("BIC", "ICL", "s_v", "Q_v", "sig-q", "s_lsk", "s_lML",
                 "s_mlsk", "s_mlML", "s_la8", "s_haar"), each = 250),
           rep(c(5, 10, 20, 30, 50), each = 50))
z <- rbind(cbind(z, 0), cbind(z, 20), cbind(z, 40))
z <- cbind(z, rnorm(n = nrow(z)))
z <- as.data.frame(z)
names(z) <- c("Method", "sigma", "INU", "Error")

sigma <- as.numeric(levels(z$sigma))
sigmaExprList <- lapply(sigma, function(s) bquote(italic(sigma) == .(s)))
sigmaExpr <- as.expression(sigmaExprList)

z$Method <- factor(z$Method,
                   levels = c("BIC", "ICL", "s_v", "Q_v", "sig-q", "s_lsk",
                              "s_lML", "s_mlsk", "s_mlML", "s_la8", "s_haar"))

bwplot(Error ~ Method | sigma, data = z[z[, "INU"] == 0, ],
       scales = list(rot = 90), horiz = FALSE,
       xlab = "Method", ylab = "Relative Error",
       strip = function(which.given, which.panel, var.name,
                        strip.levels = FALSE, strip.names = TRUE, ...) {
         strip.default(which.given, which.panel,
                       var.name = sigmaExpr[which.panel],
                       strip.levels = FALSE, strip.names = TRUE, ...)
       },
       layout = c(1, 5), col = "red")

# On the other hand, if we do not want to change z$Method permanently,
# we could do the following:

z <- cbind(rep(c("BIC", "ICL", "s_v", "Q_v", "sig-q", "s_lsk", "s_lML",
                 "s_mlsk", "s_mlML", "s_la8", "s_haar"), each = 250),
           rep(c(5, 10, 20, 30, 50), each = 50))
z <- rbind(cbind(z, 0), cbind(z, 20), cbind(z, 40))
z <- cbind(z, rnorm(n = nrow(z)))
z <- as.data.frame(z)
names(z) <- c("Method", "sigma", "INU", "Error")

sigma <- as.numeric(levels(z$sigma))
sigmaExprList <- lapply(sigma, function(s) bquote(italic(sigma) == .(s)))
sigmaExpr <- as.expression(sigmaExprList)

bwplot(Error ~ factor(Method, levels = unique(Method)) | sigma,
       data = z[z[, "INU"] == 0, ],
       scales = list(rot = 90), horiz = FALSE,
       xlab = "Method", ylab = "Relative Error",
       strip = function(which.given, which.panel, var.name,
                        strip.levels = FALSE, strip.names = TRUE, ...) {
         strip.default(which.given, which.panel,
                       var.name = sigmaExpr[which.panel],
                       strip.levels = FALSE, strip.names = TRUE, ...)
       },
       layout = c(1, 5), col = "red")

However, I am unable to solve Q2. Actually, even more basic is the fact that I cannot get box-and-whisker plots on their own, without anything else. David's suggestion of using useOuterStrips appears reasonable, but as I said, I cannot get anything meaningful even before that step.

# Try:
bwplot(Error~Methods | sigma + INU, data = z, scales = list(rot = 90))

Any suggestions?

Many thanks again!

Best wishes,
Ranjan
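The strip labels above depend on the order of the factor levels of sigma, which are sorted as strings rather than as numbers. The sketch below isolates that step in base R. It builds the factor explicitly with factor() instead of relying on as.data.frame() to convert strings (an assumption made so the snippet behaves the same across R versions), and the names sigma_chr and sigma_fac are illustrative only.

```r
# sigma values as they appear after being stored in a character matrix
sigma_chr <- as.character(rep(c(5, 10, 20, 30, 50), each = 2))
sigma_fac <- factor(sigma_chr)

# Levels are sorted as strings, so "5" lands after "30"
print(levels(sigma_fac))

# Numeric values in level order, i.e. the order in which panels are drawn
sigma <- as.numeric(levels(sigma_fac))

# One plotmath label per level; .(s) splices in the numeric value
sigmaExpr <- as.expression(
  lapply(sigma, function(s) bquote(italic(sigma) == .(s)))
)
print(sigmaExpr)
```

Because the labels are built from the levels in their stored order, each label still lines up with its panel even though the panels are not in increasing numeric order.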
Re: [R] two questions re: the use of lattice (Q1 SOLVED, not Q2)
On 2012-07-22 09:02, Ranjan Maitra wrote:

> However, I am unable to solve Q2.
> [...]
> Any suggestions?

[I had to dig back to see what your Q2 was. It's good to keep context.]

Try this:

p <- bwplot(Error ~ Method | sigma + INU, data = z,
            scales = list(rot = 90), horiz = FALSE,
            layout = c(5, 3), col = "red")

require(latticeExtra)

useOuterStrips(p,
               strip = <your sigma-strip function>,
               strip.left = strip.custom(var.name = "INU", sep = " = ",
                                         strip.names = TRUE))

Peter Ehlers
Re: [R] two questions re: the use of lattice (Q1 SOLVED, not Q2)
On Sun, 22 Jul 2012 15:04:36 -0700 Peter Ehlers ehl...@ucalgary.ca wrote:

This works!! Thanks very much!!

May I also ask how I can put in a % after the INU value? I.e., I want the labels to be INU = 0%, INU = 20%, INU = 40% (instead of INU = 0, INU = 20 and INU = 40).

Many thanks again!

Ranjan
Re: [R] two questions re: the use of lattice (Q1 SOLVED, not Q2)
On 2012-07-22 15:58, Ranjan Maitra wrote:

> May I also ask how I can put in a % after the INU value?

Define:

INUExpr <- paste0("INU = ", c(0, 20, 40), "%")

Then

useOuterStrips(p,
               strip = <your sigma-strip function>,
               strip.left = <your strip function with
                             var.name = INUExpr[which.panel]>)

Peter Ehlers
Re: [R] two questions re: the use of lattice (Q1 SOLVED, not Q2)
> Define:
>
> INUExpr <- paste0("INU = ", c(0, 20, 40), "%")
>
> Then
>
> useOuterStrips(p,
>     strip = your sigma-strip function,
>     strip.left = your strip function with
>                  var.name = INUExpr[which.panel]
> )
>
> Peter Ehlers

Thanks, Peter! I have been trying things like this all afternoon, but I still get the same problem.

The following, from your earlier e-mail, works (cut-and-paste should be fine):

library(lattice)
library(latticeExtra)
z <- cbind(rep(c("BIC", "ICL", "s_v", "Q_v", "sig-q", "s_lsk", "s_lML",
                 "s_mlsk", "s_mlML", "s_la8", "s_haar"), each = 250),
           rep(c(5, 10, 20, 30, 50), each = 50))
z <- rbind(cbind(z, 0), cbind(z, 20), cbind(z, 40))
z <- cbind(z, rnorm(n = nrow(z)))
z <- as.data.frame(z)
names(z) <- c("Method", "sigma", "INU", "Error")
sigma <- as.numeric(levels(z$sigma))
sigmaExprList <- lapply(sigma, function(s) bquote(italic(sigma) == .(s)))
sigmaExpr <- as.expression(sigmaExprList)

p <- bwplot(Error ~ factor(Method, levels = unique(Method)) | sigma + INU,
            data = z, scales = list(rot = 90), horiz = FALSE,
            layout = c(3, 5), col = "red")

require(latticeExtra)

useOuterStrips(p,
    strip = function(which.given, which.panel, var.name,
                     strip.levels = FALSE, strip.names = TRUE, ...) {
      strip.default(which.given, which.panel,
                    var.name = sigmaExpr[which.panel],
                    strip.levels = FALSE, strip.names = TRUE, ...)
    },
    strip.custom(var.name = "INU", sep = " = ", strip.names = TRUE))

However, this (edition of the latter) does not:

INUExpr <- paste0("INU = ", c(0, 20, 40), "%")

useOuterStrips(p,
    strip = function(which.given, which.panel, var.name,
                     strip.levels = FALSE, strip.names = TRUE, ...) {
      strip.default(which.given, which.panel,
                    var.name = sigmaExpr[which.panel],
                    strip.levels = FALSE, strip.names = TRUE, ...)
    },
    strip.left = function(which.given, which.panel, var.name,
                          strip.levels = FALSE, strip.names = TRUE, ...) {
      strip.custom(var.name = INUExpr[which.panel],
                   strip.names = TRUE, ...)
    }
)

Clearly, I am doing something wrong in defining the strip function. (It does not help that I am not completely at home with it.)

Can you or someone please help some more?

Many thanks again!

Best wishes,
Ranjan
Re: [R] two questions re: the use of lattice (Q1 SOLVED, not Q2)
On 2012-07-22 18:03, Ranjan Maitra wrote:
> However, this (edition of the latter) does not:
>
> INUExpr <- paste0("INU = ", c(0, 20, 40), "%")
>
> useOuterStrips(p,
>     strip = function(which.given, which.panel, var.name,
>                      strip.levels = FALSE, strip.names = TRUE, ...) {
>       strip.default(which.given, which.panel,
>                     var.name = sigmaExpr[which.panel],
>                     strip.levels = FALSE, strip.names = TRUE, ...)
>     },
>     strip.left = function(which.given, which.panel, var.name,
>                           strip.levels = FALSE, strip.names = TRUE, ...) {
>       strip.custom(var.name = INUExpr[which.panel],
>                    strip.names = TRUE, ...)
>     }
> )
>
> Clearly, I am doing something wrong in defining the strip function.
> (It does not help that i am not completely at home with it.)
>
> Can you or someone please help some more?

Actually, it would work if you made strip.left a more accurate copy of your strip function. But I hadn't taken a close enough look at your strip function. I don't think it's needed at all. Try this:

p <- bwplot(Error ~ Method | sigma + INU, data = z,
            scales = list(rot = 90), horiz = FALSE,
            layout = c(5, 3), col = "red")

useOuterStrips(p,
    strip = strip.custom(factor.levels = sigmaExpr),
    strip.left = strip.custom(factor.levels = INUExpr)
)

BTW, you don't need the italic() on sigma - it does nothing in plotmath.

Peter Ehlers

> Many thanks again!
>
> Best wishes,
> Ranjan
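For readers who want to try the strip.custom(factor.levels = ...) approach end to end, here is a minimal self-contained sketch. It is not the code from the thread: the data frame is built with expand.grid() and a shortened, illustrative list of methods (both assumptions), and the useOuterStrips() step only runs if latticeExtra happens to be installed.

```r
library(lattice)

set.seed(1)
methods <- c("BIC", "ICL", "s_v")   # shortened list, for illustration only

# Assumed data layout: 10 replicates per Method x sigma x INU combination.
z <- expand.grid(rep = 1:10, Method = methods,
                 sigma = c(5, 10, 20, 30, 50), INU = c(0, 20, 40))
z$Error  <- rnorm(nrow(z))
z$Method <- factor(z$Method, levels = methods)
z$sigma  <- factor(z$sigma, levels = c(5, 10, 20, 30, 50))
z$INU    <- factor(z$INU)

# One label per factor level: plotmath expressions for sigma, plain text for INU.
sigmaExpr <- as.expression(lapply(as.numeric(levels(z$sigma)),
                                  function(s) bquote(sigma == .(s))))
INUExpr <- paste0("INU = ", levels(z$INU), "%")

p <- bwplot(Error ~ Method | sigma + INU, data = z,
            scales = list(rot = 90), horizontal = FALSE,
            layout = c(5, 3), col = "red")

# Move the strips to the outer margins when latticeExtra is available.
if (requireNamespace("latticeExtra", quietly = TRUE)) {
  p <- latticeExtra::useOuterStrips(p,
         strip = strip.custom(factor.levels = sigmaExpr),
         strip.left = strip.custom(factor.levels = INUExpr))
}
# print(p) draws the 5 x 3 grid of box-and-whisker panels.
```

Passing the labels through factor.levels keeps them tied to the factor levels rather than to panel positions, which is why no hand-written strip function is needed.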
Re: [R] two questions re: the use of lattice (Q1 SOLVED, not Q2)
On Sun, 22 Jul 2012 18:58:39 -0700 Peter Ehlers ehl...@ucalgary.ca wrote:
> Actually, it would work if you made strip.left a more accurate
> copy of your strip function. But I hadn't taken a close enough
> look at your strip function. I don't think it's needed at all.
> Try this:
>
> p <- bwplot(Error ~ Method | sigma + INU, data = z,
>             scales = list(rot = 90), horiz = FALSE,
>             layout = c(5, 3), col = "red")
>
> useOuterStrips(p,
>     strip = strip.custom(factor.levels = sigmaExpr),
>     strip.left = strip.custom(factor.levels = INUExpr)
> )
>
> BTW, you don't need the italic() on sigma - it does nothing in plotmath.

Yes, thanks for this and also the corrections.

One last question (I guess): my sigmas are no longer in the order of 5, 10, 20, 30, 50. How do I fix that?

Thanks again!!
Ranjan
Re: [R] two questions re: the use of lattice (Q1 SOLVED, not Q2)
On 2012-07-22 19:09, Ranjan Maitra wrote:
> Yes, thanks for this and also the corrections.
>
> One last question (I guess): my sigmas are no longer in the order of
> 5, 10, 20, 30, 50. How do I fix that?
>
> Thanks again!!
> Ranjan

Just reset the levels of z$sigma (and also redefine sigmaExpr):

z$sigma <- factor(z$sigma, levels = c(5, 10, 20, 30, 50))  # new levels order

sigmaExprList <- lapply(as.numeric(levels(z$sigma)),
                        function(s) bquote(sigma == .(s)))
sigmaExpr <- as.expression(sigmaExprList)

INUExpr <- paste0("INU = ", c(0, 20, 40), "%")

p <- bwplot(Error ~ Method | sigma + INU, data = z,
            scales = list(rot = 90), horiz = FALSE,
            layout = c(5, 3), col = "red")

useOuterStrips(p,
    strip = strip.custom(factor.levels = sigmaExpr),
    strip.left = strip.custom(factor.levels = INUExpr)
)

Peter Ehlers
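The reason the sigma panels came out in an unexpected order can be seen in base R alone. The sketch below assumes, as in the thread, that the sigma values reached the data frame as text (the data were built from a character matrix); the variable names are illustrative.

```r
# sigma values stored as text, as happens when a data frame is built
# from a character matrix.
sigma_chr <- rep(c("5", "10", "20", "30", "50"), each = 2)

# Default factor(): levels are sorted as strings, so "5" lands after "30".
f_default <- factor(sigma_chr)
levels(f_default)

# Explicit levels restore the numeric order.
f_ordered <- factor(sigma_chr, levels = c(5, 10, 20, 30, 50))
levels(f_ordered)

# Strip labels rebuilt from the reordered levels stay aligned with the panels.
sigmaExpr <- as.expression(lapply(as.numeric(levels(f_ordered)),
                                  function(s) bquote(sigma == .(s))))
```

Note that in R 4.0.0 and later, as.data.frame() no longer converts strings to factors by default, so levels(z$sigma) would be NULL until the column is wrapped in factor() explicitly.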
Re: [R] two questions re: the use of lattice (Q1 SOLVED, not Q2)
> Just reset the levels of z$sigma (and also redefine sigmaExpr):
>
> z$sigma <- factor(z$sigma, levels = c(5, 10, 20, 30, 50))  # new levels order
>
> sigmaExprList <- lapply(as.numeric(levels(z$sigma)),
>                         function(s) bquote(sigma == .(s)))
> sigmaExpr <- as.expression(sigmaExprList)
>
> INUExpr <- paste0("INU = ", c(0, 20, 40), "%")
>
> p <- bwplot(Error ~ Method | sigma + INU, data = z,
>             scales = list(rot = 90), horiz = FALSE,
>             layout = c(5, 3), col = "red")
>
> useOuterStrips(p,
>     strip = strip.custom(factor.levels = sigmaExpr),
>     strip.left = strip.custom(factor.levels = INUExpr)
> )

One last question: how do I draw a line h = 0, lty = 2 through each plot?

Thanks a lot, this has been quite a learning experience for me wrt lattice!

Ranjan
Re: [R] two questions re: the use of lattice (Q1 SOLVED, not Q2)
There's a typo below. It's Deepayan Sarkar.

-- Bert

On Sun, Jul 22, 2012 at 9:55 PM, Bert Gunter bgun...@gene.com wrote:
> inline.
>
> -- Bert
>
> On Sun, Jul 22, 2012 at 8:26 PM, Ranjan Maitra maitra.mbox.igno...@inbox.com wrote:
>> One last question: how do I draw a line h = 0, lty =2 through each plot?
>
> ?panel.abline
>
> If you don't know how to use this in a panel function, time to start
> doing your own homework. Panel functions are central to the trellis
> paradigm, so an honest effort to learn these details will be worth the
> effort. A good place to start is the examples in the above help file.
> There are also numerous online tutorials and websites. Google. Also
> check Seepyan Sarkar's (lattice's author) web page.
>
> -- Bert
>
>> Thanks a lot, this has been quite a learning experience for me wrt lattice!
>>
>> Ranjan

--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info: Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
Re: [R] two questions re: the use of lattice (Q1 SOLVED, not Q2)
On Sun, 22 Jul 2012 22:05:14 -0700 Bert Gunter gunter.ber...@gene.com wrote:
> On Sun, Jul 22, 2012 at 8:26 PM, Ranjan Maitra maitra.mbox.igno...@inbox.com wrote:
>> One last question: how do I draw a line h = 0, lty =2 through each plot?
>
> ?panel.abline
>
> If you don't know how to use this in a panel function, time to start
> doing your own homework. Panel functions are central to the trellis
> paradigm, so an honest effort to learn these details will be worth the
> effort. A good place to start is the examples in the above help file.
> There are also numerous online tutorials and websites. Google. Also
> check Deepayan Sarkar's (lattice's author) web page.

Thanks a lot for your tip! I have always wondered where to get more help on figuring out these things, so it will be very useful for me: it will definitely help me to work out how to do what you suggest in a panel function.

Best wishes,
Ranjan
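As a pointer for anyone following up on the ?panel.abline hint, the sketch below shows one common way to add a dashed reference line at zero to every panel: a custom panel function that draws the line and then delegates to the default box-and-whisker panel. The toy data frame and its column names (d, g, cond, y) are made up for illustration and do not come from the thread.

```r
library(lattice)

set.seed(1)
# Toy data (illustrative): three groups, two conditioning levels.
d <- data.frame(g    = gl(3, 20, labels = c("A", "B", "C")),
                cond = gl(2, 10, 60, labels = c("u", "v")),
                y    = rnorm(60))

p <- bwplot(y ~ g | cond, data = d, horizontal = FALSE,
            panel = function(...) {
              panel.abline(h = 0, lty = 2)  # dashed horizontal line at 0
              panel.bwplot(...)             # then the usual boxes on top
            })
# print(p) renders the plot; the line appears in each panel.
```

Drawing the line before panel.bwplot() keeps it behind the boxes; swapping the two calls puts it in front.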
[R] two questions re: the use of lattice
Dear friends,

I have two questions regarding the use of lattice. First some code:

## begin code

z <- cbind(rep(c("BIC", "ICL", "s_v", "Q_v", "sig-q", "s_lsk", "s_lML",
                 "s_mlsk", "s_mlML", "s_la8", "s_haar"), each = 250),
           rep(c(5, 10, 20, 30, 50), each = 50))
z <- rbind(cbind(z, 0), cbind(z, 20), cbind(z, 40))
z <- cbind(z, rnorm(n = nrow(z)))
z <- as.data.frame(z)
names(z) <- c("Method", "sigma", "INU", "Error")
sigma <- as.numeric(levels(z$sigma))
sigmaExprList <- lapply(sigma, function(s) bquote(italic(sigma) == .(s)))
sigmaExpr <- as.expression(sigmaExprList)

bwplot(Error ~ Method | sigma, data = z[z[, "INU"] == 0, ],
       scales = list(rot = 90), horiz = F,
       xlab = "Method", ylab = "Relative Error",
       strip = function(which.given, which.panel, var.name,
                        strip.levels = FALSE, strip.names = TRUE, ...) {
         strip.default(which.given, which.panel,
                       var.name = sigmaExpr[which.panel],
                       strip.levels = FALSE, strip.names = TRUE, ...)
       },
       layout = c(5, 1), col = "red")

## end code

Question 1: how do I force the display of the Method in the plotting to be in the same order (i.e., in the order of BIC, ICL, s_v, Q_v, sig-q, s_lsk, s_lML, s_mlsk, s_mlML, s_la8, s_haar) as the input? As you may notice, it puts them in its own merry order (I suspect in ascii alphabetical order, but that conjecture is based entirely on my very few sample attempts).

Question 2: I want to have 3x5 plots of the respective boxplots. Something like: Error ~ Method | sigma + INU? But I want the labels for the sigma and the INU to be only in the column and the rows (vertically here) as appropriate, in order to save plotting space. How do I go about doing this?

Please reply through the mailing list so that others may also benefit. In any case, many thanks again for reading and for any help and pointers!

Best wishes,
Ranjan

--
Important Notice: This mailbox is ignored: e-mails are set to be deleted on receipt. For those needing to send personal or professional e-mail, please use appropriate addresses.
Re: [R] two questions re: the use of lattice
Answer to your first question: try this at the start of bwplot to specify the ordering:

bwplot(Error ~ factor(Method, levels = unique(Method)) | sigma, ...)

On Sat, Jul 21, 2012 at 2:42 PM, Ranjan Maitra maitra.mbox.igno...@inbox.com wrote:
> Question 1: how do I force the display of the Method in the plotting
> to be in the same order (i.e., in the order of BIC, ICL, s_v, Q_v,
> sig-q, s_lsk, s_lML, s_mlsk, s_mlML, s_la8, s_haar) as the input.
> As you may notice, it puts them in its own merry order (I suspect in
> ascii alphabetical order, but that conjecture is based entirely on my
> very few sample attempts).

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
Re: [R] two questions re: the use of lattice
Run this before the bwplot() command:

z$Method <- factor(z$Method, levels = c("BIC", "ICL", "s_v", "Q_v", "sig-q",
    "s_lsk", "s_lML", "s_mlsk", "s_mlML", "s_la8", "s_haar"))

I don't have an answer for the 2nd question. Seems like it must be possible.

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of jim holtman
Sent: Saturday, July 21, 2012 5:57 PM
To: Ranjan Maitra
Cc: r-help@r-project.org
Subject: Re: [R] two questions re: the use of lattice

> Answer to you first question, try this at the start of bwplot to
> specify ordering:
>
> bwplot(Error~factor(Method, levels = unique(Method))
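The two answers to Question 1 (levels = unique(Method) inside the formula, or an explicit levels vector assigned back to z$Method) give the same ordering whenever the data already list the methods in the desired order. A small base R check, using a shortened and purely illustrative list of method names:

```r
# Shortened method list (illustrative); rows appear in the desired order.
method_names <- c("BIC", "ICL", "s_v", "Q_v", "sig-q")
Method <- rep(method_names, each = 3)

# Default: levels are sorted, which is what scrambles the plot order.
levels(factor(Method))

# Order of first appearance in the data.
by_appearance <- factor(Method, levels = unique(Method))

# Explicitly spelled-out order.
by_explicit <- factor(Method, levels = method_names)

identical(by_appearance, by_explicit)
```

The unique() form is shorter but depends on the row order of the data; the explicit form keeps working after the data frame has been sorted or subsetted.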
Re: [R] two questions re: the use of lattice
Take a look at useOuterStrips() in package latticeExtra.

--- David

-----Original Message-----
From: David L Carlson [mailto:dcarl...@tamu.edu]
Sent: Saturday, July 21, 2012 6:51 PM
To: 'jim holtman'; 'Ranjan Maitra'
Cc: 'r-help@r-project.org'
Subject: RE: [R] two questions re: the use of lattice

> I don't have an answer for the 2nd question. Seems like it must be
> possible.
[R] Two Questions
Sorry for the somewhat nondescript subject line, but I have two questions:

1. What is a really good book on R for a nonprogrammer?

2. How do I open more than one R Graphics: Device 2 (ACTIVE)? That is, what is the R command that I can use to keep more than one plot open? I am running a script from a book on chemometrics that produces more than one graph during execution, but it seems that R deletes each graph when the script calls for the next plot.

Thanks in advance

Stephen P. Molnar, Ph.D. Life is a fuzzy set Foundation for Chemistry Stochastic and multivariate http://www.FoundationForChemistry.com
Re: [R] Two Questions
On 20/04/2011 9:23 AM, Stephen P Molnar wrote:

> Sorry for the somewhat nondescript subject line, but I have two questions:
>
> 1. What is a really good book on R for a nonprogrammer?

Any book that teaches you the basics of programming would be good; it doesn't need to be about R. If you want to use R and remain a nonprogrammer, you will not have an easy time.

> 2. How do I open more than one R Graphics: Device 2 (ACTIVE)? That is, what is the R command that I can use to keep more than one plot open? I am running a script from a book on chemometrics that produces more than one graph during execution, but it seems that R deletes each graph when the script calls for the next plot.

dev.new() will open a new plot window, and subsequent plotting commands will be drawn there. dev.set() lets you switch back to drawing on the original one.

Duncan Murdoch

> Thanks in advance
> Stephen P. Molnar, Ph.D. Life is a fuzzy set Foundation for Chemistry Stochastic and multivariate http://www.FoundationForChemistry.com
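A minimal sketch of the dev.new() / dev.set() workflow described above, using only base graphics. The variable names first and second are illustrative. In an interactive session each dev.new() opens a window; in a non-interactive session it would normally open a file device instead.

```r
dev.new()                 # open a first graphics device
first <- dev.cur()        # remember its device number
plot(1:10, main = "First plot")

dev.new()                 # open a second device; it becomes the active one
second <- dev.cur()
hist(rnorm(100), main = "Second plot")

dev.set(first)            # make the first device active again
abline(h = 5, lty = 2)    # this line is added to the first plot

dev.list()                # lists every open device
```

Both plots stay open until they are closed with dev.off() or graphics.off().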
Re: [R] Two Questions
On Apr 20, 2011, at 9:23 AM, Stephen P Molnar wrote:

> Sorry for the somewhat nondescript subject line, but I have two questions:
>
> 1. What is a really good book on R for a nonprogrammer?
>
> 2. How do I open more than one R Graphics: Device 2 (ACTIVE)? That is, what is the R command that I can use to keep more than one plot open?

You can have more than one device available, but you need to address them serially. Only one device can receive input at a time.

?Devices
?dev.set

> I am running a script from a book on chemometrics that produces more than one graph during execution, but it seems that R deletes each graph when the script calls for the next plot.

More likely you are seeing one graph displayed at a time on the screen device. On my screen device, cmd-left-arrow will bring up prior plots to a depth of 15 earlier results.

-- David Winsemius, MD West Hartford, CT
Re: [R] Two Questions
From: Stephen P Molnar s.mol...@sbcglobal.net
Subject: [R] Two Questions
To: R-help r-help@r-project.org
Received: Wednesday, April 20, 2011, 9:23 AM

> 1. What is a really good book on R for a nonprogrammer?

Have a look at the books listed on the R website. Books by Peter Dalgaard, Phil Spector, Michael Crawley, and John Verzani are all possibilities. Also have a look at the Contributed Documentation page on the site. It has some very useful material.
Re: [R] Two Questions
When running a large number of commands from a script that produces multiple plots, it is often best to send the plots to the pdf device (or another file device) so that you can page through them after the script has finished. You could also specify par(ask=TRUE); then you would be prompted before each change of plot (but other code would not execute in the meantime either).

-- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Stephen P Molnar
Sent: Wednesday, April 20, 2011 7:23 AM
To: R-help
Subject: [R] Two Questions

Sorry for the somewhat nondescript subject line, but I have two questions:

1. What is a really good book on R for a nonprogrammer?

2. How do I open more than one R Graphics: Device 2 (ACTIVE)? That is, what is the R command that I can use to keep more than one plot open? I am running a script from a book on chemometrics that produces more than one graph during execution, but it seems that R deletes each graph when the script calls for the next plot.

Thanks in advance

Stephen P. Molnar, Ph.D. Life is a fuzzy set Foundation for Chemistry Stochastic and multivariate http://www.FoundationForChemistry.com
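The pdf-device approach could look roughly like this. It is a sketch only; the file name all_plots.pdf and the three random plots are placeholders, not anything from the original script.

```r
out <- file.path(tempdir(), "all_plots.pdf")   # hypothetical output file

pdf(out)                      # every plot from here on goes into the file
for (i in 1:3) {
  # each high-level plotting call starts a new page in the PDF
  plot(rnorm(50), main = paste("Plot", i))
}
dev.off()                     # close the device so the file is complete
```

For the interactive alternative, par(ask = TRUE) before the plotting commands makes R wait for a key press before it replaces the current plot.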
[R] Two questions about metacharacter in regexprs and function return
For the script, please see below. My problems occur at line 10 and line 13. First, I am trying to retrieve the official gene name from a column of a table. The official name follows the pattern gene_name '...'. The detailed questions are in the comments on the corresponding lines. Any suggestions are appreciated.

Example of the text to match (I want to extract Nnat; sometimes the value is the string N/A):

AB004048|MM8;NCBI Build 36|transcript|chr2|157251580|157253958|ExemplarFor 'AB004048'; gene_id '18111'; transcript_id 'AB004048'; gene_name 'Nnat'; alt '5730414I02Rik|AW107673|Peg5'; neuronatin|http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene&cmd=Retrieve&dopt=full_report&list_uids=18111;

    #obtain the exprs matrix for cluster analysis
    #ask questions
    DEG_files <- grep("bak", dir()); #pay attention to the filenames
    exprs_files <- grep("copy", dir());
    protein <- c();
    assign_exprs <- function(files, protein) {
      #use to find the DEGs or exprs for cmeans clustering
      for(i in 1:length(files)) {
        microarray_data <- read.csv(file = files[i], header = T, sep = "\t");
        microarray_data[, 7] <- gsub("([\\s\\S]+gene_name '(\\w*)';.+)", "\\2", microarray_data[, 7], perl = T); #why does [\\w]* not work? [(\\w*)(N/A)] does not work either.
        assign(files[i], microarray_data, envir=.GlobalEnv); #get(dir()[i]() can obtain the data of interest. `variable_names` can also work
        protein <- c(protein, get(files[i])[, 7]); #used to obtain all the DEGs only
      } #return protein; #why does this line not work?
      assign("all_protein", protein, envir=.GlobalEnv);
    }
    exist_to_cluster_exprs <- function(x, cluster_exprs, all) {
      if(exists(all, x[1])){ #exists function
        cluster_exprs <- cbind(cluster_exprs, x);
      }
      #return cluster_exprs;
    }
    assign_exprs(dir()[DEG_files], protein);
    all_protein <- unique(all_protein);
    assign_exprs(dir()[exprs_files], protein);
    for(i in 1:2) {
      apply(get(dir()[exprs_files[i]]), 1, exist_to_cluster_exprs, cluster_exprs, all);
      #assign(paste(exprs_files()[i], exprs_data), cluster_exprs[, c(2, 3, 5, 7)];
      exprs_data <- cbind(exprs_data, cluster_exprs[, 3]);
    }
    exprs_data;

-- View this message in context: http://r.789695.n4.nabble.com/Two-questions-about-metacharacter-in-regexprs-and-function-return-tp3432692p3432692.html Sent from the R help mailing list archive at Nabble.com.
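The archive shows no reply to this post, so the following is only a hedged sketch of one way to handle both points. Two facts about R are relevant: square brackets in a regular expression define a character class that matches a single character, so [(\\w*)(N/A)] is not an alternation; and return() is a function call, so it needs parentheses. The helper names extract_gene_name and collect_names are made up for this illustration, and the second test string is invented to cover the N/A case.

```r
x <- c(
  # shortened version of the example record from the post
  "ExemplarFor 'AB004048'; gene_id '18111'; transcript_id 'AB004048'; gene_name 'Nnat'; alt '5730414I02Rik|AW107673|Peg5'; neuronatin",
  # invented record to show the N/A case
  "gene_id '0'; transcript_id 'X'; gene_name 'N/A'; alt 'none';"
)

extract_gene_name <- function(x) {
  # [^']* captures everything up to the closing quote, so both "Nnat" and
  # "N/A" are matched; \\w* alone cannot match the "/" in "N/A".
  sub(".*gene_name '([^']*)';.*", "\\1", x, perl = TRUE)
}

collect_names <- function(x, so_far = character(0)) {
  # return(value) with parentheses; "return value" is a syntax error in R.
  return(c(so_far, extract_gene_name(x)))
}

all_protein <- unique(collect_names(x))
all_protein
```

Returning the vector and assigning it at the call site avoids the assign(..., envir = .GlobalEnv) side effect used in the original script.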
Re: [R] two questions
This sounds interesting, thank you. I'll have a look. jason Dr. Iasonas Lamprianou Assistant Professor (Educational Research and Evaluation) Department of Education Sciences European University-Cyprus P.O. Box 22006 1516 Nicosia Cyprus Tel.: +357-22-713178 Fax: +357-22-590539 Honorary Research Fellow Department of Education The University of Manchester Oxford Road, Manchester M13 9PL, UK Tel. 0044 161 275 3485 iasonas.lampria...@manchester.ac.uk --- On Wed, 8/9/10, Greg Snow greg.s...@imail.org wrote: From: Greg Snow greg.s...@imail.org Subject: RE: [R] two questions To: Iasonas Lamprianou lampria...@yahoo.com, juan xiong xiongjuan2...@gmail.com, Dennis Murphy djmu...@gmail.com Cc: r-help@r-project.org r-help@r-project.org Date: Wednesday, 8 September, 2010, 17:41 Have you considered doing a permutation test on the interaction? Here is an article that gives the general procedure for a couple of algorithms and a comparison of how well they do: Anderson, Marti J and Legendre, Pierre; An Empirical Comparison of Permutation Methods for Tests of Partial Regression Coefficients in a Linear Model. J. Statist. Comput. Simul., 1999, vol 62, pp. 271-303. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of Iasonas Lamprianou Sent: Tuesday, September 07, 2010 12:25 AM To: juan xiong; Dennis Murphy Cc: r-help@r-project.org Subject: Re: [R] two questions By the way, ordinal regression would require huge datasets because my dependent variable has around 20 different responses... but again, one might say that with so many ordinal responses, it is as if we have a linear/interval variable, right? I just hoped that there would be a two-way kruskal-wallis or something like that. 
On the other hand, what is going to happen if I (1) bootstrap data from all cells of my design and average the rank ordering of the data of every cell? And then (2) do the same but using data from a uniform/normal distribution so that I assume that there is no difference between the cells? From point (1) I will find the statistical value and from point (2) the expectation and then with a third step (3) I can run a chi-square on the observed/expected values. Would this be reasonable? But again, how can I distinguish between main and interaction effects? Dr. Iasonas Lamprianou Assistant Professor (Educational Research and Evaluation) Department of Education Sciences European University-Cyprus P.O. Box 22006 1516 Nicosia Cyprus Tel.: +357-22-713178 Fax: +357-22-590539 Honorary Research Fellow Department of Education The University of Manchester Oxford Road, Manchester M13 9PL, UK Tel. 0044 161 275 3485 iasonas.lampria...@manchester.ac.uk --- On Tue, 7/9/10, Dennis Murphy djmu...@gmail.com wrote: From: Dennis Murphy djmu...@gmail.com Subject: Re: [R] two questions To: juan xiong xiongjuan2...@gmail.com Cc: David Winsemius dwinsem...@comcast.net, r-help@r-project.org, Iasonas Lamprianou lampria...@yahoo.com Date: Tuesday, 7 September, 2010, 4:47 Hi: On Mon, Sep 6, 2010 at 5:26 PM, juan xiong xiongjuan2...@gmail.com wrote: Maybe Friedman test The Friedman test corresponds to randomized complete block designs, not general two-way classifications. David's advice is sound, but also investigate proportional odds models (e.g., lrm in Prof. Harrell's rms package) in case the 'usual' approach comes up short. It would be helpful to know the number of response categories and some idea of the number of cities-of-birth under study, though... HTH, Dennis On Mon, Sep 6, 2010 at 4:47 PM, David Winsemius dwinsem...@comcast.netwrote: The usual least-squares methods are fairly robust to departures from normality. 
Furthermore, it is the residuals that are assumed to be normally distributed (not the marginal distributions that you are probably looking at) , so it does not sound as though you have yet examined the data properly. Tell us what the descriptive stats (say the means, variance, 10th and 90th percentiles) are on the residuals within cells cross- classified by the gender and city-of-birth variables (say the means, variance, 10th and 90th percentiles). On Sep 6, 2010, at 4:34 PM, Iasonas Lamprianou wrote: Dear friends, two questions (1) does anyone know if there are any non-parametric equivalents of the two-way ANOVA in R? I have an ordinal non-normally distributed dependent variable and two factors (gender and city of birth). Normally, one would try a two-way anova
Re: [R] two questions
Have you considered doing a permutation test on the interaction? Here is an article that gives the general procedure for a couple of algorithms and a comparison of how well they do: Anderson, Marti J and Legendre, Pierre; An Empirical Comparison of Permutation Methods for Tests of Partial Regression Coefficients in a Linear Model. J. Statist. Comput. Simul., 1999, vol 62, pp. 271-303. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of Iasonas Lamprianou Sent: Tuesday, September 07, 2010 12:25 AM To: juan xiong; Dennis Murphy Cc: r-help@r-project.org Subject: Re: [R] two questions By the way, ordinal regression would require huge datasets because my dependent variable has around 20 different responses... but again, one might say that with so many ordinal responses, it is as if we have a linear/interval variable, right? I just hoped that there would be a two-way kruskal-wallis or something like that. On the other hand, what is going to happen if I (1) bootstrap data from all cells of my design and average the rank ordering of the data of every cell? And then (2) do the same but using data from a uniform/normal distribution so that I assume that there is no difference between the cells? From point (1) I will find the statistical value and from point (2) the expectation and then with a third step (3) I can run a chi-square on the observed/expected values. Would this be reasonable? But again, how can I distinguish between main and interaction effects? Dr. Iasonas Lamprianou Assistant Professor (Educational Research and Evaluation) Department of Education Sciences European University-Cyprus P.O. Box 22006 1516 Nicosia Cyprus Tel.: +357-22-713178 Fax: +357-22-590539 Honorary Research Fellow Department of Education The University of Manchester Oxford Road, Manchester M13 9PL, UK Tel. 
0044 161 275 3485 iasonas.lampria...@manchester.ac.uk --- On Tue, 7/9/10, Dennis Murphy djmu...@gmail.com wrote: From: Dennis Murphy djmu...@gmail.com Subject: Re: [R] two questions To: juan xiong xiongjuan2...@gmail.com Cc: David Winsemius dwinsem...@comcast.net, r-help@r-project.org, Iasonas Lamprianou lampria...@yahoo.com Date: Tuesday, 7 September, 2010, 4:47 Hi: On Mon, Sep 6, 2010 at 5:26 PM, juan xiong xiongjuan2...@gmail.com wrote: Maybe Friedman test The Friedman test corresponds to randomized complete block designs, not general two-way classifications. David's advice is sound, but also investigate proportional odds models (e.g., lrm in Prof. Harrell's rms package) in case the 'usual' approach comes up short. It would be helpful to know the number of response categories and some idea of the number of cities-of-birth under study, though... HTH, Dennis On Mon, Sep 6, 2010 at 4:47 PM, David Winsemius dwinsem...@comcast.netwrote: The usual least-squares methods are fairly robust to departures from normality. Furthermore, it is the residuals that are assumed to be normally distributed (not the marginal distributions that you are probably looking at) , so it does not sound as though you have yet examined the data properly. Tell us what the descriptive stats (say the means, variance, 10th and 90th percentiles) are on the residuals within cells cross- classified by the gender and city-of-birth variables (say the means, variance, 10th and 90th percentiles). On Sep 6, 2010, at 4:34 PM, Iasonas Lamprianou wrote: Dear friends, two questions (1) does anyone know if there are any non-parametric equivalents of the two-way ANOVA in R? I have an ordinal non-normally distributed dependent variable and two factors (gender and city of birth). Normally, one would try a two-way anova, but if R has any non-parametric equivalents, that might be great. There is an entire task view page on robust methods if you decide to press on with this quest. 
(2) Also, if the interaction of gender and city of birth is statistically significant, which post-hoc tests should I run? How many cities are we talking about? Thanks Jason Dr. Iasonas Lamprianou -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch
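A permutation test on the interaction, as suggested above, might be sketched as follows. The scheme used here permutes the residuals of the main-effects model and refits both models each time; it is one common choice and is not necessarily the algorithm the cited article recommends. The data are simulated stand-ins (2 genders, 7 cities, a response with 20 ordered values), and the function name interaction_F is hypothetical.

```r
set.seed(42)

# Simulated stand-in data: 10 observations per gender x city cell.
d <- expand.grid(rep = 1:10,
                 gender = c("F", "M"),
                 city = paste0("city", 1:7))
d$y <- sample(1:20, nrow(d), replace = TRUE)

# F statistic for the interaction: main-effects model vs. full model.
interaction_F <- function(y, data) {
  data$y <- y
  red  <- lm(y ~ gender + city, data = data)
  full <- lm(y ~ gender * city, data = data)
  anova(red, full)$F[2]
}

F_obs <- interaction_F(d$y, d)

# Permute the residuals of the reduced (no-interaction) model and add
# them back to its fitted values to generate data under the null.
reduced <- lm(y ~ gender + city, data = d)
fit0 <- fitted(reduced)
res0 <- resid(reduced)
F_perm <- replicate(499, interaction_F(fit0 + sample(res0), d))

# Proportion of permuted statistics at least as large as the observed one.
p_value <- (1 + sum(F_perm >= F_obs)) / (1 + length(F_perm))
p_value
```

The number of permutations (499) is arbitrary here; more permutations give a finer-grained p-value at the cost of run time.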
Re: [R] two questions
thanks for the response we are talking about 7 cities. If I run a two-way anova, I find the residuals skewed and non-normal. I'll try the rlm method and see what happens. Thanks to all of you for the support. Dr. Iasonas Lamprianou Assistant Professor (Educational Research and Evaluation) Department of Education Sciences European University-Cyprus P.O. Box 22006 1516 Nicosia Cyprus Tel.: +357-22-713178 Fax: +357-22-590539 Honorary Research Fellow Department of Education The University of Manchester Oxford Road, Manchester M13 9PL, UK Tel. 0044 161 275 3485 iasonas.lampria...@manchester.ac.uk --- On Tue, 7/9/10, Dennis Murphy djmu...@gmail.com wrote: From: Dennis Murphy djmu...@gmail.com Subject: Re: [R] two questions To: juan xiong xiongjuan2...@gmail.com Cc: David Winsemius dwinsem...@comcast.net, r-help@r-project.org, Iasonas Lamprianou lampria...@yahoo.com Date: Tuesday, 7 September, 2010, 4:47 Hi: On Mon, Sep 6, 2010 at 5:26 PM, juan xiong xiongjuan2...@gmail.com wrote: Maybe Friedman test The Friedman test corresponds to randomized complete block designs, not general two-way classifications. David's advice is sound, but also investigate proportional odds models (e.g., lrm in Prof. Harrell's rms package) in case the 'usual' approach comes up short. It would be helpful to know the number of response categories and some idea of the number of cities-of-birth under study, though... HTH, Dennis On Mon, Sep 6, 2010 at 4:47 PM, David Winsemius dwinsem...@comcast.netwrote: The usual least-squares methods are fairly robust to departures from normality. Furthermore, it is the residuals that are assumed to be normally distributed (not the marginal distributions that you are probably looking at) , so it does not sound as though you have yet examined the data properly. 
Tell us what the descriptive stats (say the means, variance, 10th and 90th percentiles) are on the residuals within cells cross-classified by the gender and city-of-birth variables (say the means, variance, 10th and 90th percentiles). On Sep 6, 2010, at 4:34 PM, Iasonas Lamprianou wrote: Dear friends, two questions (1) does anyone know if there are any non-parametric equivalents of the two-way ANOVA in R? I have an ordinal non-normally distributed dependent variable and two factors (gender and city of birth). Normally, one would try a two-way anova, but if R has any non-parametric equivalents, that might be great. There is an entire task view page on robust methods if you decide to press on with this quest. (2) Also, if the interaction of gender and city of birth is statistically significant, which post-hoc tests should I run? How many cities are we talking about? Thanks Jason Dr. Iasonas Lamprianou -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
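For the rlm approach mentioned at the top of this message, a minimal sketch with MASS::rlm() is shown below. The data are simulated stand-ins with 7 cities, and the comparison with lm() is only there to show where the robust fit differs.

```r
library(MASS)

set.seed(7)
# Simulated stand-in data: 2 genders x 7 cities, 10 observations per cell.
d <- expand.grid(rep = 1:10,
                 gender = c("F", "M"),
                 city = paste0("city", 1:7))
d$y <- sample(1:20, nrow(d), replace = TRUE)

fit_ls  <- lm(y ~ gender * city, data = d)    # ordinary least squares
fit_rob <- rlm(y ~ gender * city, data = d)   # Huber M-estimation by default

# Side-by-side coefficients from the two fits.
cbind(least_squares = coef(fit_ls), robust = coef(fit_rob))
```

Large differences between the two columns would suggest that a few observations are driving the least-squares fit.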
Re: [R] two questions
By the way, ordinal regression would require huge datasets because my dependent variable has around 20 different responses... but again, one might say that with so many ordinal responses, it is as if we have a linear/interval variable, right? I just hoped that there would be a two-way kruskal-wallis or something like that. On the other hand, what is going to happen if I (1) bootstrap data from all cells of my design and average the rank ordering of the data of every cell? And then (2) do the same but using data from a uniform/normal distribution so that I assume that there is no difference between the cells? From point (1) I will find the statistical value and from point (2) the expectation and then with a third step (3) I can run a chi-square on the observed/expected values. Would this be reasonable? But again, how can I distinguish between main and interaction effects? Dr. Iasonas Lamprianou Assistant Professor (Educational Research and Evaluation) Department of Education Sciences European University-Cyprus P.O. Box 22006 1516 Nicosia Cyprus Tel.: +357-22-713178 Fax: +357-22-590539 Honorary Research Fellow Department of Education The University of Manchester Oxford Road, Manchester M13 9PL, UK Tel. 0044 161 275 3485 iasonas.lampria...@manchester.ac.uk --- On Tue, 7/9/10, Dennis Murphy djmu...@gmail.com wrote: From: Dennis Murphy djmu...@gmail.com Subject: Re: [R] two questions To: juan xiong xiongjuan2...@gmail.com Cc: David Winsemius dwinsem...@comcast.net, r-help@r-project.org, Iasonas Lamprianou lampria...@yahoo.com Date: Tuesday, 7 September, 2010, 4:47 Hi: On Mon, Sep 6, 2010 at 5:26 PM, juan xiong xiongjuan2...@gmail.com wrote: Maybe Friedman test The Friedman test corresponds to randomized complete block designs, not general two-way classifications. David's advice is sound, but also investigate proportional odds models (e.g., lrm in Prof. Harrell's rms package) in case the 'usual' approach comes up short. 
It would be helpful to know the number of response categories and some idea of the number of cities-of-birth under study, though... HTH, Dennis On Mon, Sep 6, 2010 at 4:47 PM, David Winsemius dwinsem...@comcast.netwrote: The usual least-squares methods are fairly robust to departures from normality. Furthermore, it is the residuals that are assumed to be normally distributed (not the marginal distributions that you are probably looking at) , so it does not sound as though you have yet examined the data properly. Tell us what the descriptive stats (say the means, variance, 10th and 90th percentiles) are on the residuals within cells cross-classified by the gender and city-of-birth variables (say the means, variance, 10th and 90th percentiles). On Sep 6, 2010, at 4:34 PM, Iasonas Lamprianou wrote: Dear friends, two questions (1) does anyone know if there are any non-parametric equivalents of the two-way ANOVA in R? I have an ordinal non-normally distributed dependent variable and two factors (gender and city of birth). Normally, one would try a two-way anova, but if R has any non-parametric equivalents, that might be great. There is an entire task view page on robust methods if you decide to press on with this quest. (2) Also, if the interaction of gender and city of birth is statistically significant, which post-hoc tests should I run? How many cities are we talking about? Thanks Jason Dr. Iasonas Lamprianou -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
[[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] two questions
Hi: On Mon, Sep 6, 2010 at 5:26 PM, juan xiong xiongjuan2...@gmail.com wrote: Maybe Friedman test The Friedman test corresponds to randomized complete block designs, not general two-way classifications. David's advice is sound, but also investigate proportional odds models (e.g., lrm in Prof. Harrell's rms package) in case the 'usual' approach comes up short. It would be helpful to know the number of response categories and some idea of the number of cities-of-birth under study, though... HTH, Dennis On Mon, Sep 6, 2010 at 4:47 PM, David Winsemius dwinsem...@comcast.net wrote: The usual least-squares methods are fairly robust to departures from normality. Furthermore, it is the residuals that are assumed to be normally distributed (not the marginal distributions that you are probably looking at) , so it does not sound as though you have yet examined the data properly. Tell us what the descriptive stats (say the means, variance, 10th and 90th percentiles) are on the residuals within cells cross-classified by the gender and city-of-birth variables (say the means, variance, 10th and 90th percentiles). On Sep 6, 2010, at 4:34 PM, Iasonas Lamprianou wrote: Dear friends, two questions (1) does anyone know if there are any non-parametric equivalents of the two-way ANOVA in R? I have an ordinal non-normally distributed dependent variable and two factors (gender and city of birth). Normally, one would try a two-way anova, but if R has any non-parametric equivalents, that might be great. There is an entire task view page on robust methods if you decide to press on with this quest. (2) Also, if the interaction of gender and city of birth is statistically significant, which post-hoc tests should I run? How many cities are we talking about? Thanks Jason Dr. 
Iasonas Lamprianou -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] two questions
Dear friends, two questions (1) does anyone know if there are any non-parametric equivalents of the two-way ANOVA in R? I have an ordinal non-normally distributed dependent variable and two factors (gender and city of birth). Normally, one would try a two-way anova, but if R has any non-parametric equivalents, that might be great. (2) Also, if the interaction of gender and city of birth is statistically significant, which post-hoc tests should I run? Thanks Jason Dr. Iasonas Lamprianou Assistant Professor (Educational Research and Evaluation) Department of Education Sciences European University-Cyprus P.O. Box 22006 1516 Nicosia Cyprus Tel.: +357-22-713178 Fax: +357-22-590539 Honorary Research Fellow Department of Education The University of Manchester Oxford Road, Manchester M13 9PL, UK Tel. 0044 161 275 3485 iasonas.lampria...@manchester.ac.uk __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] two questions
The usual least-squares methods are fairly robust to departures from normality. Furthermore, it is the residuals that are assumed to be normally distributed (not the marginal distributions that you are probably looking at), so it does not sound as though you have yet examined the data properly. Tell us what the descriptive stats (say the means, variance, 10th and 90th percentiles) are on the residuals within cells cross-classified by the gender and city-of-birth variables.

On Sep 6, 2010, at 4:34 PM, Iasonas Lamprianou wrote:

> Dear friends, two questions
>
> (1) does anyone know if there are any non-parametric equivalents of the two-way ANOVA in R? I have an ordinal non-normally distributed dependent variable and two factors (gender and city of birth). Normally, one would try a two-way anova, but if R has any non-parametric equivalents, that might be great.

There is an entire task view page on robust methods if you decide to press on with this quest.

> (2) Also, if the interaction of gender and city of birth is statistically significant, which post-hoc tests should I run?

How many cities are we talking about?

> Thanks
> Jason
> Dr. Iasonas Lamprianou

-- David Winsemius, MD West Hartford, CT
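The per-cell residual summaries requested above could be computed roughly like this. It is a sketch on simulated stand-in data; the column names p10 and p90 are made up for the illustration.

```r
set.seed(1)
# Simulated stand-in data: 2 genders x 7 cities, 10 observations per cell.
d <- expand.grid(rep = 1:10,
                 gender = c("F", "M"),
                 city = paste0("city", 1:7))
d$y <- sample(1:20, nrow(d), replace = TRUE)

fit <- lm(y ~ gender * city, data = d)   # two-way ANOVA as a linear model
d$res <- resid(fit)

# Mean, variance, and 10th/90th percentiles of the residuals in each cell.
cell_stats <- aggregate(res ~ gender + city, data = d, FUN = function(r)
  c(mean = mean(r), var = var(r),
    p10 = unname(quantile(r, 0.10)),
    p90 = unname(quantile(r, 0.90))))
cell_stats <- do.call(data.frame, cell_stats)   # flatten the matrix column
cell_stats
```

With the interaction in the model, the cell means of the residuals are zero by construction, so the variances and percentiles are the informative columns.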
Re: [R] two questions
Maybe the Friedman test.
[R] Two questions on R and cairo/Cairo
On a headless Linux server running R 2.9.2 I would like to enable support for cairo, but capabilities("cairo") keeps giving me FALSE. Is what I am trying to do possible, or can this only be achieved at R build time? I do not have administrative rights on this server.

After compiling cairo-1.8.10 and its dependencies from source in my home directory, I finally managed to install the Cairo package (capital C). Although capabilities("cairo") still gives me FALSE, I am now able to produce graphics. Apart from convenience, is there any difference between using the built-in support for cairo and using the Cairo package?

Carsten
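For what it is worth, the situation Carsten describes can be handled in a script along these lines. This is a sketch under assumptions: `open_png` is a hypothetical helper name, and it presumes that `capabilities("cairo")` only reflects how the R binary itself was built, which matches what the post reports.

```r
# TRUE only if this R binary has built-in cairo support; as the post
# observes, installing the Cairo package does not change this value.
builtin_cairo <- isTRUE(unname(capabilities("cairo")))

# Hypothetical helper: choose a PNG device based on what is available.
open_png <- function(file, width = 480, height = 480) {
  if (builtin_cairo) {
    png(file, width = width, height = height, type = "cairo")
  } else if (requireNamespace("Cairo", quietly = TRUE)) {
    Cairo::CairoPNG(file, width = width, height = height)
  } else {
    stop("neither built-in cairo support nor the Cairo package is available")
  }
}
```

A script would then call `open_png("plot.png")`, draw the plot, and finish with `dev.off()`, without caring which of the two routes was taken.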
Re: [R] two questions about PLOT
Thanks for your reply. I have tried rseek.org, but I still have some problems. When I add axis(4) and axis(1,at=1:6,labels=gradeinfo$gradenam), the old ticks and labels are still there, as shown in the figure. How can I delete them (the old tick information on the x-axis and the left y-axis)? My script is shown below:

plot(avegrp,type='l',lty=1,col='black',lwd=4,xlab=xxlab,ylab=yylab,labels=TRUE)
axis(2)
axis(1,at=1:6,labels=gradeinfo$gradenam, tick=FALSE)
par(new=T)
plot(sdgrp,type='l',lty=3,col='red1',xlab=xxlab,ylab=yylab,lwd=4,labels=TRUE)
axis(1,at=1:6,labels=gradeinfo$gradenam, tick=FALSE)
#axis(2, col = "gold", lty = 2, lwd = 0.5)
axis(4)
legend("topright", legend, lty=llty, lwd=llwd,col =lgcol)

--
TANG Jie
Email: totang...@gmail.com
Tel: 0086-2154896104
Shanghai Typhoon Institute, China
Re: [R] two questions about PLOT
It may not be the nicest solution, but my suggestion should work. Have you tried plot(type="n",...), plotting the axes with axis(), and plotting the data with lines()?

Ivan

--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php
Re: [R] two questions about PLOT
On 06/01/2010 12:44 AM, Jie TANG wrote:
> here, I want to plot two lines in one figure. But I have two problems
> 1) how to move one of the y-axis to be the right? I tried the command axis(2), but I failed.
> 2) how to add the axis information correctly. Since I have used the command axis(1,at=1:6,labels=gradeinfo$gradenam) but it seems that the correct information that I want is superposed on the old axis information. What can I do?

Hi Jie,

First problem:

# put this just after the jpeg device is opened
# to leave extra space on the right
par(mar=c(5,4,4,4))
# then just before dev.off
axis(4,...)

Second problem:

# don't display the default x axis
plot(avegrp,type='l',lty=1,col='black',lwd=4,
     xlab=xxlab,ylab=yylab,xaxt="n")
# then display your custom axis

Jim
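Jim's two hints can be put together in one runnable sketch. The data vectors and class labels below are made up to stand in for `avegrp`, `sdgrp` and `gradeinfo$gradenam`, the helper name `draw_two_axes` is hypothetical, and a `pdf()` device on a temporary file replaces the original `jpeg()` call only so that the example runs anywhere.

```r
# Made-up stand-ins for avegrp, sdgrp and gradeinfo$gradenam.
avegrp   <- c(120, 110, 95, 90, 80, 75)
sdgrp    <- c(60, 52, 50, 41, 38, 30)
gradenam <- c("c1", "c2", "c3", "c4", "c5", "c6")

draw_two_axes <- function(avg, sdv, labs) {
  old <- par(mar = c(5, 4, 4, 4), las = 1)   # extra room on the right
  on.exit(par(old))
  # First series: suppress the default x axis, keep the left y axis.
  plot(avg, type = "l", lty = 1, lwd = 4, col = "black",
       xaxt = "n", xlab = "typhoon class", ylab = "forecast error")
  axis(1, at = seq_along(labs), labels = labs)
  # Second series on its own scale: draw no axes and no labels at all,
  # so nothing is printed on top of the first plot's annotation.
  par(new = TRUE)
  plot(sdv, type = "l", lty = 3, lwd = 4, col = "red1",
       axes = FALSE, xlab = "", ylab = "")
  axis(4)                                    # right-hand y axis
  legend("topright", legend = c("series 1", "series 2"),
         lty = c(1, 3), lwd = 4, col = c("black", "red1"))
  invisible(range(sdv))
}

out_file <- tempfile(fileext = ".pdf")
pdf(out_file)
draw_two_axes(avegrp, sdgrp, gradenam)
dev.off()
```

The key point is that the second `plot()` call is given `axes = FALSE` and empty labels; the leftover ticks in the follow-up question come from letting that call draw its own default axes.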
[R] two questions about PLOT
Here, I want to plot two lines in one figure, but I have two problems:

1) How do I move one of the y-axes to the right? I tried the command axis(2), but I failed.
2) How do I add the axis information correctly? I used the command axis(1,at=1:6,labels=gradeinfo$gradenam), but the information I want is printed on top of the old axis information. What can I do?

The script and figure are shown below. Thanks :)

outflnm <- paste(Outdic,"meansd.jpg",sep="/")
jpeg(file=outflnm, bg="transparent")
legend <- c("average error","stand quare error")
lgcol <- c("black","red1")
par(las=1)
yylab <- c("forecast error")
xxlab <- c("typhoon class")
llty <- c(1,3)
llwd <- c(4,4)
#par(bg='yellow')
plot(avegrp,type='l',lty=1,col='black',lwd=4,xlab=xxlab,ylab=yylab)
par(new=T)
plot(sdgrp,type='l',lty=3,col='red1',xlab=xxlab,ylab=yylab,lwd=4)
#axis(2, col = "gold", lty = 2, lwd = 0.5)
legend("topright", legend, lty=llty, lwd=llwd,col =lgcol)
axis(1,at=1:6,labels=gradeinfo$gradenam)
dev.off()

--
TANG Jie
Email: totang...@gmail.com
Tel: 0086-2154896104
Shanghai Typhoon Institute, China
Re: [R] two questions about PLOT
Hi,

Not sure it is the best solution, but I would create the layout of the plot part by part:

plot(..., type="n")   # sets up the plot without drawing the data
axis(1, at=1:6,...)   # set the x-axis at the bottom
axis(4,...)           # set the y-axis on the right. I'm not sure that's what you were looking for, didn't really understand it
lines(avegrp,...)     # plot your data

And do not forget to provide sample data!

HTH
Ivan

--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php
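Ivan's part-by-part layout might look like this when written out in full. It is a sketch on made-up data, with both series drawn on one shared scale; adding `axes = FALSE` to the empty `plot()` call is my assumption about what is needed so that no default axes appear, and the `pdf()` device on a temporary file is only there to make the example self-contained.

```r
# Made-up stand-ins for avegrp, sdgrp and gradeinfo$gradenam.
avegrp   <- c(120, 110, 95, 90, 80, 75)
sdgrp    <- c(60, 52, 50, 41, 38, 30)
gradenam <- c("c1", "c2", "c3", "c4", "c5", "c6")

layout_file <- tempfile(fileext = ".pdf")
pdf(layout_file)
par(mar = c(5, 2, 4, 4), las = 1)

# 1. Empty plot: sets the coordinate system, draws neither data nor axes.
y_range <- range(c(avegrp, sdgrp))
plot(seq_along(avegrp), avegrp, type = "n", axes = FALSE,
     ylim = y_range, xlab = "typhoon class", ylab = "")

# 2. Axes, added one at a time.
axis(1, at = seq_along(gradenam), labels = gradenam)  # bottom
axis(4)                                               # right
box()

# 3. Data.
lines(avegrp, lty = 1, lwd = 4, col = "black")
lines(sdgrp,  lty = 3, lwd = 4, col = "red1")
dev.off()
```

Because everything after the empty plot is added explicitly, there is no default annotation to remove afterwards.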
Re: [R] two questions about PLOT
I would vote this question one of the most often asked questions here on the list ;-). Try searching the help archive (www.rseek.org) and you will find solutions. I would guess that you need to use something like:

axis(4)

as the sides of the plot are always numbered from bottom, left, top, right.

HTH
Jannis
Re: [R] Two Questions on R (call by reference and pre-compilation)
Thank you all!
Re: [R] Two Questions on R (call by reference and pre-compilation)
As far as large data sets go, I've just discovered the readLines and writeLines functions. I'm using them now to read in single rows, calculate things on them, and then write a single row to a file.
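A minimal sketch of that row-at-a-time pattern, assuming a comma-separated input file; the temporary file names and the row-sum calculation are made up for illustration.

```r
# Made-up input: three comma-separated rows in a temporary file.
in_path  <- tempfile(fileext = ".csv")
out_path <- tempfile(fileext = ".txt")
writeLines(c("1,2,3", "4,5,6", "7,8,9"), in_path)

# Open both files as connections so each readLines() call continues
# where the previous one stopped instead of starting from the top.
in_con  <- file(in_path, open = "r")
out_con <- file(out_path, open = "w")
repeat {
  line <- readLines(in_con, n = 1)
  if (length(line) == 0) break              # end of file
  values <- as.numeric(strsplit(line, ",", fixed = TRUE)[[1]])
  writeLines(as.character(sum(values)), out_con)
}
close(in_con)
close(out_con)

row_sums <- readLines(out_path)
```

Only one row is held in memory at a time, which is the point when the whole file would not fit.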
[R] Two Questions on R (call by reference and pre-compilation)
Hi All,

I have two questions on R. Could you please explain them to me? Thank you!

1) When calling a function, R typically copies the values to the formal arguments (call by value). This is very costly if I would like to pass a huge data set to a function. Are there any situations in which R doesn't copy the data, besides passing the data in an environment object?

2) Does R pre-compile the objective function to binary when running optim? In my experience, R's optim is much slower than the MATLAB fmincon function. I don't know whether MATLAB does any pre-compilation of the script for the objective function. But perhaps we can increase R's performance by some sort of pre-compilation at run time.

Thanks in advance.

Best Regards,
Ruihong
Re: [R] Two Questions on R (call by reference and pre-compilation)
Hi,

On Tue, May 4, 2010 at 5:05 PM, Ruihong Huang ruihong.hu...@wiwi.hu-berlin.de wrote:
> Hi All,
>
> I have two questions on R. Could you please explain them to me? Thank you!
>
> 1) When calling a function, R typically copies the values to the formal arguments (call by value).

This is technically incorrect. As far as I know, R has copy-on-write semantics: it will only make a copy of the passed-in object if you modify it within your function.

> This is very costly if I would like to pass a huge data set to a function. Are there any situations in which R doesn't copy the data, besides passing the data in an environment object?

This question comes up quite often; you could try searching the archives to get more info about that (using gmane might be helpful). Check out this SO thread as well:

http://stackoverflow.com/questions/2603184/r-pass-by-reference

> 2) Does R pre-compile the objective function to binary when running optim? In my experience, R's optim is much slower than the MATLAB fmincon function. I don't know whether MATLAB does any pre-compilation of the script for the objective function. But perhaps we can increase R's performance by some sort of pre-compilation at run time.

If I had to guess, I'd guess that it doesn't, but let's see what the gurus say ...

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
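The behaviour Steve describes can be seen in a small experiment. The function and object names here are made up; the sketch only shows the visible semantics (the caller's object is untouched, an environment is shared), not when exactly R performs the internal copy.

```r
# A function that modifies its argument works on its own copy.
bump_first <- function(v) {
  v[1] <- 99
  v
}

x <- c(1, 2, 3)
y <- bump_first(x)
# x is unchanged here; only y carries the modification.

# An environment is not copied: changes made inside the function
# are visible to the caller afterwards.
bump_env <- function(e) {
  e$v[1] <- 99
  invisible(NULL)
}

store <- new.env()
store$v <- c(1, 2, 3)
bump_env(store)
# store$v now starts with 99.
```

A function that only reads a large argument therefore does not pay for a copy; where R was built with memory profiling, `tracemem()` can be used to watch when a copy is actually made.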
Re: [R] Two Questions on R (call by reference and pre-compilation)
On 04/05/2010 5:05 PM, Ruihong Huang wrote:
> Hi All,
>
> I have two questions on R. Could you please explain them to me? Thank you!
>
> 1) When calling a function, R typically copies the values to the formal arguments (call by value). This is very costly if I would like to pass a huge data set to a function. Are there any situations in which R doesn't copy the data, besides passing the data in an environment object?

R doesn't copy data unless it needs to, for example if your function modifies its copy. So don't worry about the cost; there usually isn't much of one.

> 2) Does R pre-compile the objective function to binary when running optim? In my experience, R's optim is much slower than the MATLAB fmincon function. I don't know whether MATLAB does any pre-compilation of the script for the objective function. But perhaps we can increase R's performance by some sort of pre-compilation at run time.

There's an experimental compiler, but I don't know if there's a predicted release date for it. R is not an easy language to compile.

Duncan Murdoch

> Thanks in advance.
>
> Best Regards,
> Ruihong
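For readers on later versions: a byte-code compiler has shipped with R as the `compiler` package since R 2.13.0, and just-in-time compilation has been on by default since R 3.4.0. The sketch below assumes such a version; the Rosenbrock objective is the one used on the `optim()` help page, not the poster's function, and byte code is not native binary code, so any speed-up is likely to be modest.

```r
library(compiler)

# Example objective (Rosenbrock function, as on the optim() help page).
rosenbrock <- function(p) {
  100 * (p[2] - p[1]^2)^2 + (1 - p[1])^2
}

# Byte-compiled version of the same function.
rosenbrock_c <- cmpfun(rosenbrock)

# optim() accepts either version and should reach the same answer.
fit_plain    <- optim(c(-1.2, 1), rosenbrock,   method = "BFGS")
fit_compiled <- optim(c(-1.2, 1), rosenbrock_c, method = "BFGS")
```

Wrapping each `optim()` call in `system.time()` is a simple way to check whether compilation makes any difference for a given objective.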
Re: [R] two questions for R beginners
For psychologists like me (possibly for others) by far the most time-consuming detail is variable labels. I need them for just about every analysis I do. We can use special packages like Hmisc and its function spss.get to import the labels, but then nearly all the other packages don't respect the labels, even simple things like list. So I find myself either adding them back in at every step or making my own versions of the functions. People coming from SPSS just expect the output of basic functions like factanal to display the labels, or at least to have the option of doing so. Respecting/preserving variable labels in more core functions would be an enormous help for social scientists IMHO.

What helped? Lots of things: r-seek and quick-R are my favourites, along with amazing people who reply to problems on r-help.
Re: [R] two questions for R beginners
On 03/26/2010 02:58 PM, Steve Powell wrote:
> For psychologists like me (possibly for others) by far the most time-consuming detail is variable labels. I need them for just about every analysis I do. We can use special packages like Hmisc and its function spss.get to import the labels, but then nearly all the other packages don't respect the labels, even simple things like list. So I find myself either adding them back in at every step or making my own versions of the functions. People coming from SPSS just expect the output of basic functions like factanal to display the labels, or at least to have the option of doing so. Respecting/preserving variable labels in more core functions would be an enormous help for social scientists IMHO.

Hi Steve,

From another psychologist: this is one reason that I have been rewriting a number of functions to read and display the variable.labels attribute produced by the read.spss function in the foreign package.

Jim
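As a rough illustration of what Jim describes, the sketch below reads the `variable.labels` attribute and uses it to relabel some output. The data frame is made up to mimic what `foreign::read.spss(file, to.data.frame = TRUE)` returns, and `var_labels` is a hypothetical helper, not one of Jim's functions.

```r
# Made-up stand-in for a data frame imported with read.spss().
survey <- data.frame(q1 = c(3, 4, 5, 2), q2 = c(1, 1, 2, 3))
attr(survey, "variable.labels") <- c(q1 = "Life satisfaction",
                                     q2 = "Number of siblings")

# Hypothetical helper: the label for each column, falling back to the
# column name when no (non-empty) label is stored.
var_labels <- function(df) {
  labs <- attr(df, "variable.labels")
  out <- names(df)
  names(out) <- names(df)
  if (!is.null(labs)) {
    hit <- intersect(names(df), names(labs)[nzchar(labs)])
    out[hit] <- labs[hit]
  }
  out
}

# Use the labels instead of the short names in the printed output.
means <- colMeans(survey)
names(means) <- var_labels(survey)[names(means)]
print(means)
```

The same lookup can be applied to the row names of a loadings matrix or to axis titles, which is the kind of wrapper the message refers to.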
[R] Two questions, first about contingency tables, and second about table () and data.frame (), from a visually impaired user.
Hi all,

I want to make a contingency table in R. I want to tabulate two variables, one as the independent and the second as the dependent variable. The IV has two categories, namely birth complications and no birth complications. The frequency of the birth complication category is 50, and the frequency of the no birth complication category is 34. The categories and frequencies of the DV follow: schizophrenic 28, depressed 26, normal 30.

When I try to make a contingency table in R using table(name of variable one, name of variable two), I get an error that all arguments must have the same length. I believe that there would be two rows and three columns according to the categories of the IV and DV, but I guess R wants a third row for the IV. When I try my luck with data.frame(var1,var2), I receive the error "arguments imply differing number of rows". Any suggestion on how I can make a contingency table using the data above?

My second question is a result of my inability to see the screen. I want to know what the difference is between the tables you can make using table() and data.frame(). When I think of a table in my mind, I think of horizontal rows and vertical columns presenting data on different variables. But I am not sure what type of tables data.frame() prints on the screen and what type of tables table() prints on the screen, and which function I should use when I want to make the tables you are supposed to make in statistics.

Thank you all, and sorry for such basic questions.

Faiz.
Re: [R] Two questions, first about contingency tables, and second about table () and data.frame (), from a visually impaired user.
Oh, that's clearer now. You should use two vectors of equal length when using table. For example:

a <- c("a","b","a","b","a","b","a","b")
b <- c(1,1,1,1,2,2,2,2)
table(a,b)

Also have a look at ?as.table and ?matrix

Does that help?

Tal

Contact me: tal.gal...@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)

On Sat, Mar 13, 2010 at 10:24 PM, Faiz Rasool fai...@gmail.com wrote:
> Hi Tal,
>
> I am typing:
>
> birth = c(50,34) # frequencies of the two categories of the independent variable.
> mental.health = c(28,26,30) # frequencies of the three categories of the dependent variable.
> table(birth,mental.health)
>
> I get the error "all arguments must have the same length". When I use data.frame(birth,mental.health) I get "arguments imply differing number of rows". There is no missing value, and the sample size is 84.
>
> Best regards,
> Faiz.
>
> ----- Original Message -----
> From: Tal Galili tal.gal...@gmail.com
> To: Faiz Rasool fai...@gmail.com
> Sent: Sunday, March 14, 2010 1:04 AM
> Subject: Re: [R] Two questions, first about contingency tables, and second about table () and data.frame (), from a visually impaired user.
>
>> Hi Faiz,
>>
>> Two ideas:
>> 1) do you have any NA's?
>> 2) can you send an example of the code/file? That might help people (like me) to answer.
>>
>> Tal
Re: [R] Two questions, first about contingency tables, and second about table () and data.frame (), from a visually impaired user.
Dear Faiz, I believe that your basic issue is that you are trying to use frequencies directly. table() needs all the arguments to be of the same length, because it counts the frequencies from raw data. So for two variables, you need pairs of scores indicating whether there was a birth complication or not and the category of mental health. Coding birth as 0,1 for whether a complication is present (1) or absent (0) Coding mental health as 1=Normal, 2=Depressed, 3=Schizophrenic Here is an example: --- birth - c(rep(1,50),rep(0,34)) # using rep() to create 50 1s and 34 0s, etc. birth - factor(birth, levels=c(1,0), labels=c(Birth Complications,No Birth Complications)) mental.health - c(rep(1,30),rep(2,26),rep(3,28)) mental.health - factor(mental.health, levels=c(1,2,3), labels=c(Normal,Depressed,Schizophrenic)) table(mental.health,birth) --- Which should give you: birth mental.health Birth Complications No Birth Complications Normal 30 0 Depressed 20 6 Schizophrenic 0 28 You do not need to use factor(), the table would still work, they just wouldn't have the nice names. You can also put these variables into a data frame ### data.frame(birth,mental.health) ### You might also lookup ?xtabs It is useful for contigency tables, particularly when you have more than 2 variables. I hope that is clear. Best regards, Josh On Sat, Mar 13, 2010 at 11:58 AM, Faiz Rasool fai...@gmail.com wrote: Hi all, I want to make a contingency table in R. I want to tabulate two variables, one as the independent and second as the dependent variable. The IV has two categories, namely, birth complications, and no birth complications. The frequency of birth complication category is fifty, and the frequency of no birth complication category is 34. The categories and frequencies of DV follows. Schizophrenic 28, depressed 26, normal 30. 
When I try to make a contingency table in R using table(name of variable one, name of variable two), I get an error that all arguments must have the same length. I believe there would be two rows and three columns according to the categories of the IV and DV, but I guess R wants a third row for the IV. When I try my luck with data.frame(var1, var2), I receive the error "arguments imply differing number of rows". Any suggestion on how I can make a contingency table using the data above?

My second question is a result of my inability to see the screen. I want to know what the difference is between the tables you can make using table() and data.frame(). When I think of a table in my mind, I think of horizontal rows and vertical columns presenting data on different variables. But I am not sure what type of table data.frame() prints on the screen and what type of table table() prints, and which function I should use when I want to make the tables you are supposed to make in statistics.

Thank you all, and sorry for such basic questions.
faiz.

--
Joshua Wiley
Senior in Psychology
University of California, Riverside
http://www.joshuawiley.com/
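If the cell counts are already known, the raw 0/1 vectors are not strictly needed. The following is only a sketch: the cell counts are copied from the example table in the reply (they are illustrative, since the original question gives only the marginal totals), and the object names counts and tab are made up here. It uses xtabs(), which the reply mentions, and also touches on the second question: a table holds one count per combination of categories, while a data frame holds one row per observation (or per cell) with one column per variable.

```r
# Hypothetical cell counts, copied from the example table in the reply.
counts <- data.frame(
  mental.health = factor(rep(c("Normal", "Depressed", "Schizophrenic"), times = 2),
                         levels = c("Normal", "Depressed", "Schizophrenic")),
  birth = factor(rep(c("Birth Complications", "No Birth Complications"), each = 3),
                 levels = c("Birth Complications", "No Birth Complications")),
  Freq = c(30, 20, 0, 0, 6, 28)
)

# xtabs() sums Freq within each combination of the two factors.
tab <- xtabs(Freq ~ mental.health + birth, data = counts)
print(tab)

# Marginal totals recover the frequencies given in the original question.
margin.table(tab, 1)  # by mental health: 30, 26, 28
margin.table(tab, 2)  # by birth status: 50, 34

# as.data.frame() turns the table back into one row per cell.
as.data.frame(tab)
```

For a screen-reader user, as.data.frame(tab) may be easier to follow than the printed table, because each line then names both categories next to the count.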
Re: [R] two questions for R beginners
To me, as a biologist recycled to biostats who had always worked with Excel and then SPSS, moving to R was difficult (and still is, since I am still learning). Being self-taught, I learn R by looking for examples on Google, which often takes me to Rwiki or similar sites. I sometimes post questions, and most of the answers have been helpful, but sometimes the answers were too short or didn't give enough hints on how to proceed, and that has stopped me from asking again so as not to annoy the experts. I have not answered many questions from newbies, but when I have, I have tried to explain as much as I could. Sometimes I find it better not to answer at all than to give a short, vague answer. Please: examples, examples, examples!

What I found most difficult were the different data types, since I understand an Excel sheet as a data frame with columns and rows, and that's it. Then, as someone has already commented, the class, mode and str functions helped a lot. But for me, examples are the way to let people learn. From there I moved on to using loops, and I am still nervous when people suggest using *apply functions; I can't get down to using them! I find loops more logical, and can't see how to convert them to *apply.

Finally, I am not a Linux expert, and I cannot get round to installing and organising a proper R directory and keeping it updated. I once tried to use a package that needed the development version of R and was only prepared for R on Linux, but I couldn't keep the R-devel versions updated. Some more step-by-step guidance would help sometimes.

Thanks for a great tool!
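In the spirit of "examples, examples, examples", here is a small sketch of how a loop maps onto an *apply call. The list x and the names means_loop and means_apply are invented for illustration; the point is only that sapply() runs the same function on each element and collects the results, which is what the loop does by hand.

```r
# Made-up data: a list of numeric vectors of different lengths.
x <- list(a = 1:5, b = c(2, 4, 6), c = 10)

# Loop version: create the result vector first, then fill it in.
means_loop <- numeric(length(x))
names(means_loop) <- names(x)
for (i in seq_along(x)) {
  means_loop[i] <- mean(x[[i]])
}

# sapply version: apply mean() to each element and simplify to a vector.
means_apply <- sapply(x, mean)

means_loop
means_apply
```

Both versions give the same named vector; the loop body (here, mean) is simply what gets passed to sapply() as the function.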
Date: Tue, 2 Mar 2010 12:44:23 -0600
From: keo.orms...@gmail.com
To: landronim...@gmail.com
CC: r-help@r-project.org; pbu...@pburns.seanet.com
Subject: Re: [R] two questions for R beginners

Liviu Andronic wrote:

On Mon, Mar 1, 2010 at 11:49 PM, Liviu Andronic landronim...@gmail.com wrote:

On 3/1/10, Keo Ormsby keo.orms...@gmail.com wrote:

Perhaps my biggest problem was that I couldn't find (and still haven't seen) *absolute beginners* documents.

there was once a link posted on r-sig-teaching that would probably fit your needs, but I cannot find it now.

OK, I found it. Below is an excerpt of that r-sig-teaching e-mail.
Liviu

On Thu, Jul 2, 2009 at 2:19 PM, Robert W. Hayden hay...@mv.mv.com wrote:

I think such a website would be a real asset. It would be most useful if it either were restricted to intro. stats. OR organized so that materials for real beginners were easy to extract from all the materials for programmers and Ph.D. statisticians. As a relative beginner myself, I find the usual resources useless. In self defense, I created materials for my own beginning students: http://courses.statistics.com/software/R/Rhome.htm

Hi Liviu,

This is indeed the best introductory site I have seen, although it still assumes some things that might at first seem unintuitive to the absolute beginner I am talking about. For instance, the first page shows that you can do sqrt(x), where x can be a vector, and get back a vector of the square roots of each number. Although this is high school matrix algebra, most users expect the input to a square root function to be a single number, not a matrix, as in Excel or a calculator. Other concepts that are not explicitly introduced are the R workspace, the use of arguments in functions (with or without the =), etc. Others are things like diff(range(rainfall)), where the output of one function is used as the input to another, all in the same command line.
All these things seem very basic, but can be difficult if you are trying to learn on your own with no prior experience in programming. I hope I am not sounding too difficult and contrarian, I am just trying to share my experience with starting with R, and in trying to convey this learning to my colleagues and students. In the end, I did find everything I needed to learn, and now I feel at ease with R, and I believe that almost anybody that can use Excel or something like it, could learn R.

Thank you for the information,
Best wishes,
Keo.
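The three ideas mentioned in this message (vectorised functions, nested calls, and named versus positional arguments) can each be shown in a line or two. This is a sketch with made-up numbers; the rainfall values below are invented and are not the data from the course site.

```r
# 1. Vectorised functions: sqrt() works element by element on a whole vector.
x <- c(1, 4, 9, 16)
roots <- sqrt(x)          # 1 2 3 4

# 2. Nested calls: the output of one function is the input to the next.
rainfall <- c(12.5, 0, 3.2, 40.1, 7.8)   # made-up values
r <- range(rainfall)      # smallest and largest value: 0 and 40.1
spread_steps <- diff(r)   # largest minus smallest
spread_nested <- diff(range(rainfall))   # the same thing in one line

# 3. Arguments can be given by name (with =) or by position.
by_name <- round(x = 3.14159, digits = 2)
by_position <- round(3.14159, 2)
```

Writing the nested call out in two steps first, as with r and spread_steps, is often the easiest way to read a line such as diff(range(rainfall)).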
Re: [R] two questions for R beginners
Patrick,

1. Implicit intercepts. Implicit intercepts are not too bad for the main model, but they creep in occasionally in strange places where they might not be expected. For example, in some of the variance structures specified in lme, (~x) automatically expands to (~1+x). Venables said in the Exegeses paper: "For teaching purposes it would be useful to have a switch that required users to include the intercept term in formulae if it is needed. This would definitely help more students than it would hinder. In other words it should be possible to override the automatic intercept term."

2. Working with colors. There are a number of functions in R for working with colors, and since colors can be specified by palette number, name, hexadecimal string, values between 0 and 1, or values between 0 and 255, things can be confusing. One problem is that not all functions accept the same type of arguments or produce the same type of return values. For example, the awkward need of t and conversion to [0,255] in adding alpha levels to a color:

rgb(t(col2rgb(c("navy","maroon"))),alpha=120,max=255)

3. Factors. R tries to convert everything that it possibly can into a factor. Except, occasionally, it doesn't try. Further, after sub-setting data so that some factor levels have no data, too many functions fail. I shouldn't need to use drop.levels from the gdata package all over the place to keep automated scripts running smoothly. Let's not forget:

R> as.numeric(factor(c(NA,0,1)))
[1] NA  1  2

4.
is.list(list(1)[1])
[1] TRUE
is.matrix(matrix(1)[1,])
[1] FALSE
Ouch. Ouch. Ouch.

5. Most useful: apropos and Rseek.

Best,
Kevin

On Thu, Feb 25, 2010 at 11:31 AM, Patrick Burns pbu...@pburns.seanet.com wrote:

* What were your biggest misconceptions or stumbling blocks to getting up and running with R?
* What documents helped you the most in this initial phase?

I especially want to hear from people who are lazy and impatient. Feel free to write to me off-list.
Definitely write off-list if you are just confirming what has been said on-list.

--
Patrick Burns
pbu...@pburns.seanet.com
http://www.burns-stat.com
(home of 'The R Inferno' and 'A Guide for the Unwilling S User')

--
Kevin Wright
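For readers who hit the same stumbling blocks, the usual workarounds can be sketched in base R. This is an illustration, not part of the original message: the data are invented, and droplevels() is assumed to be available (it was added to base R after this thread, and plays the role of gdata's drop.levels here).

```r
# Factors: as.numeric() returns the internal level codes, not the labels.
f <- factor(c(NA, 0, 1))
codes  <- as.numeric(f)                 # NA 1 2
values <- as.numeric(as.character(f))   # NA 0 1

# Unused levels survive subsetting unless they are dropped explicitly.
g <- factor(c("a", "b", "c"))
sub <- g[g != "c"]
levels(sub)               # still "a" "b" "c"
levels(droplevels(sub))   # "a" "b"

# Indexing: [ on a list keeps a list; [ on a matrix drops to a vector
# unless drop = FALSE is given.
m <- matrix(1:6, nrow = 2)
is.list(list(1)[1])               # TRUE  (a sub-list)
is.list(list(1)[[1]])             # FALSE (the element itself)
is.matrix(m[1, ])                 # FALSE
is.matrix(m[1, , drop = FALSE])   # TRUE

# Intercepts: y ~ x includes one implicitly; y ~ 0 + x removes it.
d <- data.frame(x = 1:5, y = c(1.1, 2.0, 2.9, 4.2, 5.1))
with_int <- names(coef(lm(y ~ x, data = d)))
no_int   <- names(coef(lm(y ~ 0 + x, data = d)))

# Colors: the same call as in the message, with the argument name spelled out.
cols <- rgb(t(col2rgb(c("navy", "maroon"))), alpha = 120, maxColorValue = 255)
```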
Re: [R] two questions for R beginners
I think Duncan's example of a list that is a matrix is a compelling argument not to do the change. A matrix that is a list with both names and dimnames *is* probably rare (but certainly imaginable). A matrix that is a list is not so rare, and the proposed double meaning of '$' would certainly be confusing in that case.

Pat

On 02/03/2010 17:55, Duncan Murdoch wrote:

On 02/03/2010 11:53 AM, William Dunlap wrote:

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of John Sorkin
Sent: Tuesday, March 02, 2010 3:46 AM
To: Karl Ove Hufthammer; r-h...@stat.math.ethz.ch
Subject: Re: [R] two questions for R beginners

Please take what follows not as an ad hominem statement, but rather as an attempt to improve what is already an excellent program, that has been built as a result of many, many hours of dedicated work by many, many unpaid, unsung volunteers. It troubles me a bit that when a confusing aspect of R is pointed out the response is not to try to improve the language so as to avoid the confusion, but rather to state that the confusion is inherent in the language. I understand that to make changes that would avoid the confusing aspect of the language that has been discussed in this thread would take time and effort by an R wizard (which I am not), time and effort that would not be compensated in the traditional sense. This does not mean that we should not acknowledge the confusion. If we want R to be the de facto lingua franca of statistical analysis, doesn't it make sense to strive for syntax that is as straightforward and consistent as possible?

Whenever one changes the language that way old code will break.

I think in this case not much code would break. Mostly when people have a matrix M and ask for M$column they'll get an error; the proposal is that they'll get the requested column. (It is possible to have a list with names that is also a matrix with dimnames, but I think that is a pretty unusual construction.)
But I haven't been convinced that the proposal is a net improvement to the language.

Duncan Murdoch

The developers can, with a lot of effort, fix their own code, and perhaps even user-written code on CRAN, but code that thousands of users have written will break. There is a lot of code out there that was written by trial and error and by folks who no longer work at an institution: the code works but no one knows exactly why it works. Telling folks they need to change that code because we have a cleaner but different syntax now is not good. Why would one spend time writing a package that might stop working when R is upgraded? I think the solution is not to change current semantics but to write functions that behave better and encourage users to use them, gradually abandoning the old constructs.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

Again, please understand that my comment is made with deepest respect for the many people who have unselfishly contributed to the R project. Many thanks to each and every one of you.
John

Karl Ove Hufthammer k...@huftis.org 3/2/2010 4:00 AM

On Mon, 01 Mar 2010 10:00:07 -0500 Duncan Murdoch murd...@stats.uwo.ca wrote:

Suppose X is a dataframe or a matrix. What would you expect to get from X[1]? What about as.vector(X), or as.numeric(X)?

All this of course depends on the type of object one is speaking of. There are plenty of surprises available, and it's best to use the most logical way of extracting. E.g., to extract the top-left element of a 2D structure (data frame or matrix), use 'X[1,1]'. Luckily, R provides some shortcuts. For example, you can write 'X[2,3]' on a data frame, just as if it was a matrix, even though the underlying structure is completely different. (This doesn't work on a normal list; there you have to type the whole 'X[[3]][2]'.) The behaviour of the 'as.' functions may sometimes be surprising, at least for me.
For example, 'as.data.frame' on a named vector gives a single-column data frame, instead of a single-row data frame. (I'm not sure what the recommended way of converting a named vector to a row data frame is, but 'as.data.frame(t(X))' works, even though both 'X' and 't(X)' look like a row of numbers.)

The point is that a dataframe is a list, and a matrix isn't. If users don't understand that, then they'll be confused somewhere. Making matrices more list-like in one respect will just move the confusion elsewhere. The solution is to understand the difference.

My main problem is not understanding the difference, which is easy, but knowing which type of object I have when I get the output of a function in a package. If I know the object is a named vector or a matrix with column names, it's easy enough to type 'X[,colname]', and if it's a data frame one may use the shortcut 'X$colname'. Usually, it *is* documented what the return value of a function is, but just looking at the output is much
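The distinctions discussed in this message can be checked directly at the prompt. The following is a sketch with invented objects (df, m, lm2 and v are names made up here), covering the list/matrix difference, a matrix that is also a list, and the as.data.frame() behaviour on a named vector.

```r
df <- data.frame(a = 1:3, b = c(2.5, 3.5, 4.5))
m  <- as.matrix(df)

# A data frame is a list of columns; an ordinary matrix is not a list.
is.list(df)   # TRUE
is.list(m)    # FALSE

# Column extraction: $ works on the data frame, [ , "name"] on both.
col_df <- df$a
col_m  <- m[, "a"]
msg <- tryCatch(m$a, error = function(e) conditionMessage(e))  # $ fails on m

# A matrix can nevertheless be a list (a list with a dim attribute).
lm2 <- matrix(list(1, "a", TRUE, 2:3), nrow = 2)
c(is.list(lm2), is.matrix(lm2))   # TRUE TRUE

# A named vector becomes one column, its transpose becomes one row.
v <- c(x = 1, y = 2)
dim(as.data.frame(v))      # 2 rows, 1 column
dim(as.data.frame(t(v)))   # 1 row, 2 columns
```

When it is unclear what a package function has returned, class() and str() on the result answer the question without consulting the documentation.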
Re: [R] two questions for R beginners
John Sorkin jsor...@grecc.umaryland.edu wrote on 01.03.2010 15:19:10:

If it looks like a duck and quacks like a duck, it ought to behave like a duck. To the user a matrix and a dataframe look alike . . . except a dataframe can

Well, a matrix looks like a data.frame only at first sight.

mat <- matrix(1:12, 3, 4)
dat <- as.data.frame(mat)
str(dat)
'data.frame': 3 obs. of 4 variables:
 $ V1: int 1 2 3
 $ V2: int 4 5 6
 $ V3: int 7 8 9
 $ V4: int 10 11 12
str(mat)
 int [1:3, 1:4] 1 2 3 4 5 6 7 8 9 10 ...

These look pretty different to me.

Regards
Petr

hold non-numeric values. Thus to the users, a matrix looks like a special case of a DF, or perhaps conversely. If you can address elements of one structure using a given syntax, you should be able to address elements of the other structure using the same syntax. To do otherwise leads to confusion and is counter intuitive.
John

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

Petr PIKAL petr.pi...@precheza.cz 3/1/2010 8:57 AM

Hi

r-help-boun...@r-project.org wrote on 01.03.2010 13:03:24:

snip

I understand that a 2-dimensional rectangular matrix looks quite similar to a data frame; however, it is only a vector with dimensions. As such it can have items of only one type (numeric, character, ...). And you can easily change the dimensions of a matrix.

matrix <- 1:12
dim(matrix) <- c(2,6)
matrix
dim(matrix) <- c(2,2,3)
matrix
dim(matrix) <- NULL
matrix

So the rectangular structure of a printed matrix is a kind of coincidence only, whereas the rectangular structure of a data frame is its main feature.

Regards
Petr

--
Karl Ove Hufthammer

Petr, I think that could be confusing!
The way I see it is that a matrix is a special case of an array, whose dimension attribute is of length 2 (number of rows, number of columns); and "row" and "column" refer to the rectangular display which you see when R prints the matrix. And this, of course, derives directly from the historic rectangular view of a matrix when written down. When you went from dim(matrix) <- c(2,6) to dim(matrix) <- c(2,2,3) you stripped it of its special title of "matrix" and cast it out into the motley mob of arrays (some of whom are matrices, but "matrix" no longer is). So the rectangular structure of a printed matrix is not a coincidence, but is its main feature!

Ok. Point taken. However, I feel that the possibility of manipulating matrix/array dimensions by simply changing them, as I showed above, together with perceiving a matrix as a **vector with dimensions**, prevented me, especially in the early days, from using matrices instead of data frames and vice versa. Consider the confusing results of cbind and rbind for vectors of unequal mode. Far too often we see something like this

cbind(1:2,letters[1:2])
     [,1] [,2]
[1,] "1"  "a"
[2,] "2"  "b"

instead of

data.frame(1:2,letters[1:2])
  X1.2 letters.1.2.
1    1            a
2    2            b

and then a question about why the result does not behave as expected. Each type of object has some features which are good for some types of manipulation/analysis/plotting but quite detrimental for others.

Regards
Petr

To come back to Karl's query about why $ works for a dataframe but not for a matrix, note that $ is the extractor for getting a named component of a list. So, Karl, when you did d=head(iris[1:4]) you created a dataframe:

str(d)
# 'data.frame': 6 obs. of 4 variables:
# $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4
# $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9
# $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7
# $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4

(with named components Sepal.Length, ... , Petal.Width), and a dataframe is a special case of a general list. In a general list, the separate components can each be anything.
In a dataframe, each component is a vector; the different vectors may be of different types (logical, numeric, ... ) but of course the elements of any single vector must be of the same type; and, in a dataframe, all the vectors must have the same length (otherwise it is a general list, not a dataframe). So, when you print a dataframe, R chooses to display it as a rectangular structure. On the other hand, when you print a general list, R displays it quite differently:

d
#   Sepal.Length Sepal.Width Petal.Length Petal.Width
# 1          5.1         3.5          1.4         0.2
# 2          4.9         3.0          1.4         0.2
# 3          4.7         3.2          1.3         0.2
# 4          4.6         3.1
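The "vector with dimensions" view and the cbind() surprise described in this message can be verified with a few checks. This is a sketch using the same values as the message; the names x, cb and df are chosen here only for illustration.

```r
# A matrix is a vector plus a dim attribute of length 2.
x <- 1:12
dim(x) <- c(2, 6)
was_matrix <- is.matrix(x)                       # TRUE
dim(x) <- c(2, 2, 3)
was_array <- c(is.matrix(x), is.array(x))        # FALSE TRUE
dim(x) <- NULL
back_to_vector <- is.null(dim(x))                # TRUE

# cbind() builds a matrix, so mixed inputs are coerced to one type.
cb <- cbind(1:2, letters[1:2])
typeof(cb)          # "character": the numbers became strings

# data.frame() keeps each column's own type.
df <- data.frame(n = 1:2, l = letters[1:2])
class(df$n)         # "integer"
```

Checking typeof() or str() right after a cbind() call is a quick way to catch numbers that have silently turned into character strings.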
Re: [R] two questions for R beginners
Petr,

On the other hand . . .

mat <- matrix(1:12, 3, 4)
dat <- as.data.frame(mat)
mat
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
dat
  V1 V2 V3 V4
1  1  4  7 10
2  2  5  8 11
3  3  6  9 12

What you are demonstrating by your example is the manner in which the data are organized deep in the guts of R, not the way people, especially R beginners, visualize objects in their mind. When I think of the integer sixty-nine, I visualize 69, not 1000101, despite the fact that 69, as an integer, is represented in the computer as 1000101.

John

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
Re: [R] two questions for R beginners
Hi

that is why I consider a matrix to be just a vector with dimensions and a data.frame to be a rectangular structure similar to an Excel table. That saved me a lot of surprises. But I must admit I am not a real beginner nowadays, although I still learn when using R, reading the help list and sometimes trying to help others.

Regards
Petr
Re: [R] two questions for R beginners
If R made matrix$columnName mean the same as matrix[, "columnName"] (a vector) so matrices looked more like data.frames, would we also want the following to work as they do with data.frames?

   with(matrix, log(columnName)) # log of that column as a vector
   matrix["columnName"] # 1-column matrix
   matrix[["columnName"]] # vector equivalent of that 1-column matrix
   lm(responseColumn~predictorColumn, data=matrix)
   eval(quote(columnName), envir=matrix)

The last 2 bump into the rule allowing envir to be a frame number (since a 1x1 matrix is currently taken as the frame number now).

Perhaps the print methods for data.frame and matrix should announce the class of the object being printed.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
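For reference, this is how the constructs in that list behave today on a data frame, next to what a matrix currently offers. It is a sketch with invented data; df, m and the column names x and y are placeholders, not objects from the thread.

```r
df <- data.frame(x = c(1, 2, 3, 4), y = c(2.1, 3.9, 6.2, 7.8))

# Each of these works because a data frame is a list of named columns.
logs     <- with(df, log(x))            # log of column x as a vector
one_col  <- df["x"]                     # 1-column data frame
as_vec   <- df[["x"]]                   # plain vector
fit      <- lm(y ~ x, data = df)        # columns found by name
by_eval  <- eval(quote(x), envir = df)  # same lookup, done by hand

# The matrix equivalents need explicit row/column indexing.
m <- as.matrix(df)
m_vec     <- m[, "x"]                   # vector
m_one_col <- m[, "x", drop = FALSE]     # 1-column matrix
m_by_name <- m["x"]                     # NA: elements of m have no names
```

The last line shows one reason the proposal is not trivial: single-bracket indexing with a name already has a meaning for matrices (vector-style lookup by element name), and it differs from the data frame meaning.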
Re: [R] two questions for R beginners
Bill,

The points you make are well taken; one needs to know when to stop. I would suggest standardizing the methods used to refer to elements of a matrix and a dataframe and going no further.

Why do I say this? A beginner, or even a more experienced R user, probably envisions a dataframe and a matrix as having the same structure, but not the same contents. Both appear to be multi-dimensional structures that can store data, albeit data of different types. A matrix stores numerical values, a dataframe stores data of mixed types. This being the case, it makes sense to assume that A %*% B will work when A and B are matrices, but C %*% D will not work when C and D are dataframes. This is quite logical and intuitive. It is an extension of the truism that one can perform the arithmetic operation 2*3, but can't perform the operation "Bill"*"John" (I use quotes to indicate that the names are proper names and not variable names).

Despite the observation that one can reasonably expect that there are certain operations that one can perform on matrices but not on dataframes (and conversely), the apparent similarity in structure of the two objects makes one assume (incorrectly at this time) that the syntax used to access elements of an array and a dataframe should be the same. I submit that having similar syntax for accessing elements of the two structures will help users learn R. It will not cause them to assume that one can perform exactly the same operations on the two structures.

I apologize to other members of the listserver for the length of this subthread. It appears that I have lost the argument, and have not convinced those who would need to make the changes to allow matrices and dataframes to have similar syntax for addressing elements of the respective structures. I do not expect I will be adding any additional comments to this thread, but will continue to follow contributions other people make.
Perhaps I will learn that I am not the only person who feels that the syntax should be consistent, but given what I have read so far, I doubt it. I thank everyone who has contributed to the discussion. John John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) William Dunlap wdun...@tibco.com 3/3/2010 1:15 PM If R made matrix$columnName mean the same as matrix[, columnName] (a vector) so matrices looked more like data.frames, would we also want the following to work as they do with data.frames? with(matrix, log(columnName)) # log of that column as a vector matrix[columnName] # 1-column matrix matrix[[columnName]] # vector equivalent of that 1-column matrix lm(responseColumn~predictorColumn, data=matrix) eval(quote(columnName), envir=matrix) The last 2 bump into the rule allowing envir to be a frame number (since a 1x1 matrix is currently taken as the frame number now). Perhaps the print methods for data.frame and matrix should announce the class of the object being printed. Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Patrick Burns Sent: Wednesday, March 03, 2010 2:44 AM To: r-help@r-project.org Subject: Re: [R] two questions for R beginners I think Duncan's example of a list that is a matrix is a compelling argument not to do the change. A matrix that is a list with both names and dimnames *is* probably rare (but certainly imaginable). A matrix that is a list is not so rare, and the proposed double meaning of '$' would certainly be confusing in that case. 
Pat On 02/03/2010 17:55, Duncan Murdoch wrote: On 02/03/2010 11:53 AM, William Dunlap wrote: -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of John Sorkin Sent: Tuesday, March 02, 2010 3:46 AM To: Karl Ove Hufthammer; r-h...@stat.math.ethz.ch Subject: Re: [R] two questions for R beginners Please take what follows not as an ad hominem statement, but rather as an attempt to improve what is already an excellent program, that has been built as a result of many, many hours of dedicated work by many, many unpaid, unsung volunteers. It troubles me a bit that when a confusing aspect of R is pointed out the response is not to try to improve the language so as to avoid the confusion, but rather to state that the confusion is inherent in the language. I understand that to make changes that would avoid the confusing aspect of the language that has been discussed in this thread would take time and effort by an R wizard (which I am not), time and effort that would
Re: [R] two questions for R beginners
On Mar 3, 2010, at 12:15 PM, William Dunlap wrote: If R made matrix$columnName mean the same as matrix[, columnName] (a vector) so matrices looked more like data.frames, would we also want the following to work as they do with data.frames? with(matrix, log(columnName)) # log of that column as a vector matrix[columnName] # 1-column matrix matrix[[columnName]] # vector equivalent of that 1-column matrix lm(responseColumn~predictorColumn, data=matrix) eval(quote(columnName), envir=matrix) The last 2 bump into the rule allowing envir to be a frame number (since a 1x1 matrix is currently taken as the frame number now). Perhaps the print methods for data.frame and matrix should announce the class of the object being printed. Yes! An enthusiastic vote for highlighting this fundamental distinction. There is already quite enough conflation of these two very dissimilar object classes. -- David Winsemius Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Patrick Burns Sent: Wednesday, March 03, 2010 2:44 AM To: r-help@r-project.org Subject: Re: [R] two questions for R beginners I think Duncan's example of a list that is a matrix is a compelling argument not to do the change. A matrix that is a list with both names and dimnames *is* probably rare (but certainly imaginable). A matrix that is a list is not so rare, and the proposed double meaning of '$' would certainly be confusing in that case. 
Pat On 02/03/2010 17:55, Duncan Murdoch wrote: On 02/03/2010 11:53 AM, William Dunlap wrote: -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of John Sorkin Sent: Tuesday, March 02, 2010 3:46 AM To: Karl Ove Hufthammer; r-h...@stat.math.ethz.ch Subject: Re: [R] two questions for R beginners Please take what follows not as an ad hominem statement, but rather as an attempt to improve what is already an excellent program, that has been built as a result of many, many hours of dedicated work by many, many unpaid, unsung volunteers. It troubles me a bit that when a confusing aspect of R is pointed out the response is not to try to improve the language so as to avoid the confusion, but rather to state that the confusion is inherent in the language. I understand that to make changes that would avoid the confusing aspect of the language that has been discussed in this thread would take time and effort by an R wizard (which I am not), time and effort that would not be compensated in the traditional sense. This does not mean that we should not acknowledge the confusion. If we what R to be the de facto lingua franca of statistical analysis doesn't it make sense to strive for syntax that is as straight forward and consistent as possible? Whenever one changes the language that way old code will break. I think in this case not much code would break. Mostly when people have a matrix M and ask for M$column they'll get an error; the proposal is that they'll get the requested column. (It is possible to have a list with names that is also a matrix with dimnames, but I think that is a pretty unusual construction.) But I haven't been convinced that the proposal is a net improvement to the language. Duncan Murdoch The developers can, with a lot of effort, fix their own code, and perhaps even user-written code on CRAN, but code that thousands of users have written will break. 
There is a lot of code out there that was written by trial and error and by folks who no longer work at an institution: the code works but no one knows exactly why it works. Telling folks they need to change that code because we have a cleaner but different syntax now is not good. Why would one spend time writing a package that might stop working when R is upgraded? I think the solution is not to change current semantics but to write functions that behave better and encourage users to use them, gradually abandoning the old constructs. Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com Again, please understand that my comment is made with deepest respect for the many people who have unselfishly contributed to the R project. Many thanks to each and every one of you. John Karl Ove Hufthammer k...@huftis.org 3/2/2010 4:00 AM On Mon, 01 Mar 2010 10:00:07 -0500 Duncan Murdoch murd...@stats.uwo.ca wrote: Suppose X is a dataframe or a matrix. What would you expect to get from X[1]? What about as.vector(X), or as.numeric(X)? All this of course depends on type of object one is speaking of. There are plenty of surprises available, and it's best to use the most logical way of extracting. E.g., to extract the top-left element of a 2D structure (data frame or matrix), use 'X[1,1]'. Luckily, R provides some shortcuts. For example, you can write 'X[2,3]' on a data frame, just as if it was a matrix
Re: [R] two questions for R beginners
On 03/04/2010 08:20 AM, David Winsemius wrote:

... Perhaps the print methods for data.frame and matrix should announce the class of the object being printed.

Yes! An enthusiastic vote for highlighting this fundamental distinction. There is already quite enough conflation of these two very dissimilar object classes.

If so, please make it an option with an argument like show.class or print.fancy that can be set globally in options. Otherwise those of us who depend upon the sparse displays of R objects in our functions (e.g. in the prettyR package) will suffer the results.

Jim
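For what it's worth, the opt-in behaviour Jim asks for can be roughed out in user code without touching the print methods themselves. This is only a sketch: `print_with_class` is a made-up helper, and `show.class` is simply the option name floated above, not an option that R itself recognises.

```r
# Hypothetical helper: announce the class before printing, but only when
# the (made-up) global option "show.class" is set to TRUE.
print_with_class <- function(x, ...) {
  if (isTRUE(getOption("show.class", FALSE))) {
    cat("<", paste(class(x), collapse = "/"), ">\n", sep = "")
  }
  print(x, ...)
  invisible(x)
}

m  <- matrix(1:4, nrow = 2, dimnames = list(NULL, c("a", "b")))
df <- as.data.frame(m)

print_with_class(m)          # option unset: plain output, as today
options(show.class = TRUE)   # opt in globally
print_with_class(m)          # now preceded by a line naming the class
print_with_class(df)
```

Because the header is off unless the option is set, code that depends on the current sparse display would not see any change.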
Re: [R] two questions for R beginners
John, I felt a short, somewhat strong reply was in order. One of the inherent aspects of the language is that R demands more of an understanding from users about what is taking place. Model formulae, for example, are close to what one would use if they were to write the model on paper. I consider this a strong feature. The confusing aspects that you point out are not the result of syntax. Syntax in R is well specified and, I believe, far easier to work with than many programming languages. English is a confusing language. C++ is a confusing language. One may have far more success learning, say, French if he/she does not like the syntax or grammar of English, or visual Pascal if the syntax of C++ is not preferred, rather than changing the language. If one wants to do business in a particular area, then it generally behooves one to suck it up and learn the native tongue or hire someone for that part. If one wants the program that is the standard for other world class statistics packages, which also happens to have a very amendable license agreement, then it behooves one to suck it up and learn R. R is what it is. If someone does not like it, he/she can use something else, pay far more for an inferior product which will also take longer to do a calculation and handle less data at once, while risking that the content of their understanding of statistics is diminished for it. Not that there is not room for development in R, but the sort of development you demand will evolve according to similar laws as those that govern economics and/or change in spoken language. You'd need major financial backing, and a strong influence over the culture of those who use R to pull this off. Other than that, you'll have to wait for the dialect to change over time from the cumulative effect of contributions from people the world over who all want something different out of the language. 
If someone wants to take on the R challenge for him/herself, however, then there is likely no better technical support in the world than the R community, albeit perhaps after dispensing with some of the niceties. Sincerely, KeithC. -Original Message- From: John Sorkin [mailto:jsor...@grecc.umaryland.edu] Sent: Tuesday, March 02, 2010 4:46 AM To: Karl Ove Hufthammer; r-h...@stat.math.ethz.ch Subject: Re: [R] two questions for R beginners Please take what follows not as an ad hominem statement, but rather as an attempt to improve what is already an excellent program, that has been built as a result of many, many hours of dedicated work by many, many unpaid, unsung volunteers. It troubles me a bit that when a confusing aspect of R is pointed out the response is not to try to improve the language so as to avoid the confusion, but rather to state that the confusion is inherent in the language. I understand that to make changes that would avoid the confusing aspect of the language that has been discussed in this thread would take time and effort by an R wizard (which I am not), time and effort that would not be compensated in the traditional sense. This does not mean that we should not acknowledge the confusion. If we what R to be the de facto lingua franca of statistical analysis doesn't it make sense to strive for syntax that is as straight forward and consistent as possible? Again, please understand that my comment is made with deepest respect for the many people who have unselfishly contributed to the R project. Many thanks to each and every one of you. John Karl Ove Hufthammer k...@huftis.org 3/2/2010 4:00 AM On Mon, 01 Mar 2010 10:00:07 -0500 Duncan Murdoch murd...@stats.uwo.ca wrote: Suppose X is a dataframe or a matrix. What would you expect to get from X[1]? What about as.vector(X), or as.numeric(X)? All this of course depends on type of object one is speaking of. There are plenty of surprises available, and it's best to use the most logical way of extracting. 
E.g., to extract the top-left element of a 2D structure (data frame or matrix), use 'X[1,1]'. Luckily, R provides some shortcuts. For example, you can write 'X[2,3]' on a data frame, just as if it was a matrix, even though the underlying structure is completely different. (This doesn't work on a normal list; there you have to type the whole 'X[[2]][3]'.) The behaviour of the 'as.' functions may sometimes be surprising, at least for me. For example, 'as.data.frame' on a named vector gives a single-column data frame, instead of a single-row data frame. (I'm not sure what's the recommended way of converting a named vector to row data frame, but 'as.data.frame(t(X))' works, even though both 'X' and 't(X)' looks like a row of numbers.) The point is that a dataframe is a list, and a matrix isn't. If users don't understand that, then they'll be confused somewhere. Making matrices more list-like in one respect will just move the confusion elsewhere. The solution is to understand the difference. My main problem is not understanding the difference
Re: [R] two questions for R beginners
On Tue, 2 Mar 2010 08:58:25 +1300 Peter Alspach peter.alsp...@plantandfood.co.nz wrote:

This brings up another confusion for new users. Simply typing the object name at the command line gives just one view of the object (that provided by print()).

Good point. Any good introduction to R should include a brief discussion of 'str'. But sometimes even 'str' can keep you from discovering the real underlying structure of an object, e.g. for data frames. The solution is to use 'unclass' first.

-- Karl Ove Hufthammer
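A small illustration of the three views mentioned above, using a toy data frame made up for the purpose (`df`, with columns `x` and `y`, is not from the thread):

```r
# Toy data frame, invented for illustration.
df <- data.frame(x = 1:3, y = c("a", "b", "c"))

print(df)          # the tabular view you get from typing `df` at the prompt
str(df)            # compact summary, still presented as a data.frame
str(unclass(df))   # the underlying list, plus its names/row.names attributes

is.list(df)        # TRUE: a data frame is a list of columns
typeof(df)         # "list"
```

Comparing the last two `str` calls is probably the quickest way to see that the table is a list of equal-length columns underneath.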
Re: [R] two questions for R beginners
On Mon, 01 Mar 2010 10:00:07 -0500 Duncan Murdoch murd...@stats.uwo.ca wrote:

Suppose X is a dataframe or a matrix. What would you expect to get from X[1]? What about as.vector(X), or as.numeric(X)?

All this of course depends on the type of object one is speaking of. There are plenty of surprises available, and it's best to use the most logical way of extracting. E.g., to extract the top-left element of a 2D structure (data frame or matrix), use 'X[1,1]'.

Luckily, R provides some shortcuts. For example, you can write 'X[2,3]' on a data frame, just as if it was a matrix, even though the underlying structure is completely different. (This doesn't work on a normal list; there you have to type the whole 'X[[2]][3]'.)

The behaviour of the 'as.' functions may sometimes be surprising, at least for me. For example, 'as.data.frame' on a named vector gives a single-column data frame, instead of a single-row data frame. (I'm not sure what's the recommended way of converting a named vector to a row data frame, but 'as.data.frame(t(X))' works, even though both 'X' and 't(X)' look like a row of numbers.)

The point is that a dataframe is a list, and a matrix isn't. If users don't understand that, then they'll be confused somewhere. Making matrices more list-like in one respect will just move the confusion elsewhere. The solution is to understand the difference.

My main problem is not understanding the difference, which is easy, but knowing which type of object I have when I get the output of a function in a package. If I know the object is a named vector or a matrix with column names, it's easy enough to type 'X[,colname]', and if it's a data frame one may use the shortcut 'X$colname'. Usually, it *is* documented what the return value of a function is, but just looking at the output is much faster, and *usually* gives the correct answer.
For example, 'mean' applied on a data frame gives a named vector, not a data frame, which is somewhat surprising (given that the columns of a data frame may be of different types, while the elements of a vector may not). (And yes, I know that it's *documented* that it returns a named vector.) On the other hand, perhaps it is surprising that 'mean' works on data frames at all. :-)

-- Karl Ove Hufthammer
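The extraction and conversion cases discussed in this message can be tried out on toy objects. The names `m`, `df`, `lst` and `v` below are invented for the sketch, and `sapply(df, mean)` stands in for `mean(df)`, since what the latter does appears to depend on the R version. Note that the list form of `df[2, 3]` is written column first, then row.

```r
# Toy objects, invented for illustration.
m   <- matrix(1:6, nrow = 2, dimnames = list(NULL, c("a", "b", "c")))
df  <- as.data.frame(m)
lst <- list(a = 1:2, b = 3:4, c = 5:6)

# Two-index extraction reads the same on both 2D structures.
m[2, 3]
df[2, 3]
lst[[3]][2]        # a plain list needs the long form: column 3, then row 2

# Check what you actually have before choosing an accessor.
class(m)
class(df)
m[, "a"]           # column of a matrix, by name
df$a               # shortcut available for data frames (and lists)

# Named vector -> data frame: one column by default, one row via t().
v <- c(a = 1, b = 2, c = 3)
dim(as.data.frame(v))      # 3 rows, 1 column
dim(as.data.frame(t(v)))   # 1 row, 3 columns

# Column means, returned as a named vector.
sapply(df, mean)
```

Calling `class()` (or `str()`) on the return value of a package function is a cheap way to settle the "which type do I have" question before reaching for `$` or `[, ]`.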
Re: [R] two questions for R beginners
On Mon, Mar 1, 2010 at 11:49 PM, Liviu Andronic landronim...@gmail.com wrote:

On 3/1/10, Keo Ormsby keo.orms...@gmail.com wrote:

Perhaps my biggest problem was that I couldn't find (and still haven't seen) *absolute beginners* documents.

There was once a link posted on r-sig-teaching that would probably fit your needs, but I cannot find it now.

OK, I found it. Below is an excerpt of that r-sig-teaching e-mail.

Liviu

On Thu, Jul 2, 2009 at 2:19 PM, Robert W. Hayden hay...@mv.mv.com wrote:

I think such a website would be a real asset. It would be most useful if it either were restricted to intro. stats. OR organized so that materials for real beginners were easy to extract from all the materials for programmers and Ph.D. statisticians. As a relative beginner myself, I find the usual resources useless. In self defense, I created materials for my own beginning students: http://courses.statistics.com/software/R/Rhome.htm
Re: [R] two questions for R beginner
What were your biggest misconceptions or stumbling blocks to getting up and running with R?

Easy. In terms of materials, I have been unable to find good books that introduce users to R from the perspective of someone familiar only with packages like SPSS or STATA, or not familiar with statistics packages at all. Even introductory texts use jargon without introducing it. I think that R help files should be more thorough than they are, and contain more examples. I thought that STATA help files were sparse! The notion that 'R is a user community and thus they do this in their spare time' is no excuse for those creating new tools for R not developing complete help files. It doesn't take that much time relative to actually creating the new function.

In terms of actual R use - creating, using, and manipulating data are the biggest frustrations for those of the 'spreadsheet generation'. I get the impression that one needs to not merely understand, but be fully fluent in the jargon of matrix mathematics to even know what is going on half the time. I find myself - even now - using 'rules of thumb' that 'seemed to work' rather than fully understanding what I am doing. It is particularly discouraging when many of those 'intro books' suggest using something besides R for data manipulation - how clumsy is that!?

I find the actual programming syntax itself is the easiest part to master. It is certainly more flexible - but without a particularly sufficient increase in complexity - than trying to write scripts in SPSS and STATA.

Brandon Zicha
Re: [R] two questions for R beginners
Please take what follows not as an ad hominem statement, but rather as an attempt to improve what is already an excellent program, that has been built as a result of many, many hours of dedicated work by many, many unpaid, unsung volunteers. It troubles me a bit that when a confusing aspect of R is pointed out the response is not to try to improve the language so as to avoid the confusion, but rather to state that the confusion is inherent in the language. I understand that to make changes that would avoid the confusing aspect of the language that has been discussed in this thread would take time and effort by an R wizard (which I am not), time and effort that would not be compensated in the traditional sense. This does not mean that we should not acknowledge the confusion. If we what R to be the de facto lingua franca of statistical analysis doesn't it make sense to strive for syntax that is as straight forward and consistent as possible? Again, please understand that my comment is made with deepest respect for the many people who have unselfishly contributed to the R project. Many thanks to each and every one of you. John Karl Ove Hufthammer k...@huftis.org 3/2/2010 4:00 AM On Mon, 01 Mar 2010 10:00:07 -0500 Duncan Murdoch murd...@stats.uwo.ca wrote: Suppose X is a dataframe or a matrix. What would you expect to get from X[1]? What about as.vector(X), or as.numeric(X)? All this of course depends on type of object one is speaking of. There are plenty of surprises available, and it's best to use the most logical way of extracting. E.g., to extract the top-left element of a 2D structure (data frame or matrix), use 'X[1,1]'. Luckily, R provides some shortcuts. For example, you can write 'X[2,3]' on a data frame, just as if it was a matrix, even though the underlying structure is completely different. (This doesn't work on a normal list; there you have to type the whole 'X[[2]][3]'.) The behaviour of the 'as.' functions may sometimes be surprising, at least for me. 
For example, 'as.data.frame' on a named vector gives a single-column data frame, instead of a single-row data frame. (I'm not sure what's the recommended way of converting a named vector to row data frame, but 'as.data.frame(t(X))' works, even though both 'X' and 't(X)' looks like a row of numbers.) The point is that a dataframe is a list, and a matrix isn't. If users don't understand that, then they'll be confused somewhere. Making matrices more list-like in one respect will just move the confusion elsewhere. The solution is to understand the difference. My main problem is not understanding the difference, which is easy, but knowing which type of I have when I get the output a function in a package. If I know the object is a named vector or a matrix with column names, it's easy enough to type 'X[,colname]', and if it's a data frame one may use the shortcut 'X$colname'. Usually, it *is* documented what the return value of a function is, but just looking at the output is much faster, and *usually* gives the correct answer. For example, 'mean' applied on a data frame gives a named vector, not a data frame, which is somewhat surprising (given that the columns of a data frame may be of different types, while the elements of a vector may not). (And yes, I know that it's *documented* that it returns a named vector.) On the other hand, perhaps it is surprising that 'mean' works on data frames at all. :-) -- Karl Ove Hufthammer __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Confidentiality Statement: This email message, including any attachments, is for th...{{dropped:6}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] two questions for R beginner
Brandon Zicha wrote:

What were your biggest misconceptions or stumbling blocks to getting up and running with R?

Easy. In terms of materials, I have been unable to find good books that introduce users to R from the perspective of someone familiar only with packages like SPSS or STATA, or not familiar with statistics packages at all. Even introductory texts use jargon without introducing it. I think that R help files should be more thorough than they are, and contain more examples. I thought that STATA help files were sparse! The notion that 'R is a user community and thus they do this in their spare time' is no excuse for those creating new tools for R not developing complete help files. It doesn't take that much time relative to actually creating the new function.

Hi Brandon,

I would disagree with your point that documentation doesn't take much time. Writing documentation that is suitable for both the advanced user (being a reference, and thus preferably short) and the beginning user (being sort of a tutorial, and thus preferably longer) is quite a challenge, comparable to writing a good paper. Apart from the fact that it takes quite a while, it is also not much fun. Often people develop packages for their own research and put the software online so others can benefit; they don't need the documentation themselves and don't get paid to write the documentation. So saying 'it's no excuse' really goes too far in my view.

R is free; you did not pay several thousand euros giving you the right to good support. Even the support is free through the mailing list. You can get a paid version of R at Revolution Computing. Then you can call them if there are problems.

I don't mean to offend anybody, but I didn't agree with "is no excuse for those creating new tools for R not developing complete help files". Part of the strength of R is in the open source, but sometimes, as with documentation, this can bite you.
But I think the R docs aren't that bad; I've seen proprietary software that does a worse job than R.

My 2 euro on the subject :),

Cheers,
Paul

In terms of actual R use - creating, using, and manipulating data are the biggest frustrations for those of the 'spreadsheet generation'. I get the impression that one needs to not merely understand, but be fully fluent in the jargon of matrix mathematics to even know what is going on half the time. I find myself - even now - using 'rules of thumb' that 'seemed to work' rather than fully understanding what I am doing. It is particularly discouraging when many of those 'intro books' suggest using something besides R for data manipulation - how clumsy is that!?

I find the actual programming syntax itself is the easiest part to master. It is certainly more flexible - but without a particularly sufficient increase in complexity - than trying to write scripts in SPSS and STATA.

Brandon Zicha

-- Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone: +3130 274 3113 Mon-Tue
Phone: +3130 253 5773 Wed-Fri
http://intamap.geo.uu.nl/~paul
Re: [R] two questions for R beginner
On Tue, 2 Mar 2010 12:31:45 +0100 Brandon Zicha brandon.zi...@ua.ac.be wrote:

Easy. I terms of materials I have been unable to find good books that introduce users to R from the perspective of someone familiar only with packages like SPSS or STATA,

Have you read these books:

R for SAS and SPSS Users
http://www.springer.com/statistics/computanional+statistics/book/978-0-387-09417-5

R for Stata Users
http://www.springer.com/statistics/computanional+statistics/book/978-1-4419-1317-3

(I have not, so I don't know how good they are.)

-- Karl Ove Hufthammer
Re: [R] two questions for R beginners
John Sorkin wrote:

Please take what follows not as an ad hominem statement, but rather as an attempt to improve what is already an excellent program, that has been built as a result of many, many hours of dedicated work by many, many unpaid, unsung volunteers. It troubles me a bit that when a confusing aspect of R is pointed out the response is not to try to improve the language so as to avoid the confusion, but rather to state that the confusion is inherent in the language. I understand that to make changes that would avoid the confusing aspect of the language that has been discussed in this thread would take time and effort by an R wizard (which I am not), time and effort that would not be compensated in the traditional sense. This does not mean that we should not acknowledge the confusion. If we want R to be the de facto lingua franca of statistical analysis doesn't it make sense to strive for syntax that is as straightforward and consistent as possible?

I think you've misunderstood the argument. It would not be hard to make the suggested change. I don't object to it because it would be too much work, I object to it because I think it is not an improvement. Dataframes and matrices are different, and there is no way to avoid that fact.

The arguments in favour of the change seem to be these:

- Dataframes and matrices are similar in some respects, so they should be similar in more.

In fact, I believe that the source of confusion is the fact that they are similar, so this would not improve things. People would still be confused by the differences, which are unavoidable.

- Using $ to extract a column of a matrix would be convenient.

I agree, it saves 4 keystrokes to type X$column instead of X[,"column"]. But I think it increases confusion, so the savings are not worthwhile. For example, the col2rgb function returns a matrix with rows named "red", "green" and "blue". But under your proposal, I'd still need to use X["red",] to extract the red component, because columns are components, but rows are not. You are complaining that the lack of $ for matrices is an unnecessary asymmetry, and unnecessary asymmetries are confusing. But your proposal introduces a new one!

- Some functions return matrices when I expect a dataframe, or vice versa.

That will continue to be true regardless of whether the proposed change is made. You need to read the documentation. If it is unclear, it should be improved; the language shouldn't be changed so that sloppy documentation is accurate.

- You suggested this so anyone who disagrees must be lazy.

Which really is an ad hominem argument, despite your disclaimer. I think you should respect the fact that there are people who disagree with the value of your suggestion. (Which is also an ad hominem attack, but isn't central to my argument.)

Duncan Murdoch

Again, please understand that my comment is made with deepest respect for the many people who have unselfishly contributed to the R project. Many thanks to each and every one of you.

John

Karl Ove Hufthammer k...@huftis.org 3/2/2010 4:00 AM

On Mon, 01 Mar 2010 10:00:07 -0500 Duncan Murdoch murd...@stats.uwo.ca wrote:

Suppose X is a dataframe or a matrix. What would you expect to get from X[1]? What about as.vector(X), or as.numeric(X)?

All this of course depends on type of object one is speaking of. There are plenty of surprises available, and it's best to use the most logical way of extracting. E.g., to extract the top-left element of a 2D structure (data frame or matrix), use 'X[1,1]'.

Luckily, R provides some shortcuts. For example, you can write 'X[2,3]' on a data frame, just as if it was a matrix, even though the underlying structure is completely different. (This doesn't work on a normal list; there you have to type the whole 'X[[2]][3]'.)

The behaviour of the 'as.' functions may sometimes be surprising, at least for me.
For example, 'as.data.frame' on a named vector gives a single-column data frame, instead of a single-row data frame. (I'm not sure what's the recommended way of converting a named vector to row data frame, but 'as.data.frame(t(X))' works, even though both 'X' and 't(X)' looks like a row of numbers.) The point is that a dataframe is a list, and a matrix isn't. If users don't understand that, then they'll be confused somewhere. Making matrices more list-like in one respect will just move the confusion elsewhere. The solution is to understand the difference. My main problem is not understanding the difference, which is easy, but knowing which type of I have when I get the output a function in a package. If I know the object is a named vector or a matrix with column names, it's easy enough to type 'X[,colname]', and if it's a data frame one may use the shortcut 'X$colname'. Usually, it *is* documented what the return value of a function is, but just looking at the output is much faster,
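The col2rgb point and the "a dataframe is a list, and a matrix isn't" remark in this message can be tried out with a minimal base-R sketch. The colour names and the variable names X, d and m are picked purely for illustration and do not come from the thread.

```r
# col2rgb() returns an integer matrix whose ROWS are named red, green, blue.
X <- grDevices::col2rgb(c("orange", "steelblue"))
is.matrix(X)                 # a matrix, not a data frame
rownames(X)                  # the row names carry the colour components
red_row <- X["red", ]        # rows are extracted with [name, ], never with $

# A data frame is a list of columns; a matrix is a vector with a dim attribute.
d <- head(iris[1:4])         # data frame: 6 rows, 4 columns
m <- as.matrix(d)            # numeric matrix with the same printed shape

is.list(d)                   # data frame: a list
is.list(m)                   # matrix: not a list
length(d)                    # counts columns (list components)
length(m)                    # counts cells

# The same index means different things on the two objects.
first_of_d <- d[1]           # a one-column data frame
first_of_m <- m[1]           # a single number (the top-left cell)
```

Printing d and m side by side shows why the two are easy to confuse, while is.list() or class() tells them apart immediately.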
Re: [R] two questions for R beginner
Hi Brandon,

I just read this book, which I am sure you will be interested in: http://www.amazon.com/SAS-SPSS-Users-Statistics-Computing/dp/0387094172

Cheers!!
Albert-Jan

~~ In the face of ambiguity, refuse the temptation to guess. ~~

--- On Tue, 3/2/10, Brandon Zicha brandon.zi...@ua.ac.be wrote:

From: Brandon Zicha brandon.zi...@ua.ac.be
Subject: Re: [R] two questions for R beginner
To: r-help@r-project.org
Date: Tuesday, March 2, 2010, 12:31 PM

What were your biggest misconceptions or stumbling blocks to getting up and running with R?

Easy. In terms of materials I have been unable to find good books that introduce users to R from the perspective of someone familiar only with packages like SPSS or STATA, or not familiar with statistics packages at all. Even introduction texts use jargon without introducing it.

I think that R-help files should be more thorough than they are, and contain more examples. I thought that STATA help files were sparse! The notion that 'R is a user community and thus they do this in their spare time' is no excuse for those creating new tools for R not developing complete help files. It doesn't take that much time relative to actually creating the new function.

In terms of actual R use - creating, using, and manipulating data are the biggest frustration for those of the 'spreadsheet generation'. I get the impression that one needs to not merely understand, but be fully fluent in the jargon of matrix mathematics to even know what is going on half the time. I find myself - even now - using 'rules of thumb' that 'seemed to work' rather than fully understanding what I am doing. It is particularly discouraging when many of those 'intro books' suggest using something besides R for data manipulation - how clumsy is that!?

I find the actual programming syntax itself is the easiest part to master. It is certainly more flexible - but without a particularly sufficient increase in complexity - than trying to write script in SPSS and STATA.

Brandon Zicha
Re: [R] two questions for R beginners
On Tue, Mar 2, 2010 at 7:27 AM, Duncan Murdoch murd...@stats.uwo.ca wrote:

John Sorkin wrote:

Please take what follows not as an ad hominem statement, but rather as an attempt to improve what is already an excellent program, that has been built as a result of many, many hours of dedicated work by many, many unpaid, unsung volunteers. It troubles me a bit that when a confusing aspect of R is pointed out the response is not to try to improve the language so as to avoid the confusion, but rather to state that the confusion is inherent in the language. I understand that to make changes that would avoid the confusing aspect of the language that has been discussed in this thread would take time and effort by an R wizard (which I am not), time and effort that would not be compensated in the traditional sense. This does not mean that we should not acknowledge the confusion. If we want R to be the de facto lingua franca of statistical analysis doesn't it make sense to strive for syntax that is as straightforward and consistent as possible?

I think you've misunderstood the argument. It would not be hard to make the suggested change. I don't object to it because it would be too much work, I object to it because I think it is not an improvement. Dataframes and matrices are different, and there is no way to avoid that fact. The arguments in favour of the change seem to be these:

Users of zoo have some experience with this, since zoo uses matrices to represent 2d time series and originally did not support $ as a column extractor but now does. I was originally opposed to adding it for the reasons you state, but it was eventually added, and having used it for some time now since it got into the package I must say that it is very convenient and I now regard it as a definite improvement in user experience. Certainly I use the feature all the time.
Re: [R] two questions for R beginner
Brandon Zicha wrote: Hey Paul, Hey Brandon, (adding R-help in the cc) I agree with you that the documentation of R could be better, especially with more examples in code showing not only the common cases, but also more esoteric cases. It would be great if everyone invested a lot of time to write awesome documentation, but this is not the case. I just objected to the tone (I tought :)) I spotted. Some more comments are inline: Accepting the main point of my post - that the often VERY incomplete help files appended to packages can be a major stumbling block for getting up and running in R - I take your point. I probably went a bit to far with my language there. I would point out though that a great many parts of research (like writing a bibliography - or searching for citations of any kind usually) aren't much fun, but are an important part of research related work. Likewise, complete documentation (by which I hardly mean a paper - looking at STATA help files as a minimum would be a good start) is part of programming. I agree that one needs to employ some level of judgement, otherwise you will get helpfile that says First turn on the computer... then click the 'R' Icon But, I have myself created one or two STATA functions that I have put up for public use - so I know how not fun, but necessary complete documentation is. Further, I didn't say that writing documentation doesn't take time. Everything takes time. My point was that relative to actually creating the application - writing more complete documentation takes very little time. If one invests the time to do the 'fun' stuff of writing a new package for R, it seems reasonable that taking the (proportionately) little time to write a nicer help file would be the most 'professional' thing to do. But, this could be my illusion that all researchers seem themselves as professionals - rather than an anarchic egoistic enclave of independent self-interested paper producers. 
This is what scientists get judged upon, not on how much software they publish and how good their documentation is. Furthermore, it is quite hard for a hardcore R programmer to judge what people find hard about their software. I am notorious for assuming greater standards as an acceptable 'norm' than my community at large :-) Furthermore, you are absolutely right that my standards are apparently even too high for many commercial applications! R help is sometimes downright good! So, if I accept that I am a demanding S.O.B. and tone down my thoughts of proper documentation and professionalism and adopt the (probably more) reasonable perspective you do at the end of "well, this is the world we live in... and come on, it's free", I totally agree that I probably went too far! But, better yet, I think that this observation you make suggests a solution: Perhaps R could use a more integrated and organized open source help system. I can think of a few possibilities - the easiest being a wiki version of R help. This way users could add useful information to help files - such as more examples, tricks, tips, and known problems. This would take advantage of the open source, free, user-community centered aspects of R, and permit those with an interest in helping beginners to post notes for beginners - on the help files. I know that if such a wiki existed I would have posted the example of constrained optimization I did recently. It wouldn't be too difficult to add a function wikihelp(X) that would open the wiki help page rather than the standard help documentation. Currently, help on any given command is scattered all over help fora all about the web. A central, indexed, and easily referenced help system might be a solution. Heck, such a system could go a step further and link R-help listserv archives by command, thus centralizing and integrating the open-source user-built information resource of the listserv into help().
How many e-mails to this listserv begin with 'I just spent a few hours cruising the help forums related to 'X' and couldn't find an answer.' Sounds like a good addition, allowing people to add to the documentation as they see fit. There is ofcourse the R wiki, but this is not widely used and not firmly embedded into R itself. But how would we keep such a system you propose manageable, preventing it from becoming an enormous mess. Maybe some kind of moderation? I note that STATA has all their help files for the latest version of stata available on the web (http://www.stata.com/help.cgi?contents). How difficult would a similar system - only with R, editable and with links to supplementary information - be to set up? I can't imagine it would be horribly expensive in terms of set up costs. A problem is that there is no company that markets R that could set this up, the community is much looser, much more open source. Probably the R core team would be the closest thing we have. What do you
Re: [R] two questions for R beginners
On 02/03/2010 11:53 AM, William Dunlap wrote: -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of John Sorkin Sent: Tuesday, March 02, 2010 3:46 AM To: Karl Ove Hufthammer; r-h...@stat.math.ethz.ch Subject: Re: [R] two questions for R beginners Please take what follows not as an ad hominem statement, but rather as an attempt to improve what is already an excellent program, that has been built as a result of many, many hours of dedicated work by many, many unpaid, unsung volunteers. It troubles me a bit that when a confusing aspect of R is pointed out the response is not to try to improve the language so as to avoid the confusion, but rather to state that the confusion is inherent in the language. I understand that to make changes that would avoid the confusing aspect of the language that has been discussed in this thread would take time and effort by an R wizard (which I am not), time and effort that would not be compensated in the traditional sense. This does not mean that we should not acknowledge the confusion. If we what R to be the de facto lingua franca of statistical analysis doesn't it make sense to strive for syntax that is as straight forward and consistent as possible? Whenever one changes the language that way old code will break. I think in this case not much code would break. Mostly when people have a matrix M and ask for M$column they'll get an error; the proposal is that they'll get the requested column. (It is possible to have a list with names that is also a matrix with dimnames, but I think that is a pretty unusual construction.) But I haven't been convinced that the proposal is a net improvement to the language. Duncan Murdoch The developers can, with a lot of effort, fix their own code, and perhaps even user-written code on CRAN, but code that thousands of users have written will break. 
There is a lot of code out there that was written by trial and error and by folks who no longer work at an institution: the code works but no one knows exactly why it works. Telling folks they need to change that code because we have a cleaner but different syntax now is not good. Why would one spend time writing a package that might stop working when R is upgraded? I think the solution is not to change current semantics but to write functions that behave better and encourage users to use them, gradually abandoning the old constructs. Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com Again, please understand that my comment is made with deepest respect for the many people who have unselfishly contributed to the R project. Many thanks to each and every one of you. John Karl Ove Hufthammer k...@huftis.org 3/2/2010 4:00 AM On Mon, 01 Mar 2010 10:00:07 -0500 Duncan Murdoch murd...@stats.uwo.ca wrote: Suppose X is a dataframe or a matrix. What would you expect to get from X[1]? What about as.vector(X), or as.numeric(X)? All this of course depends on type of object one is speaking of. There are plenty of surprises available, and it's best to use the most logical way of extracting. E.g., to extract the top-left element of a 2D structure (data frame or matrix), use 'X[1,1]'. Luckily, R provides some shortcuts. For example, you can write 'X[2,3]' on a data frame, just as if it was a matrix, even though the underlying structure is completely different. (This doesn't work on a normal list; there you have to type the whole 'X[[2]][3]'.) The behaviour of the 'as.' functions may sometimes be surprising, at least for me. For example, 'as.data.frame' on a named vector gives a single-column data frame, instead of a single-row data frame. (I'm not sure what's the recommended way of converting a named vector to row data frame, but 'as.data.frame(t(X))' works, even though both 'X' and 't(X)' looks like a row of numbers.) 
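The 'as.data.frame' and 'X[2,3]' remarks quoted just above can be checked with a short sketch. The vector x and its names are invented for the example, and the as.list() route is only one possible alternative, not something recommended in the thread.

```r
# A named numeric vector (values invented for this example).
x <- c(a = 1, b = 2, c = 3)

# as.data.frame() treats the vector as one column: 3 rows, 1 column.
col_df <- as.data.frame(x)
dim(col_df)

# Transposing first gives a 1 x 3 matrix, hence a single-row data frame.
row_df <- as.data.frame(t(x))
dim(row_df)
names(row_df)

# A possible alternative: go through a list, one component per future column.
row_df2 <- as.data.frame(as.list(x))

# Matrix-style indexing on a data frame versus list-style indexing.
d <- head(iris[1:4])
d[2, 3]        # row 2, column 3
d[[3]][2]      # the same value: list component 3, element 2
```

Calling str() on col_df and row_df makes the difference in shape visible even when the printed values look alike.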
The point is that a dataframe is a list, and a matrix isn't. If users don't understand that, then they'll be confused somewhere. Making matrices more list-like in one respect will just move the confusion elsewhere. The solution is to understand the difference. My main problem is not understanding the difference, which is easy, but knowing which type of I have when I get the output a function in a package. If I know the object is a named vector or a matrix with column names, it's easy enough to type 'X[,colname]', and if it's a data frame one may use the shortcut 'X$colname'. Usually, it *is* documented what the return value of a function is, but just looking at the output is much faster, and *usually* gives the correct answer. For example, 'mean' applied on a data frame gives a named vector, not a data frame, which is somewhat surprising (given that the columns of a data frame may be of different types, while the elements of a vector may
Re: [R] two questions for R beginners
William, I agree that changing syntax can lead to problems. I don't, however think extending the language will break existing code. Providing a common syntax for accessing matrices and dataframes will not change the way things have been done to date, but rather how things will be done in the future. John John Sorkin jsor...@grecc.umaryland.edu -Original Message- From: William Dunlap wdun...@tibco.com To: John Sorkin jsor...@grecc.umaryland.edu To: Karl Ove Hufthammer k...@huftis.org To: r-h...@stat.math.ethz.ch Sent: 3/2/2010 11:53:45 AM Subject: RE: [R] two questions for R beginners -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of John Sorkin Sent: Tuesday, March 02, 2010 3:46 AM To: Karl Ove Hufthammer; r-h...@stat.math.ethz.ch Subject: Re: [R] two questions for R beginners Please take what follows not as an ad hominem statement, but rather as an attempt to improve what is already an excellent program, that has been built as a result of many, many hours of dedicated work by many, many unpaid, unsung volunteers. It troubles me a bit that when a confusing aspect of R is pointed out the response is not to try to improve the language so as to avoid the confusion, but rather to state that the confusion is inherent in the language. I understand that to make changes that would avoid the confusing aspect of the language that has been discussed in this thread would take time and effort by an R wizard (which I am not), time and effort that would not be compensated in the traditional sense. This does not mean that we should not acknowledge the confusion. If we what R to be the de facto lingua franca of statistical analysis doesn't it make sense to strive for syntax that is as straight forward and consistent as possible? Whenever one changes the language that way old code will break. 
The developers can, with a lot of effort, fix their own code, and perhaps even user-written code on CRAN, but code that thousands of users have written will break. There is a lot of code out there that was written by trial and error and by folks who no longer work at an institution: the code works but no one knows exactly why it works. Telling folks they need to change that code because we have a cleaner but different syntax now is not good. Why would one spend time writing a package that might stop working when R is upgraded? I think the solution is not to change current semantics but to write functions that behave better and encourage users to use them, gradually abandoning the old constructs. Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com Again, please understand that my comment is made with deepest respect for the many people who have unselfishly contributed to the R project. Many thanks to each and every one of you. John Karl Ove Hufthammer k...@huftis.org 3/2/2010 4:00 AM On Mon, 01 Mar 2010 10:00:07 -0500 Duncan Murdoch murd...@stats.uwo.ca wrote: Suppose X is a dataframe or a matrix. What would you expect to get from X[1]? What about as.vector(X), or as.numeric(X)? All this of course depends on type of object one is speaking of. There are plenty of surprises available, and it's best to use the most logical way of extracting. E.g., to extract the top-left element of a 2D structure (data frame or matrix), use 'X[1,1]'. Luckily, R provides some shortcuts. For example, you can write 'X[2,3]' on a data frame, just as if it was a matrix, even though the underlying structure is completely different. (This doesn't work on a normal list; there you have to type the whole 'X[[2]][3]'.) The behaviour of the 'as.' functions may sometimes be surprising, at least for me. For example, 'as.data.frame' on a named vector gives a single-column data frame, instead of a single-row data frame. 
(I'm not sure what's the recommended way of converting a named vector to row data frame, but 'as.data.frame(t(X))' works, even though both 'X' and 't(X)' looks like a row of numbers.) The point is that a dataframe is a list, and a matrix isn't. If users don't understand that, then they'll be confused somewhere. Making matrices more list-like in one respect will just move the confusion elsewhere. The solution is to understand the difference. My main problem is not understanding the difference, which is easy, but knowing which type of I have when I get the output a function in a package. If I know the object is a named vector or a matrix with column names, it's easy enough to type 'X[,colname]', and if it's a data frame one may use the shortcut 'X$colname'. Usually, it *is* documented what the return value of a function is, but just looking at the output is much faster, and *usually* gives the correct answer. For example, 'mean' applied on a data frame gives a named vector, not a data frame, which
Re: [R] two questions for R beginners
Liviu Andronic wrote:

On Mon, Mar 1, 2010 at 11:49 PM, Liviu Andronic landronim...@gmail.com wrote:

On 3/1/10, Keo Ormsby keo.orms...@gmail.com wrote:

Perhaps my biggest problem was that I couldn't find (and still haven't seen) *absolute beginners* documents.

there was once a link posted on r-sig-teaching that would probably fit your needs, but I cannot find it now.

OK, I found it. Below is an excerpt of that r-sig-teaching e-mail.

Liviu

On Thu, Jul 2, 2009 at 2:19 PM, Robert W. Hayden hay...@mv.mv.com wrote:

I think such a website would be a real asset. It would be most useful if it either were restricted to intro. stats. OR organized so that materials for real beginners were easy to extract from all the materials for programmers and Ph.D. statisticians. As a relative beginner myself, I find the usual resources useless. In self defense, I created materials for my own beginning students: http://courses.statistics.com/software/R/Rhome.htm

Hi Liviu,

This is indeed the best site for introduction I have seen, although it still assumes some things that at first might seem unintuitive to the absolute beginner I talk about. For instance, in the first page, it shows that you can do sqrt(x), where x can be a vector, and get back a vector of the square roots of each number. Although this is high school matrix algebra, most users expect the input to a square root function to be a single number, not a matrix, as in Excel or a calculator. Other concepts that are not explicitly introduced are the R workspace, the use of arguments in functions (with or without the =), etc. Others are things like diff(range(rainfall)), where you have the output of one function used as the input to another, all in the same command line. All these things seem very basic, but can be difficult if you are trying to learn on your own with no prior experience in programming.

I hope I am not sounding too difficult and contrarian; I am just trying to share my experience with starting with R, and in trying to convey this learning to my colleagues and students. In the end, I did find everything I needed to learn, and now I feel at ease with R, and I believe that almost anybody that can use Excel or something like it could learn R.

Thank you for the information,
Best wishes,
Keo.
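The three beginner hurdles named in this message (vectorised functions, named versus positional arguments, and nested calls) can be shown in a few lines. This is only a sketch: the rainfall values are made up, since the original course data are not part of the thread.

```r
# Vectorised functions: sqrt() works element by element on a whole vector.
x <- c(1, 4, 9, 16)
roots <- sqrt(x)                 # one square root per element of x

# Arguments can be matched by name (with =) or by position (without).
by_name     <- round(x = 3.14159, digits = 2)
by_position <- round(3.14159, 2)

# Nested calls: the value returned by the inner function feeds the outer one.
rainfall <- c(12.5, 0, 3.2, 20.1, 7.8)   # hypothetical measurements
r        <- range(rainfall)              # c(minimum, maximum)
spread   <- diff(r)                      # maximum minus minimum
nested   <- diff(range(rainfall))        # the same computation in one line
```

Writing the nested call out in two steps first, as with r and spread, is often the easiest way to read an expression like diff(range(rainfall)) from the inside out.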
Re: [R] two questions for R beginner
On Mar 2, 2010, at 8:01 AM, Paul Hiemstra wrote: Brandon Zicha wrote: Hey Paul, Hey Brandon, (adding R-help in the cc) I agree with you that the documentation of R could be better, especially with more examples in code showing not only the common cases, but also more esoteric cases. It would be great if everyone invested a lot of time to write awesome documentation, but this is not the case. I just objected to the tone (I tought :)) I spotted. Some more comments are inline: Accepting the main point of my post - that the often VERY incomplete help files appended to packages can be a major stumbling block for getting up and running in R - I take your point. I probably went a bit to far with my language there. I would point out though that a great many parts of research (like writing a bibliography - or searching for citations of any kind usually) aren't much fun, but are an important part of research related work. Likewise, complete documentation (by which I hardly mean a paper - looking at STATA help files as a minimum would be a good start) is part of programming. I agree that one needs to employ some level of judgement, otherwise you will get helpfile that says First turn on the computer... then click the 'R' Icon But, I have myself created one or two STATA functions that I have put up for public use - so I know how not fun, but necessary complete documentation is. Further, I didn't say that writing documentation doesn't take time. Everything takes time. My point was that relative to actually creating the application - writing more complete documentation takes very little time. If one invests the time to do the 'fun' stuff of writing a new package for R, it seems reasonable that taking the (proportionately) little time to write a nicer help file would be the most 'professional' thing to do. But, this could be my illusion that all researchers seem themselves as professionals - rather than an anarchic egoistic enclave of independent self-interested paper producers. 
This is what scientists get judged upon, not on how much software they publish and how good their documentation is. Furthermore, it is quite hard for a hardcore R programmer to judge what people find hard about their software. I am notorious for assuming greater standards as an acceptable 'norm' than my community at large :-) Furthermore, you are absolutely right that my standards are apparently even too high for many commercial applications! R help is sometimes downright good! So, if I accept that I am a demanding S.O.B. and tone down my thoughts of proper documentation and professionalism and adopt the (probably more) reasonable perspective you do at the end of "well, this is the world we live in... and come on, it's free", I totally agree that I probably went too far! But, better yet, I think that this observation you make suggests a solution: Perhaps R could use a more integrated and organized open source help system. I can think of a few possibilities - the easiest being a wiki version of R help. This way users could add useful information to help files - such as more examples, tricks, tips, and known problems. This would take advantage of the open source, free, user-community centered aspects of R, and permit those with an interest in helping beginners to post notes for beginners - on the help files. I know that if such a wiki existed I would have posted the example of constrained optimization I did recently. It wouldn't be too difficult to add a function wikihelp(X) that would open the wiki help page rather than the standard help documentation. Currently, help on any given command is scattered all over help fora all about the web. A central, indexed, and easily referenced help system might be a solution. Heck, such a system could go a step further and link R-help listserv archives by command, thus centralizing and integrating the open-source user-built information resource of the listserv into help().
How many e-mails to this listserv begin with 'I just spent a few hours cruising the help forums related to 'X' and couldn't find an answer.' Sounds like a good addition, allowing people to add to the documentation as they see fit. There is ofcourse the R wiki, but this is not widely used and not firmly embedded into R itself. But how would we keep such a system you propose manageable, preventing it from becoming an enormous mess. Maybe some kind of moderation? I note that STATA has all their help files for the latest version of stata available on the web (http://www.stata.com/help.cgi? contents). How difficult would a similar system - only with R, editable and with links to supplementary information - be to set up? I cannot comment on how difficult it would was to set up, but I must disagree that it does not exist for R. The default for RSiteSearch is Jon Baron's search
Re: [R] two questions for R beginners
On Thu, 25 Feb 2010 17:31:19 + Patrick Burns pbu...@pburns.seanet.com wrote:

* What were your biggest misconceptions or stumbling blocks to getting up and running with R?

I didn't have any major stumbling blocks, but even after years of using R I didn't have a clear concept of what exactly a vector, a list and a data frame were, and what the differences and similarities between them were (and stuff like why x[i] returns a different result than x[[i]]).

Some things that have tripped me up are reassigning the value of T or F and getting very strange results afterwards (I now use only TRUE and FALSE). FAQ 7.31 and 7.22 have also been troublesome at times, especially 7.31 when used in 'for' loops.

Also I found it quite confusing that ?ifelse works, but not ?if (you have to type ?"if").

Also, why ?plot didn't give me the information I was looking for but ?plot.default did was rather confusing. I still experience similar problems with other functions. Usually 'methods' helps, but some packages use S4 methods, which makes finding the correct help page quite challenging at times.

* What documents helped you the most in this initial phase?

In the initial phase I found the Rtips (http://pj.freefaculty.org/R/Rtips.html) extremely useful.

For understanding the difference between the various data types in R, Phil Spector's wonderful book 'Data Manipulation with R' was a great help. When reading it I finally understood things I had been wondering about for years. I really like the book. It's short, crystal clear and immensely useful.

Another very useful document of a more advanced nature is the R Inferno. Best read after you've been using R for some time, though.

I'm over the initial phase now, but two resources which continue to be of great help are http://www.rseek.org/ (mainly for searching the mailing list) and the 'sos' package (for finding the functions and packages I need).

'sos' really is great. There have been other packages/functions trying to do the same thing, but they have been too time-consuming and difficult to use (and learn), typically requiring you to first do a search, and then do some advanced subsetting to get useful results. This is similar to older search engines requiring many boolean terms to give the needed search results. With 'sos' I just choose some simple search terms describing what I'm looking for, and immediately get relevant results. 'sos' really is the Google of the R world. It has made a great impact on the discoverability of the various R functions and packages.

Lastly, the 'demo' function is seldom mentioned, and easy to overlook, but gives a nice (and sometimes impressive) overview of what type of graphics it is possible to create with a given package. I wish more packages had well-written demos. Also, I think some of the examples from the 'example' sections of help pages for functions could very well be copied to the demo of the corresponding package, e.g. a few of the examples of the 'xyplot' function in 'lattice'.

-- Karl Ove Hufthammer
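Three of the stumbling blocks listed in this message (x[i] versus x[[i]], reassigning T, and FAQ 7.31 on floating-point comparison) are easy to reproduce. The following is a small sketch with invented example objects, not code from the thread.

```r
# x[i] versus x[[i]] on a list: [ keeps the container, [[ returns the content.
x <- list(a = 1:3, b = "text")
sub_list <- x[1]       # still a list, of length 1
element  <- x[[1]]     # the integer vector stored in component 'a'
class(sub_list)
class(element)

# T and F are ordinary variables that can be overwritten; TRUE and FALSE are
# reserved words. local() keeps the reassignment from leaking into the session.
shadowed <- local({
  T <- 0
  isTRUE(T)            # T is now the number 0, so this is not TRUE
})

# FAQ 7.31: most decimal fractions have no exact binary representation.
exact_compare  <- (0.1 + 0.2 == 0.3)                  # exact comparison
tolerant_equal <- isTRUE(all.equal(0.1 + 0.2, 0.3))   # comparison with tolerance

# Help for reserved words needs quoting, e.g. ?"if" or help("if").
```

In a 'for' loop over a sequence such as seq(0, 1, by = 0.1), the same rounding issue is why testing the loop variable with == against a decimal constant can silently fail; all.equal() wrapped in isTRUE() is the usual workaround.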
Re: [R] two questions for R beginners
On Fri, 26 Feb 2010 11:56:10 -0800 (PST) Jack Siegrist jack...@eden.rutgers.edu wrote:

> What I think would be very helpful is an introduction to programming using R

Here you are:

A First Course in Statistical Programming with R
http://www.cambridge.org/uk/catalogue/catalogue.asp?isbn=9780521694247

--
Karl Ove Hufthammer
Re: [R] two questions for R beginners
On Mon, 1 Mar 2010 11:02:59 +0100 Karl Ove Hufthammer k...@huftis.org wrote:

>> * What were your biggest misconceptions or stumbling blocks to getting up and running with R?
>
> Also I found it quite confusing that

One more thing that still trips me up sometimes: '$' works on data frames but not on matrices (with dimnames/colnames). Even though the two objects *look* exactly the same, '$' on one of them works while '$' on the other gives a *very* confusing error message. Example:

d=head(iris[1:4])
d2=as.matrix(d)
d
d2
d$Sepal.Width
d2$Sepal.Width

Some functions output matrices where you would expect them to output data frames, and then this problem occurs. (Is there a reason why '$' could/should not be made to 'work' on matrices too?)

--
Karl Ove Hufthammer
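For readers who hit the same error, here is a hedged sketch of two common workarounds when a function returns a matrix where a data frame was expected. It reuses the `d` and `d2` objects from the example above; the names `w1` and `w2` are introduced here only for illustration.

```r
d  <- head(iris[1:4])   # a data frame
d2 <- as.matrix(d)      # a numeric matrix with column names

# Option 1: index the matrix by column name instead of using '$'.
w1 <- d2[, "Sepal.Width"]

# Option 2: convert back to a data frame, after which '$' works again.
w2 <- as.data.frame(d2)$Sepal.Width

# Both give the same values as the data frame column.
all.equal(as.numeric(w1), d$Sepal.Width)
all.equal(w2, d$Sepal.Width)
```

Indexing with `[, "name"]` works on both matrices and data frames, so it is the safer choice in code that may receive either.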
Re: [R] two questions for R beginners
Hi

r-help-boun...@r-project.org wrote on 01.03.2010 11:26:40:

> On Mon, 1 Mar 2010 11:02:59 +0100 Karl Ove Hufthammer k...@huftis.org wrote:
>>> * What were your biggest misconceptions or stumbling blocks to getting up and running with R?
>>
>> Also I found it quite confusing that
>
> One more thing that still trips me up sometimes: '$' works on data frames but not on matrices (with dimnames/colnames). Even though the two objects *look* exactly the same, '$' on one of them works while '$' on the other gives a *very* confusing error message. Example:
>
> d=head(iris[1:4])
> d2=as.matrix(d)
> d
> d2
> d$Sepal.Width
> d2$Sepal.Width
>
> Some functions output matrices where you would expect them to output data frames, and then this problem occurs. (Is there a reason why '$' could/should not be made to 'work' on matrices too?)

I understand that a two-dimensional rectangular matrix looks quite similar to a data frame; however, it is only a vector with dimensions. As such, it can have items of only one type (numeric, character, ...). And you can easily change the dimensions of a matrix.

matrix <- 1:12
dim(matrix) <- c(2,6)
matrix
dim(matrix) <- c(2,2,3)
matrix
dim(matrix) <- NULL
matrix

So the rectangular structure of a printed matrix is a kind of coincidence only, whereas the rectangular structure of a data frame is its main feature.

Regards
Petr

> --
> Karl Ove Hufthammer
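The "only one type" point can be seen by converting a data frame with mixed column types. The following is a small sketch using the built-in `iris` data; the variable name `m` is chosen here for illustration and does not appear in the message.

```r
# head(iris) has four numeric columns and one factor column (Species).
sapply(head(iris), class)

# A matrix can hold only one type, so the numbers are converted to strings.
m <- as.matrix(head(iris))
typeof(m)   # "character"
m[1, ]      # every entry in the first row is now a character string
```

This is one reason a matrix cannot simply stand in for a data frame: the per-column types are lost on conversion.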
Re: [R] two questions for R beginners
Karl Ove Hufthammer wrote:
> On Fri, 26 Feb 2010 11:56:10 -0800 (PST) Jack Siegrist jack...@eden.rutgers.edu wrote:
>> What I think would be very helpful is an introduction to programming using R
>
> Here you are:
>
> A First Course in Statistical Programming with R
> http://www.cambridge.org/uk/catalogue/catalogue.asp?isbn=9780521694247

Jack also asked for it to be a big thick college textbook that takes at least a semester to go through, which should be a prerequisite for going through the Introduction to R available on CRAN. That book (of which I am an author) is not big or thick. But it is aimed at an audience who don't have programming experience.

Duncan Murdoch
Re: [R] two questions for R beginners
Karl Ove Hufthammer wrote:
> On Mon, 1 Mar 2010 11:02:59 +0100 Karl Ove Hufthammer k...@huftis.org wrote:
>>> * What were your biggest misconceptions or stumbling blocks to getting up and running with R?
>>
>> Also I found it quite confusing that
>
> One more thing that still trips me up sometimes: '$' works on data frames but not on matrices (with dimnames/colnames). Even though the two objects *look* exactly the same, '$' on one of them works while '$' on the other gives a *very* confusing error message. Example:
>
> d=head(iris[1:4])
> d2=as.matrix(d)
> d
> d2
> d$Sepal.Width
> d2$Sepal.Width
>
> Some functions output matrices where you would expect them to output data frames, and then this problem occurs. (Is there a reason why '$' could/should not be made to 'work' on matrices too?)

The reason for the difference is that data.frames are lists organized into columns (so the $ handling comes from the list, where it means "extract the component"), whereas a matrix is a single vector displayed in columns.

Of course, the problem is that a beginner only knows that they both look the same. But I think the idea of a list is so fundamental to R that it needs to be something learned pretty early, so I'd rather not blur the distinction between dataframes and matrices.

Duncan Murdoch
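The list nature of a data frame can be checked directly. This is a minimal sketch reusing the `d` and `d2` objects from Karl's example; nothing here goes beyond base R.

```r
d  <- head(iris[1:4])
d2 <- as.matrix(d)

is.list(d)    # TRUE: a data frame is a list of columns
is.list(d2)   # FALSE: a matrix is one atomic vector with a dim attribute

length(d)     # 4, the number of components (columns)
length(d2)    # 24, the number of elements (6 rows x 4 columns)

# '$' and '[[' with a name are both list extractors and agree here.
identical(d$Sepal.Width, d[["Sepal.Width"]])

# List tools such as lapply() therefore walk over the columns.
lapply(d, mean)
```

The differing results of `length()` are a quick way to tell which of the two look-alike objects one is holding.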
Re: [R] two questions for R beginners
On 01-Mar-10 11:09:51, Petr PIKAL wrote:
> Hi
>
> r-help-boun...@r-project.org wrote on 01.03.2010 11:26:40:
>
>> On Mon, 1 Mar 2010 11:02:59 +0100 Karl Ove Hufthammer k...@huftis.org wrote:
>>>> * What were your biggest misconceptions or stumbling blocks to getting up and running with R?
>>>
>>> Also I found it quite confusing that
>>
>> One more thing that still trips me up sometimes: '$' works on data frames but not on matrices (with dimnames/colnames). Even though the two objects *look* exactly the same, '$' on one of them works while '$' on the other gives a *very* confusing error message. Example:
>>
>> d=head(iris[1:4])
>> d2=as.matrix(d)
>> d
>> d2
>> d$Sepal.Width
>> d2$Sepal.Width
>>
>> Some functions output matrices where you would expect them to output data frames, and then this problem occurs. (Is there a reason why '$' could/should not be made to 'work' on matrices too?)
>
> I understand that a two-dimensional rectangular matrix looks quite similar to a data frame; however, it is only a vector with dimensions. As such, it can have items of only one type (numeric, character, ...). And you can easily change the dimensions of a matrix.
>
> matrix <- 1:12
> dim(matrix) <- c(2,6)
> matrix
> dim(matrix) <- c(2,2,3)
> matrix
> dim(matrix) <- NULL
> matrix
>
> So the rectangular structure of a printed matrix is a kind of coincidence only, whereas the rectangular structure of a data frame is its main feature.
>
> Regards
> Petr
>
>> --
>> Karl Ove Hufthammer

Petr, I think that could be confusing! The way I see it is that a matrix is a special case of an array, whose dimension attribute is of length 2 (number of rows, number of columns); and "row" and "column" refer to the rectangular display which you see when R prints the matrix. And this, of course, derives directly from the historic rectangular view of a matrix when written down.

When you went from dim(matrix)<-c(2,6) to dim(matrix)<-c(2,2,3) you stripped it of its special title of "matrix" and cast it out into the motley mob of arrays (some of whom are matrices, but 'matrix' no longer is).
So the rectangular structure of a printed matrix is not a coincidence, but is its main feature!

To come back to Karl's query about why $ works for a dataframe but not for a matrix, note that $ is the extractor for getting a named component of a list. So, Karl, when you did

d=head(iris[1:4])

you created a dataframe:

str(d)
# 'data.frame': 6 obs. of 4 variables:
#  $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4
#  $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9
#  $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7
#  $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4

(with named components Sepal.Length, ... , Petal.Width), and a dataframe is a special case of a general list. In a general list, the separate components can each be anything. In a dataframe, each component is a vector; the different vectors may be of different types (logical, numeric, ... ) but of course the elements of any single vector must be of the same type; and, in a dataframe, all the vectors must have the same length (otherwise it is a general list, not a dataframe).

So, when you print a dataframe, R chooses to display it as a rectangular structure. On the other hand, when you print a general list, R displays it quite differently:

d
#   Sepal.Length Sepal.Width Petal.Length Petal.Width
# 1          5.1         3.5          1.4         0.2
# 2          4.9         3.0          1.4         0.2
# 3          4.7         3.2          1.3         0.2
# 4          4.6         3.1          1.5         0.2
# 5          5.0         3.6          1.4         0.2
# 6          5.4         3.9          1.7         0.4

d3 <- list(C1=c(1.1,1.2,1.3), C2=c(2.1,2.2,2.3,2.4))
d3
# $C1
# [1] 1.1 1.2 1.3
# $C2
# [1] 2.1 2.2 2.3 2.4

Notice the similarity (though not identity) between the print of d3 and the output of str(d). There is a bit more hard-wired stuff built into a dataframe (which makes it more than simply a list with all components vectors of equal length). However, one could also say that the rectangular structure is its main feature.
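The equal-length requirement described above can be checked with the same two vectors used for d3. This is a hedged sketch: the names `res` and `ok` are introduced here for illustration, and the exact wording of the error message may vary between R versions and locales.

```r
C1 <- c(1.1, 1.2, 1.3)
C2 <- c(2.1, 2.2, 2.3, 2.4)

# A general list accepts components of different lengths.
d3 <- list(C1 = C1, C2 = C2)

# data.frame() refuses them; tryCatch() captures the error message as text.
res <- tryCatch(data.frame(C1 = C1, C2 = C2),
                error = function(e) conditionMessage(e))
res

# With equal lengths the same components form a data frame.
ok <- data.frame(C1 = C1, C2 = C2[1:3])

# The extra "hard-wired stuff" shows up as attributes on top of the list.
names(attributes(ok))
```

The `class` and `row.names` attributes are what distinguish `ok` from a plain list with the same components.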
As to why $ will not work on matrices: a matrix, as Petr points out, is a vector with a dimensions attribute which has length 2 (as opposed to a general array where the length of the dimensions attribute could be anything). Hence it is not a list of named components in the sense of "list". Hence $ will not work with a matrix, since $ will not be able to find any list-components, which is basically what the error message

d2$Sepal.Width
# Error in d2$Sepal.Width : $ operator is invalid for atomic vectors

is telling you: d2 is an atomic vector with a length-2 dimensions attribute. It has no list-type components for $ to get its hands on.

Ted.

E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 01-Mar-10 Time: 12:03:21
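The reading of the error message given above can be confirmed in a session. This is a small sketch that rebuilds `d2` from Karl's example; `msg` is a name introduced here, and the message text depends on the R version and locale.

```r
d2 <- as.matrix(head(iris[1:4]))

is.atomic(d2)     # TRUE: the matrix is an atomic vector
dim(d2)           # the length-2 dimensions attribute: 6 rows, 4 columns
is.matrix(d2)     # TRUE, because length(dim(d2)) is 2

# Capture the error from '$' instead of stopping the script.
msg <- tryCatch(d2$Sepal.Width,
                error = function(e) conditionMessage(e))
msg
```

Wrapping the call in `tryCatch()` is just a convenience for scripts; at the console the same error is printed directly.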