[R] The function predict
Good morning!

Could you help me? I need to understand the function predict(): the algorithm it implements and the calculations involved. Where can I find this information?

Thank you!

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] tree() producing NA's
Hi,

Hoping someone can help me (a newbie). I am trying to construct a tree using tree() in package tree. One of the fields is a factor (owner) with many levels. In the resulting tree I see many NAs (see below), yet there are none in the actual data.

    rr200.tr <- tree(backprof ~ ., rr200)
    rr200.tr
    1) root 200 1826.00 -0.2332
    ... [snip] ...
    5) owner: Cliveden Stud,NA,NA,NA,NA,NA,NA,NA,NA 10 14.25 1.5870 *
    3) owner: B E T Partnership,Flaming Sambuca Syndicate,NA,NA,NA,NA,NA,NA,NA,NA 11 384.40 10.5900
    6) decodds < 12 5 74.80 6.3000 *
    7) decodds > 12 6 140.80 14.1700 *

Can anyone tell me why this happens and what I can do about it?

Regards,
Amnon
[R] Dendrogram for agglomerative hierarchical clustering result
Hey group,

I have a problem drawing a dendrogram from the result of my program written in C. My algorithm is an approximation algorithm for the single linkage method. As a result I get the following data:

    [Average distance] [cluster A] [cluster B]

For example:

    42.593141   1   26
    42.593141   4    6
    42.593141 123  124
    42.593141   4  113
    74.244206   1  123
    74.244206   4  133
    74.244206   1   36

So far I have used C to generate bitmap output, but I would like to use the computed result as input for R just to draw the dendrogram. As I'm new to R, any help is appreciated.

Thanks,
Risto
[R] image quality
Dear all,

I am writing Sweave documentation for my analysis, and I am plotting huge scatter plots of microarray data. Unluckily, this takes a lot of resources on my PC because the quality of the image is too high (the PC gets stuck on every single spot). How can I overcome this problem? Is there a way to produce a lighter image?

john
[R] Conditional rows
Hi,

Given a simple example,

    test <- matrix(c(0.1, 0.2, 0.1, 0.2, 0.1, 0.1, 0.3, 0.1, 0.1), 3, 3)

how can I generate the indexes of the rows whose values are all less than or equal to 0.2? For this example, rows 2 and 3 are the correct ones.

Thanks
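One way to approach the question above, as a minimal sketch: it assumes "row values less than or equal to 0.2" means that every value in the row satisfies the condition. The name `rows` is introduced here only for illustration.

```r
# Example matrix from the post (filled column by column)
test <- matrix(c(0.1, 0.2, 0.1, 0.2, 0.1, 0.1, 0.3, 0.1, 0.1), 3, 3)

# test <= 0.2 is a logical matrix; apply(..., 1, all) checks each row,
# and which() turns the TRUE entries into row indexes
rows <- which(apply(test <= 0.2, 1, all))
rows
```

If "any value in the row" is what is wanted instead, replacing `all` with `any` gives that variant.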
Re: [R] how to generate a column based on other columns in a data frame
Try this:

    sites <- unique(x[, c("X", "Y")])                      # one row per distinct coordinate pair
    sites$Site <- sprintf("S%d", seq_len(nrow(sites)))     # label them S1, S2, ...
    x2 <- merge(x, sites, by = c("X", "Y"))                # attach the label to every sample
    x2[order(x2$Site), ]

On 11/02/2008, Weidong Gu [EMAIL PROTECTED] wrote:
> Hi,
>
> I am working on a data set with multiple collections of mosquitoes at sampling sites. Each row represents a collection of individual samples with coordinates for each collection.
>
>     ...      X       Y  ...
>      1  36.435  30.118
>      2  36.435  30.118
>      3  36.435  30.118
>      4  35.329  29.657
>      5  35.329  29.657
>      6  36.431  30.111
>      7  36.431  30.111
>      8  35.421  29.797
>      9  35.421  29.797
>     10  35.421  29.797
>
> Unfortunately, there is no 'site' entry. I would like to add a 'site' column based on the coordinates of the samples, so that samples from the same site have the same site ID (S1, S2, ...). How can I do this the R way?
>
> Thanks.
>
> Weidong Gu, Department of Medicine
> University of Alabama, Birmingham
> 1900 University Blvd., Birmingham, Alabama 35294
> Email: [EMAIL PROTECTED]
> PH: (205)-975-9053

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
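For comparison, here is a small self-contained sketch of the same idea that keeps the original row order and avoids merge(). It is an alternative, not the method from the reply: it builds a text key from the coordinates and numbers the sites in order of first appearance. The data frame below reproduces only the X and Y columns shown in the question; `key` is a name introduced for illustration.

```r
# Coordinates from the question (other columns omitted)
x <- data.frame(
  X = c(36.435, 36.435, 36.435, 35.329, 35.329, 36.431, 36.431, 35.421, 35.421, 35.421),
  Y = c(30.118, 30.118, 30.118, 29.657, 29.657, 30.111, 30.111, 29.797, 29.797, 29.797)
)

# One key per row; rows with identical coordinates get identical keys
key <- paste(x$X, x$Y)

# match() gives the position of each key among the distinct keys,
# i.e. a site number in order of first appearance
x$Site <- sprintf("S%d", match(key, unique(key)))
x
```

This relies on coordinates from the same site being recorded identically; coordinates that differ by rounding would be treated as different sites.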
[R] gcc 4.3 any known issues?
Hi,

Fedora is switching to gcc 4.3 for Fedora 9. Before I test it (rawhide) I want to be sure that R runs. So my question is whether there have been issues compiling R and packages with gcc 4.3?

Stefan
Re: [R] Tinn-R not working well with latest R
Pending a solution to the problem, I use Tinn-R as follows:

1) I make none of the recommended additions to the Rprofile.site file.
2) I start Tinn-R from the desktop and then load an R file from my working directory.
3) I then start R from the "R | start preferred Rgui" menu in Tinn-R. This has the effect of starting R in the working directory, and any saved data there will be loaded automatically.

I may be missing some functionality, but I like the way it works. As far as I can see this is not documented.

I do occasionally have a problem which I cannot replicate at will. If I have the cursor in the middle of a line containing an R command, anything I try to insert at the cursor is inserted one character per line above the line that I am amending. For example, if I try to change

    x <- a + b + c + d + e

to

    x <- a + b123 + c + d + e

Tinn-R displays

    1
    2
    3
    x <- a + b + c + d + e

with the cursor staying after the b. To recover, close Tinn-R and restart, and the problem vanishes.

Any suggestions?

Best Regards
John

On 11/02/2008, Farrel Buchinsky [EMAIL PROTECTED] wrote:
> I recently installed R 2.6.2 and am getting errors on startup that relate to svIDE being loaded by Tinn-R.
>
>     Loading required package: tcltk
>     Loading Tcl/Tk interface ... done
>     Warning messages:
>     1: '\A' is an unrecognized escape in a character string
>     2: unrecognized escape removed from ";for Options\AutoIndent: 0=Off, 1=follow language scoping and 2=copy from previous line\n"
>     3: In grep(paste("[{]TclEval ", topic, "[}]", sep = ""), tclvalue(.Tcl("dde services TclEval {}")), :
>       argument 'useBytes = TRUE' will be ignored
>     Loading required package: svMisc
>     Loading required package: R2HTML
>
> Any idea what is going on? I use R 2.6.2 on Windows XP.
>
> I also started R without the profile that Tinn-R made. If I manually enter library(svIDE) then I get:
>
>     library(svIDE)
>     Warning messages:
>     1: '\A' is an unrecognized escape in a character string
>     2: unrecognized escape removed from ";for Options\AutoIndent: 0=Off, 1=follow language scoping and 2=copy from previous line\n"
>
> So the underlying problem may be svIDE; see:
> http://tolstoy.newcastle.edu.au/R/e2/help/07/04/15738.html
>
> Apparently, because of this error, several great features in Tinn-R are not working properly. Any solutions or workarounds?
>
> --
> Farrel

--
John C Frain
Trinity College Dublin
Dublin 2
Ireland
www.tcd.ie/Economics/staff/frainj/home.html
mailto:[EMAIL PROTECTED]
mailto:[EMAIL PROTECTED]
Re: [R] Subsetting a data.frame degenerates at one column?
jim holtman [EMAIL PROTECTED] writes:
> try:
>
>     input[, targets, drop = FALSE]
>
> see:
>
>     ?"["
>
> for an explanation.

Thanks to all who responded; this was exactly what I needed, and a good pointer to the part of the FM I was missing.

To unpack (and demonstrate some comprehension gained.. ;) the subsetting operations on data frames, by default, use the most basic data type capable of representing the answer, so selecting a single column with input[, targets] returns a plain vector rather than a data frame.

Either the drop=FALSE or the input[targets] solution gives me the result I had in mind. I mildly prefer the [targets] form from a visual perspective.

- Allen S. Rout
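The behaviour discussed in this thread can be seen in a small sketch. The data frame and the value of `targets` below are made up for illustration; only the three indexing forms come from the thread.

```r
# Hypothetical data frame with a single target column selected by name
input <- data.frame(a = 1:3, b = c("x", "y", "z"), c = c(2.5, 3.5, 4.5))
targets <- "a"

as_vector <- input[, targets]                  # matrix-style indexing drops to a vector
as_df_1   <- input[, targets, drop = FALSE]    # drop = FALSE keeps the data frame
as_df_2   <- input[targets]                    # list-style indexing also keeps it

class(as_vector)
class(as_df_1)
class(as_df_2)
```

With two or more names in `targets`, all three forms return a data frame, which is why the issue only shows up when the selection degenerates to one column.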
Re: [R] Tinn-R not working well with latest R
Corinna,

Thanks for the suggestion. I cannot duplicate the error myself. I generally have a code segment open in Tinn-R, have sent it to R, and wish to rerun it with some changes; but when I try to make the changes they are transferred to the line above, one character per line. This has happened about five or six times since Christmas. The code segments were different, and I can think of no common factor that might have caused the problem. Closing Tinn-R and R and restarting always cured the problem, which then did not occur again for several days, and then in different circumstances. I have not looked for help because I have been unable to replicate the problem.

John Frain

On 11/02/2008, Schmitt, Corinna [EMAIL PROTECTED] wrote:
> Hello,
>
> I had the same problems before. I think the best solution is to copy the code you need out of Tinn-R with Ctrl+C, then open R directly from your desktop, NOT from Tinn-R, and paste in the command. As long as you have not pressed Enter, you can still change the command using the arrow keys: put the cursor where you want in the command line and edit it.
>
> Hope that is what you want. I cannot reproduce your example.
>
> Corinna

--
John C Frain
Trinity College Dublin
Dublin 2
Ireland
www.tcd.ie/Economics/staff/frainj/home.html
mailto:[EMAIL PROTECTED]
mailto:[EMAIL PROTECTED]
[R] WG: Tinn-R not working well with latest R
I am in the R command window and just press Ctrl+V.

Corinna

-----Original Message-----
From: Farrel Buchinsky [mailto:[EMAIL PROTECTED]
Sent: Mon 11.02.2008 21:16
To: Schmitt, Corinna
Subject: Re: Tinn-R not working well with latest R

I can easily get R to open without an error. I simply removed the Tinn-R related lines from the Rprofile.site file

    C:\Program Files\R-2.6.2\etc\Rprofile.site

but then when I try to manually load the svIDE library by entering library(svIDE) from the command line, I get a similar error. So when you say "Then paste in the command", what command are you referring to? What do you change it to?

Schmitt, Corinna [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]
> Hello,
>
> I had the same problems before. I think the best solution is to copy the code you need out of Tinn-R with Ctrl+C, then open R directly from your desktop, NOT from Tinn-R, and paste in the command. As long as you have not pressed Enter, you can still change the command using the arrow keys: put the cursor where you want in the command line and edit it.
>
> Hope that is what you want. I cannot reproduce your example.
>
> Corinna
Re: [R] Histogram in Lattice with 3 factors
On 2/11/08, willem vervoort [EMAIL PROTECTED] wrote:
> Dear R-help list,
>
> I am trying to construct a lattice histogram using 3 factors. My data frame looks like this (simulating a water balance over groundwater with different salinities):
>
>     s     days  model  EC  EC_max
>     0.4   1     A      10  9
>     0.42  2     A      10  9
>     0.44  3     A      10  9
>     :     :     :      :   :
>     0.4   1     B      10  9
>     :     :     :      :   :
>     0.4   1     A      30  9
>     :     :     :      :   :
>     0.4   1     A      30  36
>
> Anyway, you get the gist: EC_max has two levels (9 and 36), EC has three levels (10, 30 and 70), and model has two levels (A and B). There are, say, 365 days, and s is the variable of interest (soil saturation).
>
> It can maybe be reproduced with:
>
>     data <- data.frame(s = rnorm(2*3*365*2),
>                        days = rep(1:365, 12),
>                        model = sort(rep(c("A", "B"), 6*365)),
>                        EC = rep(sort(rep(c(10, 30, 70), 365*2)), 2),
>                        EC_max = rep(sort(rep(c(9, 36), 3*365)), 2))
>
> I would like to plot histograms with the three factors using lattice, so I had the following code:
>
>     my.strip <- function(which.given, ..., factor.levels) {
>         levs <- if (which.given == 1) c("Model A", "Model B")
>                 else if (which.given == 2) paste("EC = ", as.character(EC), "dS/m")
>                 else paste("ECmax = ", as.character(EC_max), "dS/m")
>         strip.default(which.given, ..., factor.levels = levs)
>     }
>
>     histogram(~ s | model * as.factor(EC) * as.factor(EC_max), data = Store,
>               xlab = "soil saturation", type = "density", strip = my.strip)
>
> But I am doing something wrong, because it plots the histogram for factor level EC_max = 9 first and then, straight over it, the histogram for factor level 36, so there are only 6 panels on the graph rather than 12.
>
> I searched the archives, but no luck so far.

Look up the 'layout' argument in ?xyplot. By default, for 2 or more conditioning variables, the levels of the first two define columns and rows, and the rest are spread out over multiple pages. In your example, you could try

    layout = c(6, 2)

for starters.

-Deepayan
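To illustrate the suggestion, here is a minimal sketch with simulated data. The data frame `d`, its size, and the random values are assumptions made for the example (a fully crossed 2 x 3 x 2 design), not the poster's data; only the conditioning formula and layout = c(6, 2) come from the thread.

```r
library(lattice)

set.seed(1)
# Hypothetical fully crossed design: 2 models x 3 EC levels x 2 EC_max levels
d <- expand.grid(day = 1:50,
                 model = c("A", "B"),
                 EC = c(10, 30, 70),
                 EC_max = c(9, 36))
d$s <- runif(nrow(d))   # stand-in for soil saturation

# layout = c(6, 2): 6 columns and 2 rows on one page, so all 12 panels
# are shown together instead of being split over two pages
p <- histogram(~ s | model * factor(EC) * factor(EC_max), data = d,
               layout = c(6, 2), xlab = "soil saturation", type = "density")
print(p)
```

Without the layout argument, the first two conditioning variables would fill a 2 x 3 grid and the third would start a new page, which on a screen device looks like one set of panels being drawn over the other.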
[R] Histogram in Lattice with 3 factors
Dear R-help list,

I am trying to construct a lattice histogram using 3 factors. My data frame looks like this (simulating a water balance over groundwater with different salinities):

    s     days  model  EC  EC_max
    0.4   1     A      10  9
    0.42  2     A      10  9
    0.44  3     A      10  9
    :     :     :      :   :
    0.4   1     B      10  9
    :     :     :      :   :
    0.4   1     A      30  9
    :     :     :      :   :
    0.4   1     A      30  36

Anyway, you get the gist: EC_max has two levels (9 and 36), EC has three levels (10, 30 and 70), and model has two levels (A and B). There are, say, 365 days, and s is the variable of interest (soil saturation).

It can maybe be reproduced with:

    data <- data.frame(s = rnorm(2*3*365*2),
                       days = rep(1:365, 12),
                       model = sort(rep(c("A", "B"), 6*365)),
                       EC = rep(sort(rep(c(10, 30, 70), 365*2)), 2),
                       EC_max = rep(sort(rep(c(9, 36), 3*365)), 2))

I would like to plot histograms with the three factors using lattice, so I had the following code:

    my.strip <- function(which.given, ..., factor.levels) {
        levs <- if (which.given == 1) c("Model A", "Model B")
                else if (which.given == 2) paste("EC = ", as.character(EC), "dS/m")
                else paste("ECmax = ", as.character(EC_max), "dS/m")
        strip.default(which.given, ..., factor.levels = levs)
    }

    histogram(~ s | model * as.factor(EC) * as.factor(EC_max), data = Store,
              xlab = "soil saturation", type = "density", strip = my.strip)

But I am doing something wrong, because it plots the histogram for factor level EC_max = 9 first and then, straight over it, the histogram for factor level 36, so there are only 6 panels on the graph rather than 12.

I searched the archives, but no luck so far. Any help is appreciated.

Willem

    platform       i386-pc-mingw32
    arch           i386
    os             mingw32
    system         i386, mingw32
    status
    major          2
    minor          6.1
    year           2007
    month          11
    day            26
    svn rev        43537
    language       R
    version.string R version 2.6.1 (2007-11-26)
Re: [R] [OT] good reference for mixed models and EM algorithm
Erin,

As well as P & B, can I recommend

    McCulloch CE, Searle SR (2000), Generalized, Linear, and Mixed Models, Wiley.

I also found

    Data Analysis Using Regression and Multilevel/Hierarchical Models, by Andrew Gelman and Jennifer Hill. Cambridge; New York: Cambridge University Press, 2007.

useful, although it takes a Bayesian rather than an EM approach.

Cheers,
Murray

--
Dr Murray Jorgensen      http://www.stats.waikato.ac.nz/Staff/maj.html
Department of Statistics, University of Waikato, Hamilton, New Zealand
Email: [EMAIL PROTECTED]      Fax 7 838 4155
Phone +64 7 838 4773 wk    Home +64 7 825 0441    Mobile 021 1395 862
Re: [R] R on Mac PRO does anyone have experience with R on such a platform ?
JiHO,

In case you are not following TextMate's mailing list, you might want to check out Hans-Jorg Bibiko's work on Rdaemon:

http://article.gmane.org/gmane.editors.textmate.general/24195/

It provides a lot of the terminal functionality within a TextMate window, uses X11 for the plots, and opens help files either in a browser or in a TextMate HTML window. It essentially runs an R process in the background and communicates with it, so I'm not sure it would allow you to run R on a remote server. But I think it is worth checking out otherwise.

Currently you have to install the bundles from the above link, but I'm hoping we'll soon be able to commit these bundles to TextMate's bundle repository. Anyone who is interested in trying it out and runs into problems can email TextMate's mailing list (http://macromates.com/community), which both Hans-Jorg and I follow closely.

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College

PS: Yes, it is the best $40 I've ever spent, by far.

On Feb 11, 2008, at 2:08 PM, jiho wrote:
> On 2008-February-11, at 19:14, Roger Day wrote:
> > My experience with R.app on a MacBook has been mostly very positive. I like the interface much better than that of Windows, with a few exceptions:
> > a) I step through code with control-R. It's not as convenient on the Mac: the code you want to run has to be actually selected; it is not enough just to be on the line you want. That slows down code-stepping.
> > b) saveHistory() doesn't save the history of the current session. Beware, I lost some work that way. You have to actually click a button.
> > c) No resizing of graphs post hoc.
> > d) Saving graphics to a file is inconvenient except for pdf output.
> >
> > Some pluses are:
> > a) a better built-in editor (if you're not using ESS), including delimiter matching,
> > b) the history pane is nice,
> > c) the package installer and manager are nicer than on Windows,
> > d) autocompletion with ctrl-period,
> > e) you can select text on the current or a past command line much more easily,
> > f) an attractive interface with lots of cosmetic options.
> >
> > I've done some tkrplot work in both (using X11 in OS X); some inconsistencies in the placement of widgets show up.
> >
> > This is off the top of my head. Check out the mailing list R-sig-mac for more info.
>
> After using R via R.app (which is indeed very nice to start with), I eventually switched to a combination of TextMate + Terminal + CarbonEL:
>
> - TextMate[1] is a very powerful editor, well worth the $40 price tag, and has nice goodies for R besides syntax highlighting, such as command autocompletion, command templates, plenty of snippets, etc.
>
> - I run R in a regular Terminal window. This way I get command line editing and searching through history. In addition, it makes it as easy to run R on a remote server as on my local machine (useful for running demanding tasks on a large CPU). I can send code from TextMate to the terminal prompt using AppleScript commands in TextMate[2]. This makes it possible to send the selected text _or_ the current line directly to the Terminal with just a keystroke.
>
> - CarbonEL is a package which makes it possible to plot to a quartz window even from a simple Terminal (quartz is Mac OS X's graphics engine). The plots on quartz look gorgeous, and going back to X11 would have been a pain. Another similar solution would be to use the Cairo package.
>
> All in all, I found it a very convenient and flexible way to use R. It has the added bonus that the same combination (TM + Terminal) works for anything that can run in a terminal window (MATLAB, Scilab, Python, etc.). So even if you don't use only R, you can keep the same habits with a nice editor.
>
> I haven't tried Emacs+ESS. I've heard a lot of good things about it, but learning Emacs is a task in itself.
>
> [1] http://macromates.com/
> [2] a modification of those at http://jo.irisson.free.fr/?p=32 for the built-in Terminal, since Terminal on Leopard finally has tabs
>
> JiHO
> ---
> http://jo.irisson.free.fr/
Re: [R] barplot or maybe related?
On Mon, 2008-02-11 at 09:31 -0800, questions? wrote:
> I have two distributions, represented by the heights of several intervals. E.g., each distribution is partitioned into 10 segments, and I have numbers (frequencies or counts) associated with each region, in a format such as:
>
>     0.2 0.3 0.1 0.1 ... 0.01 0.02
>
> I want to plot the two distributions side by side, meaning that for each region the two bars (in the barplot) from the two distributions are adjacent to each other. If you do barplot(beside=T), the two distributions are plotted side by side, not interleaved. I was wondering whether there is a way to do what I want.

Compare:

    mat1 <- matrix(c(1:10), nrow=5, ncol=2)
    mat2 <- t(mat1)   # Transpose mat1
    barplot(mat1, beside=TRUE)
    barplot(mat2, beside=TRUE)

--
http://mutualism.williams.edu
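The point of the comparison above is that barplot(beside = TRUE) groups bars by column, so the two distributions need to be the rows of the matrix. A minimal sketch with made-up heights (the values and the names `dist1`, `dist2` and `heights` are illustrative, not from the original post):

```r
# Two hypothetical distributions over the same five segments
dist1 <- c(0.2, 0.3, 0.1, 0.1, 0.3)
dist2 <- c(0.1, 0.2, 0.4, 0.2, 0.1)

# One row per distribution, one column per segment
heights <- rbind(dist1, dist2)
colnames(heights) <- paste0("seg", 1:5)

# Each column becomes a group of two adjacent bars (dist1 next to dist2);
# barplot() returns the bar midpoints, one row per distribution
mids <- barplot(heights, beside = TRUE, legend.text = TRUE)
```

If the data start out with one column per distribution, transposing with t() before plotting gives the same interleaved layout.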
Re: [R] Logistic regression with repeated measures
Steven Vamosi <smvamosi at gmail.com> writes:
> In a nutshell, the experiment involved presenting females from two groups (treatment, control) with an opportunity to mate with a virgin male every 6 hours for 48 hours. Every female was presented this opportunity at every time step (i.e., whether or not she mated at 6 hr, she was again presented with a male at 12 hr, and so on).
> ...
>
>     female  group    mass  time  mate
>     1       control  5.7   0     1
>     1       control  5.7   6     1
>     ...
>
> How, then, to determine whether treatment females display different mating patterns over time than control females? Here's my crack at it:
>
>     foo1 <- lmer2(mate ~ group * mass * time + (time | female), family=binomial)

And what happened post-crack? Error ~...singular? In that case, did you try replacing the * with + as a first step?

Dieter
Re: [R] gcc 4.3 any known issues?
Stefan Grosse wrote:
> Hi,
>
> Fedora is for Fedora 9 switching to gcc 4.3. Before I test it (rawhide) I want to be sure that R is running. So my question is whether there have been issues compiling R + packages using 4.3?

I suspect that not many have tried. There is an R-2.6.2 RPM in Fedora 9 alpha, so _something_ seems to work.

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])                         FAX: (+45) 35327907
Re: [R] R on Mac PRO does anyone have experience with R on such a platform ?
On 2008-February-11, at 19:14, Roger Day wrote:
> My experience with R.app on a MacBook has been mostly very positive. I like the interface much better than that of Windows, with a few exceptions:
> a) I step through code with control-R. It's not as convenient on the Mac: the code you want to run has to be actually selected; it is not enough just to be on the line you want. That slows down code-stepping.
> b) saveHistory() doesn't save the history of the current session. Beware, I lost some work that way. You have to actually click a button.
> c) No resizing of graphs post hoc.
> d) Saving graphics to a file is inconvenient except for pdf output.
>
> Some pluses are:
> a) a better built-in editor (if you're not using ESS), including delimiter matching,
> b) the history pane is nice,
> c) the package installer and manager are nicer than on Windows,
> d) autocompletion with ctrl-period,
> e) you can select text on the current or a past command line much more easily,
> f) an attractive interface with lots of cosmetic options.
>
> I've done some tkrplot work in both (using X11 in OS X); some inconsistencies in the placement of widgets show up.
>
> This is off the top of my head. Check out the mailing list R-sig-mac for more info.

After using R via R.app (which is indeed very nice to start with), I eventually switched to a combination of TextMate + Terminal + CarbonEL:

- TextMate[1] is a very powerful editor, well worth the $40 price tag, and has nice goodies for R besides syntax highlighting, such as command autocompletion, command templates, plenty of snippets, etc.

- I run R in a regular Terminal window. This way I get command line editing and searching through history. In addition, it makes it as easy to run R on a remote server as on my local machine (useful for running demanding tasks on a large CPU). I can send code from TextMate to the terminal prompt using AppleScript commands in TextMate[2]. This makes it possible to send the selected text _or_ the current line directly to the Terminal with just a keystroke.

- CarbonEL is a package which makes it possible to plot to a quartz window even from a simple Terminal (quartz is Mac OS X's graphics engine). The plots on quartz look gorgeous, and going back to X11 would have been a pain. Another similar solution would be to use the Cairo package.

All in all, I found it a very convenient and flexible way to use R. It has the added bonus that the same combination (TM + Terminal) works for anything that can run in a terminal window (MATLAB, Scilab, Python, etc.). So even if you don't use only R, you can keep the same habits with a nice editor.

I haven't tried Emacs+ESS. I've heard a lot of good things about it, but learning Emacs is a task in itself.

[1] http://macromates.com/
[2] a modification of those at http://jo.irisson.free.fr/?p=32 for the built-in Terminal, since Terminal on Leopard finally has tabs

JiHO
---
http://jo.irisson.free.fr/
[R] Interpretation of log odds
Hello,

    fit12 <- lmFit(qrg[,1:2])
    t12 <- toptable(fit12, adjust="fdr", number=15000, genelist=qrg$genes[,1])
    t12

          ID         logFC        t                            P.Value                  adj.P.Val            B
    1560  orf6.2714  -5,95911144  -7,5045373620,0616459272630                           0,00430961073320568  20,85141454
    8689  SW23       2,709344216  3,41198098                   0,000644926129763921000  0,03967585550307640  -0,62704052

(In the first row the t and P.Value entries ran together in the mail and are shown as received.)

The data example comes from one experiment, where I want to know whether genes are differentially expressed. As I saw in the online help for toptable, the value B is the log odds that the gene is differentially expressed. When I look at the B value 20,85141454, does it say that the gene orf6.2714 is differentially expressed with 20,85%? Is that right? And how should I interpret the second example, SW23, with a negative B value? Can anyone describe it to me in easy words? ;-)

Thanks,
Corinna
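As a hedged illustration of the arithmetic behind the question: if B is a log odds (natural logarithm), as the toptable help text quoted above says, then the odds are exp(B) and the corresponding probability is exp(B) / (1 + exp(B)), so B is not itself a percentage. The sketch below applies this to the two B values from the post, written with decimal points; `B` and `prob` are names introduced for the example.

```r
# B values from the table above (decimal commas converted to points)
B <- c(orf6.2714 = 20.85141454, SW23 = -0.62704052)

odds <- exp(B)            # odds of differential expression
prob <- odds / (1 + odds) # same as plogis(B) in base R

round(prob, 4)
```

Under this reading, a large positive B corresponds to a probability very close to 1, B = 0 corresponds to 50%, and a negative B such as -0.63 corresponds to a probability below one half (roughly 0.35 here). How far these probabilities can be taken literally depends on the prior assumptions of the model, which the post does not describe.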
Re: [R] Using R in a university course: dealing with proposal comments
On 2/11/08, Paul Gilbert [EMAIL PROTECTED] wrote:
> Stas Kolenikov wrote:
> > ... Training researchers of tomorrow might be great, but if your students get on the market at the end of the semester, they won't have the luxury of waiting until R becomes THE package of choice.
>
> Not being a teacher, I usually follow these discussions with a bit of amusement and some befuddlement. We hire young people hoping they will bring in bright new ideas from academia, and academics are training the students based on what they think are the old things we use.
>
> Fortunately, R is already one of the packages of choice in many places. Another point that needs more emphasis is that R is actually a programming language, like Matlab and APL, so it really has more general usefulness than statistics packages that one might use in the narrower context of a statistics course.

There are people who would be developing and pricing some novel financial derivatives. Your young people are probably Ph.D.s in finance or statistics or economics, and yes, programming is a must at research level, and R is a great choice (although economists might say that GAUSS or Stata is an even greater choice).

The original question was about the first, and most likely the only, statistics class the health students will ever take, and the words "graduate level" should not fool anybody: that will have to be a non-calculus data analysis class (Arin Basu can surprise me here if it is different!!!). I would predict that the students coming out of it will run the routine analyses that are spelled out by the FDA and the like, and I would think the FDA regulations could go as far as specifying SAS syntax, or at least the SAS PROCs to be used.

GPL software does not necessarily thrive in commercial and even academic environments. I have plenty of acquaintances in academia who prefer to use some commercial flavor of LaTeX over the free MiKTeX distribution for the illusion of technical support they get for their money; I expect those people to prefer SPSS or SAS over R for similar reasons (plus the GUI).

I don't dispute that R is a great tool for innovative work; what I dispute is that it is the best tool for a basic stats class for a not-so-technical audience, given the work the students are likely to be doing afterwards.

Of course, if you are a full professor you can dismiss any of the comments and teach the way you like. That's what I'll be planning to do when I get there :))

--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: Please do not reply to my Gmail address as I don't check it regularly.
[R] Viable Approach to Parallel R?
All,

We are researching approaches to parallel R, with the end goal of running R in a distributed manner on a Linux cluster. We expect, of course, to do some work decomposing our problems to be task-parallel or data-parallel, but wouldn't mind getting an initial boost from embarrassingly parallel code sections and one of the approaches below. Our environment includes R 2.6.1, RHEL 5.1, Solaris 10, SGE (Sun Grid Engine) and OpenMPI 1.2.4 (SunHPC 7.1).

In researching previous work, the most promising approaches seem to be:

A. Snow (with Rmpi or Rpvm), as described in http://www.r-project.org/useR-2006/Slides/Harrington+Salibian-Barrera.pdf from the 2006 R User Conference. It is my understanding that this approach is viable and works with OpenMPI 1.2.4. Is anyone using this method with good results?

B. taskpR, RScaLAPACK, pMatrix. I read a paper, http://sdm.lbl.gov/sdmcenter/projects/SDM.center.parallel.r.2-pager.4.doc, coming out of ORNL, describing what they call parallel R, which included taskpR, RScaLAPACK and pMatrix. I notice that taskpR is no longer available in contrib, nor is pMatrix. An old link indicates the packages are available at http://www.ASPECT-SDM.org/Parallel-R but that site displays a notice that the server is migrating. Has this work been discontinued? Is anyone using it? I see RScaLAPACK is still available; from reading the above, it seems it was bundled with taskpR. Does it function without the other components? (Guess I'll try it and find out :)

C. Sleigh / NetworkSpaces. I see that SCAI (Scientific Computing Associates) offers a parallel R package based on something they call NetworkSpaces and Sleigh (inspired by Snow). They sell services around the product, but it is open source. They also have an enhanced version for which they sell support. See http://www.lindaspaces.com/hp/BenchmarksWithCharts.pdf Has anyone investigated this approach or its open source components?

TIA for any information, direction, or suggestions; if I've missed any other approaches, please advise.

Dan Lewis
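A minimal sketch of the embarrassingly parallel case mentioned under option A, assuming the `parallel` package that ships with later R releases and offers snow-style functions (`makeCluster`, `parSapply`, `stopCluster`). The two local worker processes and the toy function `slow_square` are illustrative assumptions, not part of the original post; on a real cluster the workers would be started across hosts or over MPI instead.

```r
library(parallel)

# Hypothetical stand-in for an expensive task that does not depend
# on any other task.
slow_square <- function(x) {
  Sys.sleep(0.01)
  x^2
}

# Assumption: two worker processes on the local machine (socket cluster).
cl <- makeCluster(2)

# Every element is processed independently, so the work splits cleanly
# across the workers; results are returned in input order.
res <- parSapply(cl, 1:8, slow_square)

stopCluster(cl)
print(res)
```

The same call pattern applies whatever transport sits underneath the cluster object, which is what makes this style a cheap first step before decomposing a problem more carefully.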
[R] Logistic regression with repeated measures
Hello R list,

I am hoping to conduct a logistic regression with repeated measures, and would love an actual code run-through for such an analysis. I found only one related post on this list, but a full answer was never provided. I understand that the routine lmer (or lmer2) in the lme4 package is often recommended in such a case, but actually implementing it is where I've hit a wall.

In a nutshell, the experiment involved presenting females from two groups (treatment, control) with an opportunity to mate with a virgin male every 6 hours for 48 hours. Every female was presented this opportunity at every time step (i.e., whether or not she mated at 6 hr, she was again presented with a male at 12 hr, and so on). In addition to which group a female belongs to, we have an a priori reason to want to test the effect of her initial body mass as a covariate. A subset of the data looks like this:

female  group    mass  time  mate
1       control  5.7    0    1
1       control  5.7    6    1
1       control  5.7   12    0
1       control  5.7   18    0
1       control  5.7   24    0
1       control  5.7   30    1
1       control  5.7   36    0
1       control  5.7   42    1
1       control  5.7   48    0
2       treatm   5.3    0    1
2       treatm   5.3    6    0
2       treatm   5.3   12    0
2       treatm   5.3   18    0
2       treatm   5.3   24    0
2       treatm   5.3   30    1
2       treatm   5.3   36    0
2       treatm   5.3   42    0
2       treatm   5.3   48    0
3       control  6.1    0    1
3       control  6.1    6    0
3       control  6.1   12    0
3       control  6.1   18    0
3       control  6.1   24    1
3       control  6.1   30    1
3       control  6.1   36    0
3       control  6.1   42    1
3       control  6.1   48    0
...

How, then, to determine whether treatment females display different mating patterns over time than control females? Here's my crack at it:

foo1 <- lmer2(mate ~ group * mass * time + (time | female), family=binomial)

Thanks in advance,
Steve
[R] barplot or maybe related?
I have two distributions, each represented by the heights of several intervals. For example, if the distribution is partitioned into 10 segments, I have a number (frequency or count) associated with each segment, in a format like:

0.2 0.3 0.1 0.1 ... 0.01 0.02

I want to plot the two distributions side by side, meaning that for each segment the two bars (in barplot) from the two distributions are adjacent to each other. If I use barplot(beside=T), the two distributions are plotted side by side, not interleaved. Is there a way to do what I want?

Thanks
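One way to get interleaved bars, sketched under the assumption that the two distributions are available as numeric vectors of equal length (the names `dist_a` and `dist_b` and all values are made up). When the heights are passed as a matrix with one row per distribution, `barplot(..., beside = TRUE)` treats each column as a group, so the two bars for a given segment are drawn next to each other.

```r
# Made-up heights for two distributions over the same 10 segments.
dist_a <- c(0.20, 0.30, 0.10, 0.10, 0.05, 0.05, 0.10, 0.07, 0.01, 0.02)
dist_b <- c(0.10, 0.20, 0.20, 0.15, 0.10, 0.05, 0.05, 0.05, 0.05, 0.05)

# One row per distribution, one column per segment.
heights <- rbind(A = dist_a, B = dist_b)
colnames(heights) <- paste0("seg", 1:10)

# With beside = TRUE each column becomes a group, so the A and B bars
# for the same segment sit side by side.
mids <- barplot(heights, beside = TRUE, legend.text = TRUE,
                xlab = "segment", ylab = "height")
```

`barplot` returns the bar midpoints (here a 2 x 10 matrix), which is handy for adding labels or points on top of individual bars. If the matrix is built with `cbind` instead of `rbind`, the grouping is by distribution rather than by segment, which may be the non-interleaved layout described above.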
Re: [R] Loading Data to R
On Microsoft Windows systems, it may be more convenient to install and use the xlsReadWrite package. For non-Windows systems, the gdata package provides this function, but requires perl to be present.

-Greg (Maintainer of gdata)

On Feb 9, 2008, at 1:09 PM, Henrique Dallazuanna wrote:
> You need library(gdata) before
>
> On 08/02/2008, Wensui Liu [EMAIL PROTECTED] wrote:
>> # READ DATA FROM XLS FILE #
>> xls <- read.xls(file = "C:/projects/Rintro/Part01/export.xls", sheet = 3, type = "data.frame", from = 1, colNames = TRUE)
>>
>> On Feb 8, 2008 3:49 PM, Christine Lynn [EMAIL PROTECTED] wrote:
>>> This is the most basic question ever... I haven't used R in a couple of years since college, so I forget, and I haven't been able to find what I'm looking for in any of the manuals. I just need to figure out how to load a dataset into the program from Excel!
>>>
>>> Thanks!
>>> CL
>>
>> --
>> WenSui Liu
>> ChoicePoint Precision Marketing
>> Phone: 678-893-9457
>> Email: [EMAIL PROTECTED]
>> Blog: statcompute.spaces.live.com
>
> --
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O
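An alternative route that avoids Excel-specific packages altogether, sketched on the assumption that the sheet has first been saved from Excel as a CSV file. This is a different technique from the `read.xls` calls discussed in the thread. The column names and values are hypothetical, and the temporary file only stands in for the exported sheet so that the example is self-contained.

```r
# Stand-in for a sheet exported from Excel as CSV (hypothetical content).
csv_file <- tempfile(fileext = ".csv")
writeLines(c("id,score", "1,3.5", "2,4.0", "3,2.5"), csv_file)

# read.csv() comes with base R (package utils), so nothing extra
# has to be installed.
dat <- read.csv(csv_file, header = TRUE)
str(dat)
```

With a real export, `csv_file` would simply be the path to the saved file.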
[R] Hastie - Tibshirani - Friedman pg 141 nnet question
Dear helper,

I am working with nnet on a large data set (23K) and have some questions. I have a binary response (occurrence or non-occurrence of an event) with 8 predictors.

(1) How can I reproduce the plot in Hastie et al. (page 141), i.e. natural cubic splines of tensor product?

(2) How does nnet treat the response? By default it seems to treat Y as numeric (?). Can I change the response to a factor? Or does it matter?

Thank you,
Ilham
Re: [R] Dendrogram for agglomerative hierarchical clustering result
Thank you for your reply, Wolfgang.

I've seen these examples, but my problem is that I don't know how to build the input data from the data I have. According to the example below, hclust does the clustering and returns the hclust object hc. In my case the clustering is already done, and I need to create the hclust object from my clustering result. So I probably have to study how to create the hclust object first.

dndrgr> hc <- hclust(dist(USArrests), "ave")
dndrgr> (dend1 <- as.dendrogram(hc))

Risto

On 11 Feb, 19:18, Wolfgang Huber [EMAIL PROTECTED] wrote:
> Hi Risto,
>
> You could try example(dendrogram)
>
> best wishes
> Wolfgang
>
> noorpiilur wrote:
>> Hey group,
>>
>> I have a problem with drawing a dendrogram from the result of my program written in C. My algorithm is an approximation algorithm for the single linkage method. As a result I get the following data:
>>
>> [Average distance] [cluster A] [cluster B]
>>
>> For example:
>> 42.593141 1 26
>> 42.593141 4 6
>> 42.593141 123 124
>> 42.593141 4 113
>> 74.244206 1 123
>> 74.244206 4 133
>> 74.244206 1 36
>>
>> So far I have used C to generate a bitmap output, but I would like to use the computed result as an input for R to just draw the dendrogram. As I'm new to R, any help is appreciated.
>>
>> Thanks,
>> Risto
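A minimal sketch of building an hclust object by hand, assuming the external result has already been translated into the merge encoding that `?hclust` documents: a negative entry `-j` refers to original observation `j`, a positive entry `k` refers to the cluster created at merge step `k`. The four observations, their labels and the merge order are hypothetical; only the two heights are borrowed from the example output above. Translating the "[cluster A] [cluster B]" pairs into this encoding is not shown and depends on how the C program names its clusters.

```r
# Hypothetical merge history for 4 observations:
#   step 1: observations 1 and 2 join
#   step 2: observations 3 and 4 join
#   step 3: the clusters from steps 1 and 2 join
merge <- rbind(c(-1L, -2L),
               c(-3L, -4L),
               c( 1L,  2L))

hc <- structure(
  list(merge       = merge,
       height      = c(42.593141, 42.593141, 74.244206), # non-decreasing
       order       = 1:4,             # left-to-right leaf order, no crossings
       labels      = c("a", "b", "c", "d"),
       method      = "single",
       dist.method = "external"),     # free-text note, assumption
  class = "hclust")

dend <- as.dendrogram(hc)
plot(dend)
```

The `order` component must list the leaves so that branches do not cross; for a small tree this can be written down by hand, for a large one it has to be derived from the merge history.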
[R] Difference between P.Value and adj.P.Value
Hello,

fit12 <- lmFit(qrg[,1:2])
t12 <- toptable(fit12, adjust="fdr", number=25, genelist=qrg$genes[,1])
t12

           ID     logFC         t      P.Value  adj.P.Val        B
522   PLAU_OP -6.836144 -8.420414 5.589416e-05 0.01212520 2.054965
1555 CD44_WIZ -6.569622 -8.227938 6.510169e-05 0.01212520 1.944046

Can anyone tell me what the difference is between P.Value and adj.P.Value? I need to analyse microarrays and have to say whether there are differentially expressed genes. Which P.Value should I use?

Thanks,
Corinna
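To illustrate the difference being asked about, here is a small sketch with base R's `p.adjust()`. It assumes that `adjust="fdr"` in the call above corresponds to the Benjamini-Hochberg correction; in `p.adjust`, "fdr" is an alias for "BH". The raw p-values are made up. The raw value answers "how surprising is this one gene on its own", while the adjusted value accounts for the fact that many genes were tested at once.

```r
# Made-up raw p-values, one per gene.
raw_p <- c(0.001, 0.01, 0.02, 0.5)

# Benjamini-Hochberg adjustment controls the false discovery rate
# across all tests taken together.
adj_p <- p.adjust(raw_p, method = "fdr")

print(data.frame(P.Value = raw_p, adj.P.Val = adj_p))
```

Adjusted values are never smaller than the raw ones, and they grow with the number of tests, which is why a table of thousands of genes can show small raw p-values next to much larger adjusted ones.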
Re: [R] Length problem
If data was your data.frame, data[4:length(data)] was also a data.frame. But c(data[4:length(data)]) coerces it to a list, therefore coppie is a list. coppie[1] is also a list, of length 1. Compare that to: coppie[[1]]

b

On Feb 11, 2008, at 10:38 AM, milton ruser wrote:
> Ciao Paolo,
>
> How about showing some rows of your data? How many columns does your data.frame have? One? By the way, data is not such a good name for your data frame. We will be very happy to help you.
>
> Kindly,
> Miltinho
> Brasile
>
> On 2/11/08, Paolo Grillo [EMAIL PROTECTED] wrote:
>> Hi all
>>
>> I have this problem: in my .dta database, called data, I have five rows
>>
>> data <- read.dta("C:\\2_CO_mmobile_ALL_Rid.dta")
>>
>> # From this database I would like to create another
>> coppie <- c(data[4:length(data)])
>>
>> but I find this
>>
>> # Length of original data
>> length(data[,4])
>> 5    RIGHT!!
>>
>> # Length of new data
>> length(coppie[1])
>> 1    WHY??
>>
>> Thank you all for your help
>> Paolo Grillo
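The point made in the reply can be checked on a small stand-in data frame (the name `dat` and all values are made up; the original file is not needed). Single brackets on a list return a sub-list, so its length counts list elements, while double brackets extract the column vector itself.

```r
# Hypothetical stand-in: 5 rows, 5 columns.
dat <- data.frame(yy = 2003, mm = 1, dd = 1:5,
                  C.531 = c(1.0, 1.8, NA, 1.1, 1.2),
                  C.542 = c(1.4, 2.8, 3.7, 1.8, 2.1))

length(dat)                         # number of columns
coppie <- c(dat[4:length(dat)])     # c() turns the data frame into a list

class(coppie)                       # a plain list with one element per column
length(coppie[1])                   # sub-list holding one element
length(coppie[[1]])                 # the column itself, one value per row
length(dat[, 4])                    # same column taken from the data frame
```

Dropping the `c()` and keeping `dat[4:length(dat)]` as a data frame is usually the more convenient choice, since `nrow()` and `ncol()` then answer both questions directly.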
Re: [R] Length problem
You were asking for the length of the first element of the vector coppie, which is of course 1. Did you mean to say length(coppie)? length(data[,4]) asks how many elements are in that column, which seems to be 5. Also, your statement coppie <- c(data[4:length(data)]) seems strange. What did you intend to do?

-- Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
Re: [R] Length problem
Ciao Miltinho

Here it is:

data
    yy mm dd     C.531    C.542     C.558    C.565
1 2003  1  1 0.9941125 1.412338 0.8996750 2.258200
2 2003  1  2 1.7931375 2.786900        NA 3.108725
3 2003  1  3        NA 3.657775 1.7269750 2.541938
4 2003  1  4 1.0840625 1.766925 1.2313375 2.321300
5 2003  1  5 1.1558000 2.128488 0.9670375       NA

# New data
coppie <- c(data[4:length(data)])

# Length of original data
data[,4]
[1] 0.9941125 1.7931375        NA 1.0840625 1.1558000
length(data[,4])
[1] 5    # Right !!!

# Length of new data
coppie[1]
$C.531
[1] 0.9941125 1.7931375        NA 1.0840625 1.1558000
length(coppie[1])
[1] 1    # Why ??

Thank you for your help
Paolo
Italia
Re: [R] R programming style
Hi,

I think using Emacs+ESS [1,2] is always a good starting point for a clear layout with consistent and meaningful indentation. I don't know how other people think about it, but in my opinion "Elements of Programming Style" by Kernighan and Plauger is still an interesting read, although their programs are in either Fortran or PL/1 and the book itself is 30 years old or so. Of course, I am not always successful, but at least I try to incorporate their 'mantras':

- write clearly, don't be too clever [3]
- say what you mean, simply and directly
- use library functions
- write clearly -- don't sacrifice clarity for efficiency
- let the machine do the dirty work
- parenthesize to avoid ambiguity
- 10.0 times 0.1 is hardly ever 1.0
- ...

I hope this helps.

Best,
Roland

[1] http://www.gnu.org/software/emacs/
[2] http://ess.r-project.org/
[3] I guess this is what Kernighan meant in his famous(?) quote: "Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?" (http://en.wikiquote.org/wiki/Brian_W._Kernighan)

David Scott wrote:
> I am aware of one (unofficial) guide to style for R programming: http://www1.maths.lth.se/help/R/RCC/ from Henrik Bengtsson. Can anyone provide further pointers to good style? Views on Bengtsson's ideas would interest me as well.
>
> David Scott
> _
> David Scott
> Department of Statistics, Tamaki Campus
> The University of Auckland, PB 92019
> Auckland 1142, NEW ZEALAND
> Phone: +64 9 373 7599 ext 86830
> Fax: +64 9 373 7000
> Email: [EMAIL PROTECTED]
> Graduate Officer, Department of Statistics
> Director of Consulting, Department of Statistics
[R] Odp: question_encoding
Hi

Is it only a question of PDF export, or are the glyphs distorted in the plot window too? If it is in the plot window, try looking in the etc folder at the Rdevga file. If it happens during export from the plot window to PDF, then try producing the PDF file directly:

pdf()
plot(1, 1, type="n")
text(1, 1, "Ě Š Ť Č Ř Ň Á Í É Ó Ý Ž")
dev.off()

I am on WXP, and the export from the plot window does not work there either; however, creating the file directly through the pdf command seems to work.

Regards
Petr

[EMAIL PROTECTED] wrote on 08.02.2008 23:32:48:
> Hello,
>
> I would like to ask you one question. When I export a graph to .pdf and need Czech characters, I use the parameter encoding="ISOLatin2.enc" for these special characters. But the exported text is bad. I tried ISOLatin1 and MacRoman, but the result is the same. I don't know what I am doing wrong, because in quartz the graph is OK.
>
> Sorry, I forgot: I have a Mac with Leopard and R ver. 2.6.1.
>
> Thank you.
> jena
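A sketch that combines the two suggestions in this thread, writing the PDF directly with `pdf()` and passing the `encoding` argument mentioned in the question. The output path is a temporary file chosen for illustration, and the Czech letters are written as Unicode escapes so that the script itself stays plain ASCII. Whether the glyphs render correctly still depends on the platform and on the fonts available, so this is something to try rather than a guaranteed fix.

```r
out <- tempfile(fileext = ".pdf")     # hypothetical output location

# Open the PDF device directly, with the Latin-2 encoding file
# that ships with R.
pdf(out, encoding = "ISOLatin2.enc")
plot(1, 1, type = "n")
# Unicode escapes for: E-caron, S-caron, T-caron, C-caron, R-caron, N-caron
text(1, 1, "\u011a \u0160 \u0164 \u010c \u0158 \u0147")
dev.off()

file.exists(out)
```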
Re: [R] grep etc.
I don't understand exactly what you are asking. You can change v from 'insd-otsd' 'sppr-unsp' to 'insd--otsd', 'sppr--unsp' with

sub("-", "--", v)

However, do you want to change the entire assignment statement?

--- Michael Kubovy [EMAIL PROTECTED] wrote:
> Dear R-helpers,
>
> How do I transform
> v <- c('insd-otsd', 'sppr-unsp')
> into
> c('insd--otsd', 'sppr--unsp')
> ?
> _
> Professor Michael Kubovy
> University of Virginia, Department of Psychology
> USPS: P.O. Box 400400, Charlottesville, VA 22904-4400
> Parcels: Room 102, Gilmer Hall, McCormick Road, Charlottesville, VA 22903
> Office: B011, +1-434-982-4729
> Lab: B019, +1-434-982-4751
> Fax: +1-434-982-4766
> WWW: http://www.people.virginia.edu/~mk9y/
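The suggested call can be run as is. The sketch below also shows `gsub`, which matters only if a string may contain more than one hyphen; the third string is an extra made-up example, and `fixed = TRUE` is an optional addition that treats the pattern as a literal string rather than a regular expression.

```r
v <- c("insd-otsd", "sppr-unsp")

# sub() replaces the first match in each element.
v2 <- sub("-", "--", v, fixed = TRUE)
print(v2)

# gsub() replaces every match in each element.
v3 <- gsub("-", "--", "a-b-c", fixed = TRUE)
print(v3)
```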
Re: [R] Using R in a university course: dealing with proposal comments
You can use a GUI to teach R, so that the programming style is gone. But the command-line approach forces you to think about your analysis. In a GUI, it's easy to point and click without knowing what you are doing. With the command line, you know where you start, and from there you go to the next step, and so on. I think you learn more this way. And of course, it's free, so once they have left school, or later on at work, they still have the possibility to use what they have learned (unlike with SPSS, maybe).

Bart

Arin Basu-3 wrote:
> Hi All,
>
> I am scheduled to teach a graduate course on research methods in health sciences at a university. While drafting the course proposal, I decided to include a brief introduction to R, primarily with the objective of enabling the students to do data analysis using R. It is expected that students enrolled in this course have all had at least a formal first-level introduction to quantitative methods in health sciences, and following completion of the course they are all expected to either evaluate, interpret, or conduct primary research studies in health. The course would be delivered over 5 months, and R was proposed to be taught in several laboratory-based hands-on sessions, along with required readings within the coursework.
>
> The course proposal went to a few colleagues in the university for review. I received review feedback from them; two of them commented on the inclusion of R in the proposal. In quoting parts of these mails, I have masked the names/identities of the referees, and have included just the relevant part of the text with their comments. Here are the comments:
>
> Comment 1: "In my quick glance, I did not see that statistics would be taught, but I did see that R would be taught. Of course, R is a statistics programme. I worry that teaching R could overwhelm the class. Or teaching R would be worthless, because the students do not understand statistics." (Prof LR)
>
> Comment 2: "Finally, on a minor point, why is R the statistical software being used? SPSS is probably more widely available in the workplace, certainly in areas of social policy etc." (Prof NB)
>
> I am interested to know if any of you have faced similar questions from colleagues about the inclusion of R in non-statistics-based university graduate courses. If you did and were required to address these concerns, how would you respond?
>
> TIA,
> Arin Basu
[R] Power law, lognormal and exponential statistical testing in one sample population
Hello,

My name is George Pantopoulos; I am a PhD student in the Dept. of Geology, University of Patras, Greece. I am studying the statistical behaviour of bed thickness datasets taken from outcrops. So far, 4 statistical distributions seem to fit my datasets: power law, lognormal, lognormal mixture with 2 modes, and exponential. I have already used the MIX package of R to detect a possible 2-mode lognormal mixture in the datasets.

My questions are:

Can I do ONE POPULATION non-parametric tests (KS or chi-squared) in R for the above distributions, with the distribution parameters NOT estimated from the data (especially for the KS test)? If yes, could someone show me the exact steps that I must follow in R? Also, must I generate artificial populations of data to do what I want?

The datasets are in text files, one variable (bed thicknesses only). If somebody knows how to do any of the above in R, it would be valuable for my work.

Thank you.
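A minimal sketch of the one-sample KS part of the question, using base R's `ks.test()` with the name of a cumulative distribution function and parameter values fixed in advance rather than estimated from the sample. The simulated thicknesses and every parameter value are made up; with real data the vector would be read from the text file, for example with `scan()`. No artificial comparison sample is needed for this form of the test. Base R has no built-in power-law CDF, so that case is not covered here.

```r
set.seed(1)

# Made-up stand-in for one column of bed thicknesses.
thickness <- rlnorm(200, meanlog = 2, sdlog = 0.5)

# One-sample KS test against a fully specified lognormal
# (parameters chosen beforehand, not fitted to 'thickness').
ks_lnorm <- ks.test(thickness, "plnorm", meanlog = 2, sdlog = 0.5)

# Same sample against a fully specified exponential.
ks_exp <- ks.test(thickness, "pexp", rate = 1 / 8)

print(ks_lnorm)
print(ks_exp)
```

The extra named arguments after the CDF name are passed straight through to that function, so any distribution with a `p...` function in R can be tested the same way.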
Re: [R] Conditional rows
Because you need

test = 0.2 | test 0.3

See ?"|"

Gabor

On Mon, Feb 11, 2008 at 09:12:57PM +0800, Stanley Ng wrote:
> That works beautifully. Why does using test=0.2 || test 0.3 give an error?
>
> -----Original Message-----
> From: Gabor Csardi [mailto:[EMAIL PROTECTED]
> Sent: Monday, February 11, 2008 18:27
> To: Ng Stanley
> Cc: r-help
> Subject: Re: [R] Conditional rows
>
> which(apply(test<=0.2, 1, all))
>
> See ?which, ?all, and in particular ?apply.
>
> Gabor
>
> On Mon, Feb 11, 2008 at 06:22:09PM +0800, Ng Stanley wrote:
>> Hi,
>>
>> Given a simple example,
>>
>> test <- matrix(c(0.1, 0.2, 0.1, 0.2, 0.1, 0.1, 0.3, 0.1, 0.1), 3, 3)
>>
>> How to generate the row indexes for which the corresponding row values are all less than or equal to 0.2? For this example, rows 2 and 3 are the correct ones.
>>
>> Thanks
> [...]

-- Csardi Gabor [EMAIL PROTECTED] UNIL DGM
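The two answers in this thread can be tried on the matrix from the question. The second condition in the sketch (`test <= 0.1 | test > 0.2`) is a made-up example of combining two comparisons, since the operators in the archived message were lost. The single `|` works element by element over the whole matrix; `||` is meant for single TRUE/FALSE values, and recent R versions raise an error when it is given longer operands.

```r
test <- matrix(c(0.1, 0.2, 0.1, 0.2, 0.1, 0.1, 0.3, 0.1, 0.1), 3, 3)

# Rows in which every value is <= 0.2: apply all() over rows (MARGIN = 1).
rows_ok <- which(apply(test <= 0.2, 1, all))
print(rows_ok)

# Element-wise OR of two comparisons gives a logical matrix
# with the same shape as 'test'.
mask <- test <= 0.1 | test > 0.2
print(mask)
```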
Re: [R] RGTK2 and glade on Windows - GUI newbie
On 2/11/08, Anja Kraft [EMAIL PROTECTED] wrote:
> I'd like to write a GUI (first choice with GTK+).

There is also pmg [1], which uses GTK+, and, albeit more specific, playwith [2]. Issues around creating a GUI under R have also been discussed previously; this reference [3] in particular may give you useful ideas.

Liviu

[1] http://wiener.math.csi.cuny.edu/pmg
[2] http://cran.r-project.org/src/contrib/Descriptions/playwith.html
[3] https://stat.ethz.ch/pipermail/r-sig-gui/2005-October/000504.html
[R] [R-pkgs] Release 3.2.0 of randomSurvivalForest is now available
Dear useRs:

Release 3.2.0 of the CRAN package randomSurvivalForest is now available.

--

Release 3.2.0 represents a significant upgrade in the functionality of the product. Key changes are as follows:

o A second method of perturbing the data set in order to calculate variable importance (VIMP) has been implemented. In addition to permuting the values for a single variable, a random split approach has been taken in which a data point is randomly assigned to the left or right daughter node when a split occurs on the specified variable.

o The joint VIMP among multiple variables of a (potentially proper) subset of the GROW data can now be calculated using the new function interaction.rsf(). This represents a third mode of operation for the application, and follows rsf.default (GROW) and predict.rsf (PREDICT). See the documentation for details.

o An additional option in GROW mode can now be specified. The option 'varUsed' allows users to quantify which variables have been split upon within a single tree or over the entire forest. See the documentation for more details.

o The ability to multiply impute data has been implemented. This involves imputing data while growing a forest and using the results to grow a new forest in order to better impute the data.

o In GROW mode, the application now outputs both the in-bag and OOB summary imputed values.

o An additional split rule 'randomsplit' has been implemented. See the documentation for more details.

o The split rule 'logrankscore' is now calculated correctly.

o The split rule 'logrankapprox' has been removed and replaced by the new split rule 'logrankrandom'. See the documentation for more details.

[EMAIL PROTECTED]

Udaya B. Kogalur, Ph.D.
Kogalur Shear Corporation
5425 Nestleway Drive, Suite L1
Clemmons, NC 27012
Re: [R] Linux, UNIX, XP32, Vista X64 or ...? [SEC=UNCLASSIFIED]
On Tue, 12 Feb 2008, [EMAIL PROTECTED] wrote:

> Thanks to all for your kind suggestions. After some discussion with our IT staff, I was told the UNIX system we have is Solaris and installation of R is very time consuming because "Given that this software is not standard, and given the amount of time required to compile the software (and potentially its dependencies), it will need to be resourced as a project ..." From my experience with IT staff, it may take quite a long time for them to set up such a project, let alone the installation.

Prebuilt versions of R are available for Solaris -- and the 'R Installation and Administration' manual told them so.

> Given that, I wonder if it is possible to install it myself. As I have mentioned before, I have no experience in using UNIX, but I will have access to the UNIX system soon. Any suggestions and help are greatly appreciated.

It is easy to install R from the sources if you have the compilers and e.g. Tcl/Tk installed. But a Solaris box quite possibly does not, and then a binary install is much easier.

> Regards,
> Jin
>
> -----Original Message-----
> From: Gabor Grothendieck [mailto:[EMAIL PROTECTED]
> Sent: Monday, 28 January 2008 11:38
> To: Li Jin
> Cc: r-help@r-project.org
> Subject: Re: [R] Linux, UNIX, XP32, Vista X64 or ...? [SEC=UNCLASSIFIED]
>
> On the PC there is a built-in GUI but not on UNIX, and there are some packages that are OS specific, in which case you might get more or less selection, but probably more. Also, depending on the specific system, you may have greater difficulty installing certain packages due to the need to compile them on UNIX, and the possibility exists that you don't quite have the right libraries. On Windows you get binaries, so this is not a problem. I have repeatedly found that common packages that I took for granted on Windows had some problem with installation on UNIX, and I had to hunt around and figure out what the problem was with my UNIX libraries, or possibly some other problem. This won't be a problem for all R packages, but it can be for packages that use C and FORTRAN. Although I am lumping all UNIX systems together, I think this varies quite a bit from one particular type/distro of UNIX/Linux to another, and I suspect that if you are careful in picking out the right one (if you have a choice) you will actually have zero problems.
>
> On Jan 23, 2008 6:08 PM, [EMAIL PROTECTED] wrote:
>> Dear All,
>>
>> I am currently using R on a Windows PC with 2 GB of RAM. Some pretty large datasets are expected soon, perhaps on the order of several GB. I am facing a situation similar to Ralph's: either get a new PC with more RAM, or do something else. I am just wondering whether R runs faster on other systems such as UNIX or Linux. Any suggestions are appreciated.
>>
>> Regards,
>> Jin
>>
>> Jin Li, PhD
>> Spatial Modeller/ Computational Statistician
>> Marine Coastal Environment
>> Geoscience Australia
>> Ph: 61 (02) 6249 9899
>> Fax: 61 (02) 6249 9956
>> email: [EMAIL PROTECTED]
>>
>> -----Original Message-----
>> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Prof Brian Ripley
>> Sent: Thursday, 24 January 2008 12:05
>> To: Ralph79
>> Cc: r-help@r-project.org
>> Subject: Re: [R] Problems with XP32-3GB-patch?/ Worth upgrading to Vista X64?
>>
>> On Wed, 23 Jan 2008, Ralph79 wrote:
>>> Dear R-Users,
>>>
>>> as I will start a huge simulation in a few weeks, I am about to buy a new and fast PC. I have noticed that the RAM has been the limiting factor in many of my calculations up to now (I had 2 GB in my old system, but Windows still used quite a lot of virtual memory), hence my new computer will have 4 GB of fast DDR2-800 RAM. However, I know that 1.) Windows 32 bit cannot make use of more than about 3.2 GB RAM and 2.) it is normally not allowed to allocate more than 2 GB of RAM to one single application (at least under XP; I don't know if that has changed under Vista?). I remember from the R-FAQ that you can manually adjust XP so that it allocates up to 3 GB to one application (the 3GB patch), but I read in a PC magazine and on some message boards that this may cause problems. Does any of you successfully use this trick without any problems?
>>
>> Yes, many people: most 32-bit Exchange servers use it. Please don't rate the advice in the R documentation below tittle-tattle you read on the web.
>>
>>> Would it be wise to use a 64-bit OS, e.g. Vista X64? I think that under Vista X64 it should be no problem to allocate 4 GB of RAM to R. Any experiences with that?
>>
>> That's what the rw-FAQ says, and we do write answers based on experience!
>>
>>> Thanks in advance,
>>> Ralph Wirth
>>> -
>>> Ralph Wirth
>>> University Erlangen-Nuremberg, Chair of Statistics
>>> GfK Group, Department of Methods and Product Development
>>
>> --
>> Brian D. Ripley, [EMAIL PROTECTED]
>> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
>> University of Oxford,
Re: [R] gcc 4.3 any known issues?
On Mon, 11 Feb 2008, Peter Dalgaard wrote:

> Stefan Grosse wrote:
>> Hi,
>>
>> Fedora is switching to gcc 4.3 for Fedora 9. Before I test it (rawhide) I want to be sure that R is running. So my question is whether there have been issues compiling R + packages using 4.3?
>
> I suspect that not many have tried. There is an R-2.6.2 RPM in Fedora 9 alpha, so _something_ seems to work.

But enough have tried to have sorted out many of the issues. For example, the gcc 4.3 series uses C99-style inlining and that has been implemented (conditionally on GCC capabilities), and various bits of R have been rewritten to work around mis-compiles by the gcc 4.3 branch.

But note that gcc 4.3 is not released, and not even branched, with rather a lot of remaining regressions. I'd be wary of using high levels of optimization: last time I looked R failed to build correctly at -O3 on x86_64, and there were more problems with packages.

Surely this is a topic that the posting guide directs to R-devel -- all the issues mentioned are programming ones, not R ones.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] WG: Tinn-R not working well with latest R
Or try source("clipboard") On Feb 11, 2008 3:30 PM, Schmitt, Corinna [EMAIL PROTECTED] wrote: I am in the R command window and just press Ctrl+V. Corinna -Original Message- From: Farrel Buchinsky [mailto:[EMAIL PROTECTED] Sent: Mon 11.02.2008 21:16 To: Schmitt, Corinna Subject: Re: Tinn-R not working well with latest R I can easily get R to open without an error. I simply removed the Tinn-R related lines from the Rprofile.site file C:\Program Files\R-2.6.2\etc\Rprofile.site but then when I try to manually load the svIDE library by entering library(svIDE) from the command line, I get a similar error. So when you say "Then paste in the command", what command are you referring to? What do you change it to? Schmitt, Corinna [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] Hello, I had the same problems before. I think the best solution is to copy the code you need out of Tinn-R with Ctrl+C. Then open R directly from your desktop, NOT from Tinn-R. Then paste in the command. You can still change the command as long as you have not pressed Enter: use the arrow keys to put the cursor where you want in the command line and edit it. Hope that is what you want. I cannot reproduce your example. Corinna
Re: [R] image quality
Have a look at the smoothScatter() function in the 'geneplotter' (Bioconductor) package. That might be sufficient for you. Alternatively, generate a bitmap (e.g. PNG) image plot instead (at least pdflatex can import those as is). /Henrik On Feb 11, 2008 2:18 AM, John Lande [EMAIL PROTECTED] wrote: dear all, I am writing Sweave documentation for my analysis, and I am plotting huge scatter plots of microarray data. Unluckily this takes a lot of resources on my PC because the quality of the image is too high (I see the PC get stuck on every single spot). How can I overcome this problem? Is there a way to make a lighter image? john
Re: [R] Linux, UNIX, XP32, Vista X64 or ...? [SEC=UNCLASSIFIED]
At 10:27 AM +1100 2/12/08, [EMAIL PROTECTED] wrote: Thanks to all for your kind suggestions. After some discussion with our IT staff, I was told the UNIX system we have is Solaris and installation of R is very time consuming because "Given that this software is not standard, and given the amount of time required to compile the software (and potentially its dependencies), it will need to be resourced as a project ..." Even if pre-built versions of R were not available, I think your IT staff is exaggerating -- or at least being overly cautious when faced with something unfamiliar. Although I have used unix for many years, I am not a trained or experienced Solaris system administrator. Yet I have been able to install R from source on a modern Solaris in no more than a day or so. And most of that time was spent doing things that an experienced Solaris sysadmin should be able to do relatively quickly. -Don From my experience with IT staff, it may take quite a long time for them to set up such a project, let alone the installation. Given that, I wonder if it is possible to install it myself. As I have mentioned before, I have no experience in using UNIX, but I will have access to the UNIX system soon. Any suggestions and help are greatly appreciated. Regards, Jin -Original Message- From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] Sent: Monday, 28 January 2008 11:38 To: Li Jin Cc: r-help@r-project.org Subject: Re: [R] Linux, UNIX, XP32, Vista X64 or ...? [SEC=UNCLASSIFIED] On the PC there is a builtin GUI but not on UNIX and there are some packages that are OS specific in which case you might get more or less selection but probably more. Also depending on the specific system you may have greater difficulty installing certain packages due to the need to compile them on UNIX and the possibility exists that you don't quite have the right libraries. On Windows you get binaries so this is not a problem. 
I have repeatedly found that common packages that I took for granted on Windows had some problem with installation on UNIX and I had to hunt around and figure out what the problem was with my UNIX libraries or possibly some other problem. For packages written entirely in R this won't be a problem but for packages that use C and FORTRAN this can be. Although I am lumping all UNIX systems together I think this varies quite a bit from one particular type/distro of UNIX/Linux to another and I suspect if you are careful in picking out the right one (if you have a choice) you will actually have zero problems. On Jan 23, 2008 6:08 PM, [EMAIL PROTECTED] wrote: Dear All, I am currently using R on a Windows PC with 2 GB of RAM. Some pretty large datasets are expected soon, perhaps on the order of several GB. I am facing a situation similar to Ralph's: either get a new PC with more RAM or find something else. I am just wondering if R is faster on other systems like UNIX or Linux. Any suggestions are appreciated. Regards, Jin Jin Li, PhD Spatial Modeller/ Computational Statistician Marine Coastal Environment Geoscience Australia Ph: 61 (02) 6249 9899 Fax: 61 (02) 6249 9956 email: [EMAIL PROTECTED] -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Prof Brian Ripley Sent: Thursday, 24 January 2008 12:05 To: Ralph79 Cc: r-help@r-project.org Subject: Re: [R] Problems with XP32-3GB-patch?/ Worth upgrading to Vista X64? On Wed, 23 Jan 2008, Ralph79 wrote: Dear R-Users, as I will start a huge simulation in a few weeks, I am about to buy a new and fast PC. I have noticed that the RAM has been the limiting factor in many of my calculations up to now (I had 2 GB in my old system, but Windows still used quite a lot of virtual memory), hence my new computer will have 4 GB of fast DDR2-800 RAM. However, I know that 1.) Windows 32 bit cannot make use of more than about 3.2 GB RAM and 2.) 
it is normally not allowed to allocate more than 2 GB of RAM to one single application (at least under XP, I don't know if that has changed under Vista?). I remember from the R-FAQ that you can manually adjust XP so that it allocates up to 3 GB to one application (the 3GB patch), but I read in a PC-magazine and some message boards that this may cause problems. Does anybody of you successfully use this trick without any problems? Yes, many people: most 32-bit Exchange servers use it. Please don't rate the advice in the R documentation below tittle-tattle you read on the web. Would it be wise to use a 64bit OS, as e.g. Vista X64? I think, under Vista X64 it should be no problem to allocate 4 GB of RAM to R. Any experiences with that? That's what the rw-FAQ says, and we do write answers based on experience! Thanks in advance, Ralph Wirth - Ralph Wirth
Re: [R] R programming style
Hi, Earl F. Glynn wrote: Instead of using 1 or 2 in an apply, I'll write something like this trying for some sort of mnemonic apply(x, BY.ROW<-1, sum) or apply(z, BY.COL<-2, mean) I think it makes sense to use those magic numbers in the given case. Please let me give you several arguments: - In such a setting, I'd probably also use more mnemonic functions: rowMeans rowSums colMeans colSums - The numbering of the MARGINs (the name of the second argument) is what I remember from maths: index 1 is for rows, index 2 is for columns, ... So I don't think the numbering is counter-intuitive. For sure, you have to check the help page at least once. But this is also the case for using mnemonic arguments. - The first argument in apply() is an array which is not restricted to two dimensions. For example, if you are working with three dimensions, how would you specify it? BY.LAYER? Maybe, but then four dimensions or five dimensions?[1] Please don't consider this as a personal criticism. I am sure that users' criticism improves R. But using mnemonics instead of the margins in the apply() case is not a convincing example, I think. Maybe you have another example? Best, Roland [1] If you are curious whether there are practical applications of four- or five-dimensional arrays, I can write to you off-list how useful they were in real world projects.
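To make Roland's two points concrete, here is a small sketch. The matrix and array are made-up toy data, not anything from the thread. It checks that rowSums()/colMeans() agree with the apply() calls they replace, and shows how MARGIN extends past two dimensions.

```r
# Toy data (assumption: any numeric matrix/array would do here)
x <- matrix(1:6, nrow = 2)            # 2 rows, 3 columns

# The dedicated helpers give the same numbers as apply() with MARGIN 1 or 2
row_sums_apply  <- apply(x, 1, sum)   # one value per row
row_sums_helper <- rowSums(x)
col_means_apply  <- apply(x, 2, mean) # one value per column
col_means_helper <- colMeans(x)

# MARGIN is just a dimension index, so it keeps working beyond two dimensions
a <- array(1:24, dim = c(2, 3, 4))    # rows x columns x layers
layer_sums <- apply(a, 3, sum)        # one value per layer (third dimension)
cell_sums  <- apply(a, c(1, 2), sum)  # collapse layers, keep the 2 x 3 grid
```

Note that apply() on an integer matrix returns integers while rowSums() returns doubles, so compare them with all.equal() rather than identical().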
Re: [R] overdispersion + GAM
No. Binomial data can indeed be overdispersed. See McCullagh & Nelder (1989, section 4.5). Accounting for over(under)dispersion in binomial and Poisson distributions was, in fact, one of the original motivations for GEE-type developments. See also a nice paper by Liang & McCullagh (Biometrics 1993, p. 623-630), which discusses numerous examples of overdispersion in binary data. Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Gavin Simpson Sent: Monday, February 11, 2008 12:37 PM To: anna banana Cc: r-help@r-project.org Subject: Re: [R] overdispersion + GAM On Mon, 2008-02-11 at 07:35 -0800, anna banana wrote: Hi, there are a lot of messages dealing with overdispersion, but I couldn't find anything about how to test for overdispersion. I applied a GAM with binomial distribution on my presence/absence data, and would like to check for overdispersion. Does anyone know the command? Bernoulli data (presence/absence of single species say) can't be overdispersed, so there is no need to test or correct for it. G Many thanks, Anna -- Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
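For readers looking for "the command": there is no single one, but a common rough check for grouped binomial counts is the Pearson chi-square statistic divided by the residual degrees of freedom. The sketch below uses simulated data and a plain glm() rather than a GAM, both of which are assumptions made for illustration. For ungrouped 0/1 responses this ratio is generally not a reliable diagnostic, which is part of what the exchange above is about.

```r
set.seed(1)

# Simulated grouped data (assumption): 50 sites, 20 trials each, with extra
# site-to-site variation on the logit scale to induce overdispersion
n_sites <- 50
trials  <- rep(20, n_sites)
x       <- runif(n_sites)
p_site  <- plogis(-0.5 + 1.5 * x + rnorm(n_sites, sd = 0.8))
y       <- rbinom(n_sites, size = trials, prob = p_site)

# Ordinary binomial fit
fit <- glm(cbind(y, trials - y) ~ x, family = binomial)

# Pearson chi-square / residual df; values well above 1 suggest overdispersion
pearson_chisq <- sum(residuals(fit, type = "pearson")^2)
dispersion    <- pearson_chisq / df.residual(fit)

# The quasibinomial family reports the same estimate and widens the
# standard errors accordingly
fit_quasi        <- glm(cbind(y, trials - y) ~ x, family = quasibinomial)
dispersion_quasi <- summary(fit_quasi)$dispersion
```

If the ratio is close to 1 the binomial variance assumption looks adequate; how far above 1 counts as a problem is a judgment call.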
Re: [R] image quality
On 2/11/08, John Lande [EMAIL PROTECTED] wrote: I am writing Sweave documentation for my analysis, and I am plotting huge scatter plots of microarray data. Unluckily this takes a lot of resources on my PC because the quality of the image is too high (I see the PC get stuck on every single spot). How can I overcome this problem? Is there a way to make a lighter image? john John, You may try to plot random samples of your data. E.g.: df1 <- data.frame(x=rnorm(100000), y=rnorm(100000)) df1.small <- df1[sample(nrow(df1),1000), ] with(df1.small, plot(x,y)) HTH, Philippe
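A quick way to see why sampling helps: a vector format such as PDF stores every plotted point, so the file grows with the number of points. The sketch below uses simulated data; the helper name plot_to_pdf and the sizes are illustrative assumptions. It writes the full and the sampled scatter plot to temporary PDF files and compares their sizes.

```r
set.seed(1)

# Simulated stand-in for a large microarray scatter plot (assumption)
n   <- 20000
df1 <- data.frame(x = rnorm(n), y = rnorm(n))
df1.small <- df1[sample(nrow(df1), 1000), ]

# Hypothetical helper: draw a scatter plot into a temporary PDF
# and return the size of the resulting file in bytes
plot_to_pdf <- function(d) {
  f <- tempfile(fileext = ".pdf")
  pdf(f)
  plot(d$x, d$y, pch = ".")
  dev.off()
  file.size(f)
}

size_full  <- plot_to_pdf(df1)        # every one of the 20000 points is stored
size_small <- plot_to_pdf(df1.small)  # only the 1000 sampled points are stored
```

The same comparison can be repeated with a bitmap device, where file size depends on the pixel dimensions rather than on the number of points.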
Re: [R] R programming style
David Scott [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] Can anyone provide further pointers to good style? While not written for R specifically, the book Code Complete: A Practical Handbook of Software Construction (2nd Edition) discusses a number of good concepts for writing good code in any language: http://www.amazon.com/Code-Complete-Practical-Handbook-Construction/dp/0735619670 In particular, Part IV Statements gives a number of useful suggestions by type of statement, e.g., straight-line code, conditionals, loops, ... There are some practices used in R that I think should be improved. For example, many years ago I was taught in a software engineering class that the use of magic numbers was a bad practice, yet we find magic numbers used in R in many places. Instead of using 1 or 2 in an apply, I'll write something like this trying for some sort of mnemonic apply(x, BY.ROW<-1, sum) or apply(z, BY.COL<-2, mean) I find BY.ROW or BY.COL to be more mnemonic than the magic numbers 1 and 2. The sides 1, 2, 3, and 4 in an axis statement should have some sort of mnemonic definition, too, perhaps: axis(BOTTOM<-1, ...) But I believe I was ostracized in this E-mail list the last time I suggested such mnemonics instead of magic numbers. efg Earl F. Glynn Bioinformatics Stowers Institute for Medical Research
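One variation on Earl's idea, offered only as a sketch: define the mnemonics once as ordinary constants instead of assigning them inside each call. BY.ROW, BY.COL and BOTTOM are the names from the post; LEFT, TOP and RIGHT are added here for illustration, and none of them are part of base R.

```r
# Named constants for apply() margins (not defined by R itself)
BY.ROW <- 1L
BY.COL <- 2L

# Named constants for axis() sides; the numbering 1..4 is R's own convention
BOTTOM <- 1L
LEFT   <- 2L
TOP    <- 3L
RIGHT  <- 4L

x <- matrix(1:6, nrow = 2)
row_totals <- apply(x, BY.ROW, sum)   # same as apply(x, 1, sum)
col_means  <- apply(x, BY.COL, mean)  # same as apply(x, 2, mean)

# The same constants read naturally in axis() calls
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)
plot(1:10, axes = FALSE)
axis(BOTTOM)
axis(LEFT)
dev.off()
```

Defining the constants up front avoids the side effect of `apply(x, BY.ROW<-1, sum)`, which creates or overwrites BY.ROW in the calling environment every time it runs.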
[R] How to fit discrete distribution to data in R having non standard shape?
Hello, I have 421 readings of time and the number of requests arriving at each particular time. Basically I have data at one-minute intervals with the corresponding number of requests arriving per minute. It is discrete in nature. I am collecting data from 9AM to 4PM, but some of the readings are 0. When I plotted a histogram of the data I could not get the shape of any standard distribution. Now my aim is to find the distribution that best fits my data. How can I do that with R? The major problem is that the shape is not a standard one, so I am not able to fit any standard model. Do I need to use the empirical distribution? Is there any other way to fit an appropriate model, by dividing the data and trying to fit a model to each part? Please help me on this issue. Thank You. Aswad
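As a possible starting point, the sketch below compares the empirical distribution of per-minute counts with a fitted Poisson. The counts are simulated with extra zeros to imitate the situation described, which is an assumption; the real readings would replace the `counts` vector. Large gaps between the two columns, for example at zero, indicate where a standard model fails.

```r
set.seed(42)

# Simulated stand-in for the 421 per-minute request counts (assumption):
# roughly 20% forced zeros, otherwise Poisson with mean 6
counts <- ifelse(runif(421) < 0.2, 0L, rpois(421, lambda = 6))

# Empirical probability of each observed count value
support   <- 0:max(counts)
empirical <- as.numeric(table(factor(counts, levels = support))) / length(counts)

# Poisson fit: the maximum-likelihood estimate of lambda is the sample mean
lambda_hat <- mean(counts)
poisson    <- dpois(support, lambda_hat)

# Side-by-side comparison of observed and fitted probabilities
comparison <- round(cbind(count = support, empirical, poisson), 3)

# If no parametric family fits, the empirical CDF can be used directly
Fn <- ecdf(counts)
```

With zero-heavy data like this, the fitted Poisson puts far too little mass at zero, which points towards a mixture or zero-inflated model, or towards simply working with the empirical distribution.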
Re: [R] controlling the edge linewidth in Rgraphviz
Check out: http://finzi.psych.upenn.edu/R/Rhelp02a/archive/111006.html On Feb 11, 2008 9:56 PM, Adrian Dragulescu [EMAIL PROTECTED] wrote: Hello, I would like to have different linewidths for the edges of my graph. I read the documentation but could not find how to control this. On the Graphviz help page I've seen that there is something called penwidth but I could not find it in the R edge attributes. Thanks a lot for any help. Adrian Dragulescu
[R] controlling the edge linewidth in Rgraphviz
Hello, I would like to have different linewidths for the edges of my graph. I read the documentation but could not find how to control this. On the Graphviz help page I've seen that there is something called penwidth but I could not find it in the R edge attributes. Thanks a lot for any help. Adrian Dragulescu
Re: [R] R on Mac PRO does anyone have experience with R on such a platform ?
My experience with R.app on a MacBook has been mostly very positive. I like the interface much better than that of Windows -- with a few exceptions: a) I step through code with control-R. It's not as convenient on the Mac: the code you want to run has to be actually selected; it is not enough just to be on the line you want. That slows down code-stepping. b) saveHistory() doesn't save the history of the current session -- beware, I lost some work that way. You have to actually click a button. c) no resizing graphs post-hoc, d) saving graphics to a file is inconvenient except for pdf output. Some pluses are: a) better built-in editor (if you're not using ESS), including delimiter matching, b) the history pane is nice, c) the package installer and manager are nicer than on Win, d) autocompletion with ctrl-period, e) you can select text on the current or past command line much more easily, f) attractive interface with lots of cosmetic options. I've done some tkrplot work in both (using X11 in OSX) -- some inconsistencies with placement of widgets show up. This is off the top of my head. Check out the mailing list R-sig-mac for more info. Maura E Monville wrote: I saw there exists an R version for Mac/OS. I'd like to hear from someone who is running R on a Mac/OS before venturing on getting the following computer system. I am in the process of choosing a powerful laptop 17 MB PRO 2.6GHZ(dual-core) 4GBRAM Thank you so much, -- Maura E.M
Re: [R] User defined split function in rpart
I had a similar problem, trying to use lme within a custom rpart function. I got around it by passing the dataframe I needed through the parms option in rpart, and then using the parms option in evaluation, init and split as a dataset. It's not the most elegant solution, but it will work. Have you (or anyone else) figured out the details of the summary and text options in the init function? I know that they are used to fill out the summary of the model and the text.rpart plotting, but I can't seem to use any of the variables being passed to them efficiently (or at all). Hope that helps, Sam Stewart On Feb 20, 2007 2:47 PM, Tobias Guennel [EMAIL PROTECTED] wrote: I have made some progress with the user defined splitting function and I got a lot of the things I needed to work. However, I am still stuck on accessing the node data. It would probably be enough if somebody could tell me how I can access the original data frame of the call to rpart. So if the call is: fit0 <- rpart(Sat ~Infl +Cont+ Type, housing, control=rpart.control(minsplit=10, xval=0), method=alist) how can I access the housing data frame within the user defined splitting function? Any input would be highly appreciated! Thank you Tobias Guennel -Original Message- From: Tobias Guennel [mailto:[EMAIL PROTECTED] Sent: Monday, February 19, 2007 3:40 PM To: '[EMAIL PROTECTED]' Subject: [R] User defined split function in rpart Maybe I should explain my problem in a little more detail. The rpart package allows for user defined split functions. An example is given in the source/test directory of the package as usersplits.R. The comments say that three functions have to be supplied: 1. The 'evaluation' function. Called once per node. Produce a label (1 or more elements long) for labeling each node, and a deviance. 2. The split function, where most of the work occurs. Called once per split variable per node. 3. The init function: fix up y to deal with offsets, return a dummy parms list; numresp is the number of values produced by the eval routine's label. I have altered the evaluation function and the split function for my needs. Within those functions, I need to fit a proportional odds model to the data of the current node. I am using the polr() routine from the MASS package to fit the model. Now my problem is, how can I call the polr() function only with the data of the current node. That's what I tried so far: evalfunc <- function(y,x,parms,data) { pomnode <- polr(data$y~data$x,data,weights=data$Freq) parprobs <- predict(pomnode,type="probs") dev <- 0 K <- dim(parprobs)[2] N <- dim(parprobs)[1]/K for(i in 1:N){ tempsum <- 0 Ni <- 0 for(l in 1:K){ Ni <- Ni+data$Freq[K*(i-1)+l] } for(j in 1:K){ tempsum <- tempsum+data$Freq[K*(i-1)+j]/Ni*log(parprobs[i,j]*Ni/data$Freq[K*(i-1)+j]) } dev=dev+Ni*tempsum } dev=-2*dev wmean <- 1 list(label= wmean, deviance=dev) } I get the error: Error in eval(expr, envir, enclos) : argument data is missing, with no default How can I use the data of the current node? Thank you Tobias Guennel
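The workaround Sam describes can be sketched roughly as follows, under several assumptions: the response handed to rpart is just a row index, the real data frame travels in `parms`, and each callback looks up its node's rows with that index. The data, the column names (resp, row_id) and the function names are hypothetical, and a plain weighted mean stands in for the polr() or lme fit. The callback signatures follow the rpart documentation on user-written split functions.

```r
library(rpart)

set.seed(1)
d <- data.frame(x1 = runif(200), x2 = runif(200))
d$resp   <- ifelse(d$x1 > 0.5, 3, 0) + rnorm(200, sd = 0.5)
d$row_id <- seq_len(nrow(d))      # the "response" rpart sees is a row index

# init: pass y and parms through unchanged; one number labels each node
init_fun <- function(y, offset, parms, wt) {
  summary_fun <- function(yval, dev, wt, ylevel, digits) {
    paste0("  mean=", format(signif(yval, digits)))
  }
  list(y = c(y), parms = parms, numresp = 1, numy = 1, summary = summary_fun)
}

# eval: y holds the row indices of the current node, so the node's data
# can be recovered from the data frame stored in parms
eval_fun <- function(y, wt, parms) {
  node <- parms$data[y, , drop = FALSE]
  m <- sum(node$resp * wt) / sum(wt)
  list(label = m, deviance = sum(wt * (node$resp - m)^2))
}

# split: y arrives ordered by x; score every cut point (anova-style)
split_fun <- function(y, wt, x, parms, continuous) {
  if (!continuous) stop("this sketch handles continuous predictors only")
  r <- parms$data$resp[y]
  n <- length(r)
  r <- r - sum(r * wt) / sum(wt)
  left_sum <- cumsum(r * wt)[-n]
  left_wt  <- cumsum(wt)[-n]
  right_wt <- sum(wt) - left_wt
  lmean <- left_sum / left_wt
  rmean <- -left_sum / right_wt
  list(goodness  = (left_wt * lmean^2 + right_wt * rmean^2) / sum(wt * r^2),
       direction = sign(lmean))
}

fit <- rpart(row_id ~ x1 + x2, data = d,
             method  = list(init = init_fun, eval = eval_fun, split = split_fun),
             parms   = list(data = d),
             control = rpart.control(minsplit = 20, xval = 0))
```

Replacing the weighted mean in eval_fun with a model fitted to `node` is where a polr() or lme call would go; whether that is fast enough for a real data set is a separate question.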
Re: [R] mistake during subscription
On 11-Feb-08 16:34:36, Anna Meissner wrote: Dear R-helper, I made a mistake during my subscription, and think that I turned off from the mailing list. I confirm that I want to join the mailing list and wish to post some emails. Cheers, Anna Meissner Anna, If you [re-]visit the info/subscription web page at http://stat.ethz.ch/mailman/listinfo/r-help and enter your details in the section Subscribing to R-help, and then click the Subscribe button, then you should get subscribed to the list. This may not take effect immediately, since I think the list administrator (Martin Maechler) may have to approve you first. (And this may be why nothing has happened for you yet, since Martin has been away for a week and, I think, has only just returned). In any case, if you post an email to the list even though you are not subscribed, it will reach the list once it has been approved (though you would have to visit the archives at https://stat.ethz.ch/pipermail/r-help/ to see any replies). Hoping this helps, Ted. E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 Date: 11-Feb-08 Time: 17:20:03
Re: [R] how to generate a column based on other columns in a data frame
Assuming this data frame: DF <- data.frame(X = c(36.435, 36.435, 36.435, 35.329, 35.329, 36.431, 36.431, 35.421, 35.421, 35.421), Y = c(30.118, 30.118, 30.118, 29.657, 29.657, 30.111, 30.111, 29.797, 29.797, 29.797)) # Try this: DF$site <- as.numeric(factor(interaction(DF$X, DF$Y))) If X and Y can vary slightly while still referring to the same site then round them to k decimal places first. See ?round On Feb 11, 2008 11:30 AM, Weidong Gu [EMAIL PROTECTED] wrote: HI, I am working on a data set with multiple collections of mosquitoes at sampling sites. Each row represents a collection of individual samples with coordinates for each collection. ... X, Y,... 1 36.435 30.118 2 36.435 30.118 3 36.435 30.118 4 35.329 29.657 5 35.329 29.657 6 36.431 30.111 7 36.431 30.111 8 35.421 29.797 9 35.421 29.797 10 35.421 29.797 Unfortunately, there is no 'site' entry. I would like to add a column of 'site' based on the coordinates of samples so that samples from the same sites have the same site ID like S1, S2, ... How can I do this the R way? Thanks. Weidong Gu, Department of Medicine University of Alabama, Birmingham 1900 University Blvd., Birmingham, Alabama 35294 Email: [EMAIL PROTECTED] PH: (205)-975-9053
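A small variation on the suggestion above, offered as a sketch: the factor/interaction approach numbers sites in factor-level order, whereas match() against the unique coordinate pairs numbers them in order of first appearance and makes it easy to produce labels such as S1, S2. The rounding to 3 decimals is an assumption about how precisely the coordinates identify a site.

```r
DF <- data.frame(
  X = c(36.435, 36.435, 36.435, 35.329, 35.329,
        36.431, 36.431, 35.421, 35.421, 35.421),
  Y = c(30.118, 30.118, 30.118, 29.657, 29.657,
        30.111, 30.111, 29.797, 29.797, 29.797)
)

# Build one key per row from the (rounded) coordinate pair
key <- paste(round(DF$X, 3), round(DF$Y, 3))

# Number the keys in order of first appearance and prefix with "S"
DF$site <- paste0("S", match(key, unique(key)))
```

For the ten rows above this gives S1 for the first three rows, S2 for the next two, S3 for the next two and S4 for the last three.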
Re: [R] Length problem
Ciao Paolo, my.data <- read.table(stdin(), head=T, sep=",") yy,mm,dd,C.531,C.542,C.558,C.565 2003,1,1,0.9941125,1.412338,0.8996750,2.258200 2003,1,2,1.7931375,2.786900,NA,3.108725 2003,1,3,NA,3.657775,1.7269750,2.541938 2003,1,4,1.0840625,1.766925,1.2313375,2.321300 2003,1,5,1.1558000,2.128488,0.9670375,NA coppie <- c(my.data[4:length(my.data)]) my.data[,4] length(my.data[,4]) coppie[1] length(coppie[1]) #here you get 1 because you have one object ($C.531) length(coppie[[1]]) #here you get what you want. Good luck Miltinho On 2/11/08, Paolo Grillo [EMAIL PROTECTED] wrote: Ciao Miltinho Here it is data yy mm dd C.531 C.542 C.558 C.565 1 2003 1 1 0.9941125 1.412338 0.8996750 2.258200 2 2003 1 2 1.7931375 2.786900 NA 3.108725 3 2003 1 3 NA 3.657775 1.7269750 2.541938 4 2003 1 4 1.0840625 1.766925 1.2313375 2.321300 5 2003 1 5 1.1558000 2.128488 0.9670375 NA # New data coppie <- c(data[4:length(data)]) # Length of original data data[,4] [1] 0.9941125 1.7931375 NA 1.0840625 1.1558000 length(data[,4]) [1] 5 # Right !!! # Length of new data coppie[1] $C.531 [1] 0.9941125 1.7931375 NA 1.0840625 1.1558000 length(coppie[1]) [1] 1 # Why ?? Thank you for your help Paolo Italia milton ruser wrote: Ciao Paolo, How about showing some rows of your data? How many columns does your data.frame have? One? By the way, data is not such a good name for your data frame. We will be very happy to help you Kindly, Miltinho Brasile On 2/11/08, Paolo Grillo [EMAIL PROTECTED] wrote: Hi all I have this problem: In my database .dta, called data, I have five rows data <- read.dta("C:\\2_CO_mmobile_ALL_Rid.dta") # From this database I would like to create another coppie <- c(data[4:length(data)]) but I find this # Length of original data length(data[,4]) 5 RIGHT!! # Length of new data length(coppie[1]) 1 WHY?? Thank you all for your help Paolo Grillo
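The behaviour Miltinho points out comes down to `[` versus `[[` on a list: `coppie[1]` is a list containing one element, while `coppie[[1]]` is the element itself. A self-contained sketch using the first two measurement columns from Paolo's data (the other two columns are left out only to keep it short):

```r
my.data <- data.frame(
  yy = 2003, mm = 1, dd = 1:5,
  C.531 = c(0.9941125, 1.7931375, NA, 1.0840625, 1.1558000),
  C.542 = c(1.412338, 2.786900, 3.657775, 1.766925, 2.128488)
)

# c() on a data frame drops the data.frame class and returns a plain list
coppie <- c(my.data[4:length(my.data)])

len_single <- length(coppie[1])    # a list holding one column
len_double <- length(coppie[[1]])  # the column itself
col_lengths <- lengths(coppie)     # length of every column at once

# Count of non-missing readings in the first column
n_observed <- sum(!is.na(coppie[[1]]))
```

Here len_single is 1 and len_double is 5, which matches the RIGHT/WHY results in the thread.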
[R] mistake during subscription
Dear R-helper, I made a mistake during my subscription, and think that I turned off from the mailing list. I confirm that I want to join the mailing list and wish to post some emails. Cheers, Anna Meissner
Re: [R] Using R in a university course: dealing with proposal comments
On Mon, Feb 11, 2008 at 07:37:04AM -0800, Neil Shephard wrote: Arin Basu-3 wrote: Comment 2: Finally, on a minor point, why is R the statistical software being used? SPSS is probably more widely available in the workplace – certainly in areas of social policy etc. (Prof NB) What struck me in the above is the "probably". How probable is it? Is there anything to substantiate the claim? Anyway, whether one package is more widely available in the workplace than another is somewhat of a moot point. If a student learns how to use one software package then they start to get pigeon-holed into using that particular software package. Many jobs are advertised with SPSS/SAS/Stata/S-Plus (add/subtract at will) skills/knowledge required (or at least desirable). The prospective job applicant may think "Well I don't know how to use that so I shan't bother applying", or they may be unwilling to re-learn how to use a new stats package after months/years of investment in learning how to use another package; alternatively they may well just lose out to someone who already has the experience/skills. (Most) of this problem isn't negated when using R. Start a new job and use the (excellent, extensible, and free) software that you've been using for years. And you could even argue that learning R means you'll be able to do more with SPSS: http://www.spss.com/spss/data_management_book.htm [I have not read this book so I don't know anything about the details of how they implement this, I just came across this by accident, but I was intrigued by the idea of extending SPSS using R.] David I'd stick with using R to teach your statistics; in the long run any of them who continue to perform statistical analysis will be grateful. Neil -- David Whiting, Ph.D. Advancing Research in Chronic Disease Epidemiology (ARCHEPI) programme Institute of Health and Society, The Medical School, Newcastle University, Framlington Place, Newcastle upon Tyne, NE2 4HH. Tel: +44 191 222 7045; Extn: 7375; Fax: +44 191 222 8211. http://research.ncl.ac.uk/archepi www.ncl.ac.uk/ihs
[R] how to generate a column based on other columns in a data frame
HI, I am working on a data set with multiple collections of mosquitoes at sampling sites. Each row represents a collection of individual samples with coordinates for each collection. ... X, Y,... 1 36.435 30.118 2 36.435 30.118 3 36.435 30.118 4 35.329 29.657 5 35.329 29.657 6 36.431 30.111 7 36.431 30.111 8 35.421 29.797 9 35.421 29.797 10 35.421 29.797 Unfortunately, there is no 'site' entry. I would like to add a column of 'site' based on the coordinates of samples so that samples from the same sites have the same site ID like S1, S2, ... How can I do this the R way? Thanks. Weidong Gu, Department of Medicine University of Alabama, Birmingham 1900 University Blvd., Birmingham, Alabama 35294 Email: [EMAIL PROTECTED] PH: (205)-975-9053
Re: [R] Difference between P.Value and adj.P.Value
Hi Corinna,

The adjusted p-value (adj.P.Val) is the p-value adjusted for multiple comparisons. Enter ?p.adjust for a fuller explanation.

Regards
JS

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Schmitt, Corinna
Sent: 11 February 2008 16:02
To: r-help@r-project.org
Subject: [R] Difference between P.Value and adj.P.Value

Hallo,

    fit12 <- lmFit(qrg[,1:2])
    t12 <- toptable(fit12, adjust="fdr", number=25, genelist=qrg$genes[,1])
    t12
               ID     logFC         t      P.Value  adj.P.Val        B
    522   PLAU_OP -6.836144 -8.420414 5.589416e-05 0.01212520 2.054965
    1555 CD44_WIZ -6.569622 -8.227938 6.510169e-05 0.01212520 1.944046

Can anyone tell me what the difference is between P.Value and adj.P.Value? I need to analyse microarrays and say whether there are differentially expressed genes. Which P.Value should I use?

Thanks,
Corinna
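To make the adjustment concrete, here is a small sketch using base R's p.adjust() on a made-up vector of raw p-values (the numbers are illustrative only, not taken from the microarray output above). The "fdr" method is an alias for the Benjamini-Hochberg ("BH") procedure.

```r
# Hypothetical raw p-values from four tests.
raw_p <- c(0.001, 0.01, 0.04, 0.5)

# Benjamini-Hochberg adjustment: controls the false discovery rate.
adj_fdr <- p.adjust(raw_p, method = "fdr")

# Bonferroni for comparison: controls the family-wise error rate.
adj_bonf <- p.adjust(raw_p, method = "bonferroni")

print(data.frame(raw_p, adj_fdr, adj_bonf))
```

An adjusted value is never smaller than the raw one. When many genes are tested at once, the adjusted column is usually the one to threshold, because a raw p-value below 0.05 is expected by chance alone for roughly 5% of genes with no real effect.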
[R] ROracle for windows
Can somebody help me connect an Oracle database to R? I am just a user of this software and I do not know about these specialised things. Sorry about my English.

Sincerely,
Daniel Ito
Estatística UNICAMP
EPR - CPFL
Re: [R] Length problem
Ciao Paolo,

Could you show a few rows of your data? How many columns does your data.frame have? One? By the way, data is not a very good name for your data frame. We will be very happy to help you.

Kindly,
Miltinho
Brazil

On 2/11/08, Paolo Grillo [EMAIL PROTECTED] wrote:
> Hi all,
> I have this problem: in my .dta database, called data, I have five rows
>
>     data <- read.dta("C:\\2_CO_mmobile_ALL_Rid.dta")
>
> # From this database I would like to create another
>
>     coppie <- c(data[4:length(data)])
>
> but I find this
>
>     # Length of original data
>     length(data[,4])
>     5    RIGHT!!
>     # Length of new data
>     length(coppie[1])
>     1    WHY??
>
> Thank you all for your help
> Paolo Grillo
[R] overdispersion + GAM
Hi,

There are a lot of messages dealing with overdispersion, but I couldn't find anything about how to test for it. I fitted a GAM with a binomial distribution to my presence/absence data, and would like to check for overdispersion. Does anyone know the command?

Many thanks,
Anna
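A common informal check is the Pearson dispersion estimate: the sum of squared Pearson residuals divided by the residual degrees of freedom, with values well above 1 suggesting overdispersion. The sketch below is an assumption-laden stand-in: it uses glm() from base R on simulated grouped binomial counts rather than a GAM, and all variable names are hypothetical. The same ratio can be computed from a fitted gam object.

```r
set.seed(1)

# Simulated grouped data: 20 visits per site, 'present' = visits with a detection.
n_sites <- 40
trials  <- rep(20, n_sites)
x       <- runif(n_sites)
present <- rbinom(n_sites, size = trials, prob = plogis(-1 + 2 * x))

# Binomial GLM used here as a stand-in for the binomial GAM in the question.
fit <- glm(cbind(present, trials - present) ~ x, family = binomial)

# Pearson dispersion estimate: near 1 when the binomial variance is adequate.
dispersion <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)
print(dispersion)

# Refitting with quasibinomial reports the same estimate in summary().
fit_quasi <- glm(cbind(present, trials - present) ~ x, family = quasibinomial)
print(summary(fit_quasi)$dispersion)
```

One caveat: for ungrouped 0/1 presence/absence records this ratio is not informative, because a single Bernoulli observation cannot show extra-binomial variation. The check is meaningful only when the response is a count out of several trials.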
[R] Length problem
Hi all,

I have this problem: in my .dta database, called data, I have five rows

    data <- read.dta("C:\\2_CO_mmobile_ALL_Rid.dta")

# From this database I would like to create another

    coppie <- c(data[4:length(data)])

but I find this

    # Length of original data
    length(data[,4])
    5    RIGHT!!
    # Length of new data
    length(coppie[1])
    1    WHY??

Thank you all for your help
Paolo Grillo
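The behaviour in the question above follows from how length() and single brackets work on data frames and lists. A minimal sketch with a made-up five-row, five-column data frame (the name `data_example` and its contents are assumptions; the real .dta file is not available):

```r
# Hypothetical stand-in for the imported data: 5 rows, 5 columns.
data_example <- data.frame(a = 1:5, b = letters[1:5], c = 6:10,
                           d = 11:15, e = 16:20)

# length() of a data frame is its number of COLUMNS, not rows.
n_cols <- length(data_example)                       # 5

# Single-bracket indexing keeps columns 4..5; c() turns them into a plain list.
coppie <- c(data_example[4:length(data_example)])
n_kept <- length(coppie)                             # 2 columns kept

# coppie[1] is still a list (of one column), so its length is 1 ...
len_single <- length(coppie[1])                      # 1
# ... while coppie[[1]] extracts the column itself, which has 5 values.
len_double <- length(coppie[[1]])                    # 5

# To count rows of the subset, keep it as a data frame and use nrow().
n_rows <- nrow(data_example[, 4:ncol(data_example)]) # 5
```

So length(coppie[1]) returning 1 is expected: it counts the one list element selected, not the values inside it.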
Re: [R] Using R in a university course: dealing with proposal comments
Hi Arin,

Others have commented wisely on your first issue. As for your second issue, I had my own concerns about using R in undergraduate teaching because I had always used a point-and-click program at that level. I should not have worried. The current generation has been typing on their keyboards and their phones for a long time; they are very skilled. They LIKE a command-line interface, so long as someone gives them an initial cheat sheet to get them going. They like the price, they like having it on their own computers, and they like that they can use it in other courses. Some students are upset that no one ever told them about R before. Two hours after the first lab in which I had students download R to their laptops, I received an email from a student telling me how she had used R to do her physics homework. I like the (almost) platform-independence of R.

I've resisted using Rcmdr and JGR because I want students to be able to use base R well. If they want to customize later, then fine. But what I teach them will apply wherever they next encounter R, whereas if I were to use a lot of packages, especially one I would be tempted to create to match my teaching more closely, they wouldn't be sure what to expect later.

gary mcclelland
Colorado
Re: [R] Using R in a university course: dealing with proposal comments
I've been teaching an intro stats class to engineering students (who are better in calculus and math than med students, I would imagine), and the use of R has never been received very warmly. I might not be teaching it right, but their concerns (quite valid, from their standpoint) were that they would have to learn a tool they will never use (so they might have been better off with, say, the Statistics Toolbox from Matlab, as they use Matlab in their DiffEq, Circuits and other classes), and that they did not get enough credit points for doing so (indeed, I was suggesting R as extra credit, essentially as a way to avoid using the tables at the end of the book).

With health sciences people, I would expect they would want to learn the tool that they would use for life. At least that's my impression from the applied researchers I've interacted with: their computer literacy is often limited to a small number of software titles, but they know each of them quite well. R might be just too dynamic for them. Again, it's not terribly clear whether they will use it at all if this is the only statistics class they take for a breadth requirement. If anything, I would expect SAS and Stata to be more widely used in biostatistics, so teaching either of those might be of greater service and use to your students. Training the researchers of tomorrow might be great, but if your students go on the market at the end of the semester, they won't have the luxury of waiting until R becomes THE package of choice.

--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: Please do not reply to my Gmail address as I don't check it regularly.
Re: [R] Using R in a university course: dealing with proposal comments
I would also evaluate what the students used before in their introductory statistics class and how proficient they have become with it. If they only barely touched it, I would use my class as a chance to further refine their familiarity with the software they saw before. A tool is a tool; I consider it more important to use at least one tool really well, regardless of whether it is trendy, free, or user-friendly, than to juggle many software packages superficially. SPSS, despite its price, is still widely recognized. I wouldn't feel too bad teaching it. However, I would definitely shift the focus from the GUI to syntax writing so that the students are better prepared for other command-based software.

Ken
Re: [R] R programming style
I just got a copy of A First Course in Statistical Programming with R by W. John Braun and Duncan J. Murdoch (Cambridge). At Amazon: http://www.amazon.com/First-Course-Statistical-Programming-R/dp/0521694248/

The first couple of chapters cover base R that most everyone would know before wanting to program, but the other chapters, on programming itself, seem pretty good so far.

gary mcclelland
colorado

On Mon, Feb 11, 2008 at 3:47 AM, David Scott [EMAIL PROTECTED] wrote:
> I am aware of one (unofficial) guide to style for R programming:
> http://www1.maths.lth.se/help/R/RCC/ from Henrik Bengtsson.
> Can anyone provide further pointers to good style?
> Views on Bengtsson's ideas would interest me as well.
>
> David Scott
> _
> David Scott
> Department of Statistics, Tamaki Campus
> The University of Auckland, PB 92019
> Auckland 1142, NEW ZEALAND
> Phone: +64 9 373 7599 ext 86830 Fax: +64 9 373 7000
> Email: [EMAIL PROTECTED]
> Graduate Officer, Department of Statistics
> Director of Consulting, Department of Statistics
[R] Lower Confidence Bound
Hi,

I'm analysing microarray data with the affylmGUI package, and I want to compare the results from affylmGUI with those from dChip (http://biosun1.harvard.edu/complab/dchip/, another tool for analysing microarrays). I'm trying to use the same criteria wherever I can, but one of them is a 90% lower confidence bound used to filter genes. Does anyone know how to calculate this lower confidence bound in R, or with a specific package?

Thanks!
Feifei Ding
Re: [R] Linux, UNIX, XP32, Vista X64 or ...? [SEC=UNCLASSIFIED]
Thanks to all for your kind suggestions. After some discussion with our IT staff, I was told the UNIX system we have is Solaris and installation of R is very time consuming because Given that this software is not standard, and given the amount of time required to compile the software (and potentially it's dependencies), it will need to be resourced as a project ... From my experience with IT staff, it may take quite a long time for them to set up such project, let alone the installation. Given that, I wonder if it is possible to install it myself. As I have mentioned before, I have no experience in using UNIX, but I will have an access to the UNIX system soon. Any suggestions and help are greatly appreciated. Regards, Jin -Original Message- From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] Sent: Monday, 28 January 2008 11:38 To: Li Jin Cc: r-help@r-project.org Subject: Re: [R] Linux, UNIX, XP32, Vista X64 or ...? [SEC=UNCLASSIFIED] On the PC there is a builtin GUI but not on UNIX and there are some packages that are OS specific in which case you might get more or less selection but probably more. Also depending on the specific system you may have greater difficulty installing certain packages due to the need to compile them on UNIX and the possibility exists that you don't quite have the right libraries. On Windows you get binaries so this is not a problem. I have repeatedly found that common packages that I took for granted on Windows had some problem with installation on UNIX and I had to hunt around and figure out what the problem was with my UNIDX libraries or possibly some other problem. For all R packages this won't be a problem but for packages that use C and FORTRAN this can be. Although I am lumping all UNIX systems together I think this varies quite a bit from one particular type/distro of UNIX/Linux to another and I suspect if you are careful in picking out the right one (if you have a choice) you will actually have zero problems. 
On Jan 23, 2008 6:08 PM, [EMAIL PROTECTED] wrote: Dear All, I am currently using R in Windows PC with a 2 GB of RAM. Some pretty large datasets are expected soon, perhaps in an order of several GB. I am facing a similar situation like Ralph, either to get a new PC with a bigger RAM or else. I am just wondering if R is getting faster in other systems like UNIX or Linux. Any suggestions are appreciated. Regards, Jin Jin Li, PhD Spatial Modeller/ Computational Statistician Marine Coastal Environment Geoscience Australia Ph: 61 (02) 6249 9899 Fax: 61 (02) 6249 9956 email: [EMAIL PROTECTED] -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Prof Brian Ripley Sent: Thursday, 24 January 2008 12:05 To: Ralph79 Cc: r-help@r-project.org Subject: Re: [R] Problems with XP32-3GB-patch?/ Worth upgrading to Vista X64? On Wed, 23 Jan 2008, Ralph79 wrote: Dear R-Users, as I will start a huge simulation in a few weeks, I am about to buy a new and fast PC. I have noticed, that the RAM has been the limiting factor in many of my calculations up to now (I had 2 GB in my old system, but Windows still used quite a lot of virtual memory), hence my new computer will have 4 GB of fast DDR2-800 RAM. However, I know that 1.) Windows 32 bit cannot make use of more than about 3,2 GB RAM and 2.) it is normally not allowed to allocate more than 2 GB of RAM to one single application (at least under XP, I don't know if that has changed under Vista?). I remember from the R-FAQ that you can manually adjust XP so that it allocates up to 3 GB to one application (the 3GB patch), but I read in a PC-magazine and some message boards that this may cause problems. Does anybody of you successfully use this trick without any problems? Yes, many people: most 32-bit Exchange servers use it. Please don't rate the advice in the R documentation below tittle-tattle you read on the web. Would it be wise to use a 64bit OS, as e.g. Vista X64? 
I think, under Vista X64 it should be no problem to allocate 4 GB of RAM to R. Any experiences with that? That's what the rw-FAQ says, and we do write answers based on experience! Thanks in advance, Ralph Wirth - Ralph Wirth University Erlangen-Nuremberg, Chair of Statistics GfK Group, Department of Methods and Product Development -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide
Re: [R] Conditional rows
That works beautifully. Why does using test<=0.2 || test 0.3 give an error?

-----Original Message-----
From: Gabor Csardi [mailto:[EMAIL PROTECTED]
Sent: Monday, February 11, 2008 18:27
To: Ng Stanley
Cc: r-help
Subject: Re: [R] Conditional rows

    which(apply(test<=0.2, 1, all))

See ?which, ?all, and in particular ?apply.

Gabor

On Mon, Feb 11, 2008 at 06:22:09PM +0800, Ng Stanley wrote:
> Hi,
> Given a simple example,
>
>     test <- matrix(c(0.1, 0.2, 0.1, 0.2, 0.1, 0.1, 0.3, 0.1, 0.1), 3, 3)
>
> How can I generate the row indexes for which all the corresponding row values are less than or equal to 0.2? For this example, rows 2 and 3 are the correct ones.
>
> Thanks

--
Csardi Gabor [EMAIL PROTECTED] UNIL DGM
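On the follow-up question: `||` is intended for single TRUE/FALSE values (recent versions of R raise an error when it is given longer vectors), whereas `|` works element by element and so returns a logical matrix that apply() can process row by row. A short sketch using the matrix from the thread; the second condition (`>= 0.3`) is a hypothetical choice made only to illustrate combining two tests.

```r
test <- matrix(c(0.1, 0.2, 0.1, 0.2, 0.1, 0.1, 0.3, 0.1, 0.1), 3, 3)

# Rows where every value is <= 0.2 (the answer given in the thread).
rows_all_small <- which(apply(test <= 0.2, 1, all))

# Combining two conditions: use the element-wise operator `|`, not `||`.
# The resulting object is a logical matrix with the same shape as 'test'.
either <- test <= 0.1 | test >= 0.3   # hypothetical second condition
rows_either <- which(apply(either, 1, all))

print(rows_all_small)
print(rows_either)
```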
Re: [R] Using R in a university course: dealing with proposal comments
Dear Arin, -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] project.org] On Behalf Of Arin Basu Sent: February-10-08 10:41 PM To: r-help@r-project.org Subject: [R] Using R in a university course: dealing with proposal comments Hi All, I am scheduled to teach a graduate course on research methods in health sciences at a university. While drafting the course proposal, I decided to include a brief introduction to R, primarily with an objective to enable the students to do data analysis using R. It is expected that enrolled students of this course have all at least a formal first level introduction to quantitative methods in health sciences and following completion of the course, they are all expected to either evaluate, interpret, or conduct primary research studies in health. The course would be delivered over 5 months, and R was proposed to be taught as several laboratory based hands-on sessions along with required readings within the coursework. The course proposal went to a few colleagues in the university for review. I received review feedbacks from them; two of them commented about inclusion of R in the proposal. In quoting parts these mails, I have masked the names/identities of the referees, and have included just part of the relevant text with their comments. Here are the comments: Comment 1: In my quick glance, I did not see that statistics would be taught, but I did see that R would be taught. Of course, R is a statistics programme. I worry that teaching R could overwhelm the class. Or teaching R would be worthless, because the students do not understand statistics. (Prof LR) As others have pointed out, this is potentially a valid point, but it is applicable to all statistical software. I use R in several different courses for social-science undergraduates and grad students, but the focus is on the statistical methods, with R as a tool. In introductory courses, I use the Rcmdr package to simplify students' interaction with R. 
Beyond that level, I want students to learn to use R as a practical tool for data analysis, so I teach them to write commands. In all courses, students have much more difficulty with the substantive course content than with R, which they pick up readily. Comment 2: Finally, on a minor point, why is R the statistical software being used? SPSS is probably more widely available in the workplace - certainly in areas of social policy etc. (Prof NB) I don't have concrete data on this, and I'm sure that usage varies by field, but I'd bet that R is now more widely used overall (and internationally) than SPSS. Moreover, it wouldn't take students long to learn to point-and-click their way through SPSS if they have to use it in future. I hope this helps, John I am interested to know if any of you have faced similar questions from colleagues about inclusion of R in non-statistics based university graduate courses. If you did and were required to address these concerns, how you would respond? TIA, Arin Basu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] RGTK2 and glade on Windows - GUI newbie
Yes, a GUI based on GTK+ (with or without Glade) will work on Windows XP. If what you want to do is relatively straightforward (say, without any fancy formatting, or advanced event handling) then you should consider gWidgets. Look at the vignette in the gWidgets package. If you do decide to go with RGtk2 directly rather than gWidgets, look at demo(package=RGtk2). If you are using Glade, you might want to look at the source code for Rattle, which is a GUI built with Glade: see http://rattle.googlecode.com/ and http://datamining.togaware.com/survivor/Installation_Details.html (Another example is hydrosanity: http://hydrosanity.googlecode.com/) By the way, there is a special mailing list for GUI issues: https://stat.ethz.ch/mailman/listinfo/r-sig-gui Felix On Mon, Feb 11, 2008 at 9:52 PM, Anja Kraft [EMAIL PROTECTED] wrote: Hallo, I'd like to write a GUI (first choice with GTK+). I've surfed through the R- an Omegahat-Pages, because I'd like to use RGTK2, GTK 2.10.11 in combination with glade on Windows XP (perhaps later Unix, Mac). I've found a lot of different information. Because of the information I'm not sure, if this combination is running on Windows XP and I'm unsure how it works. Is there anyone, who has experience with this combination (if it works) and could tell me, where I could find something like a tutorial, how this combination is used together and how it works? Thank you very much, Anja Kraft __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
-- Felix Andrews / 安福立 PhD candidate Integrated Catchment Assessment and Management Centre The Fenner School of Environment and Society The Australian National University (Building 48A), ACT 0200 Beijing Bag, Locked Bag 40, Kingston ACT 2604 http://www.neurofractal.org/felix/ 3358 543D AAC6 22C2 D336 80D9 360B 72DD 3E4C F5D8 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] PDF with computationally expensive normalizing constant
Hi,

I am writing some functionality for a multivariate PDF. One problem is that evaluating the normalizing constant (NC) is massively computationally intensive [one recent example took 4 hours, and bigger examples would take much, much longer], and it would be good to allow for this in the design of the package somehow.

For example, the likelihood function doesn't need the NC but (e.g.) the moment generating function does. So a user wanting a maximum-likelihood estimate shouldn't have to evaluate the NC, but a user wanting a mean has to. Some simple forms of the PDF have an easily evaluated analytical expression for the NC. And once the NC is evaluated, it would be good to store it somehow.

I thought perhaps I could define an S4 class with a slot for the parameters and a slot for the NC; if the NC is unknown, this slot would hold an NA. Then a user could execute something like

    a <- CalculateNormalizingConstant(a)

and after this, object a would have the numerically computed NC in place.

Is this a Good Idea? Are there any PDFs implemented in R in which this is an issue?

--
Robin Hankin
Uncertainty Analyst and Neutral Theorist, National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
tel 023-8059-7743
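The design described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name `CostlyPdf`, the slot names, and the toy density (an unnormalized Gaussian kernel whose constant is cheap to integrate) are all hypothetical; only the function name CalculateNormalizingConstant comes from the post.

```r
library(methods)

# Slot 'nc' defaults to NA, meaning "not computed yet".
setClass("CostlyPdf",
         representation(params = "numeric", nc = "numeric"),
         prototype(nc = NA_real_))

setGeneric("CalculateNormalizingConstant",
           function(object, ...) standardGeneric("CalculateNormalizingConstant"))

setMethod("CalculateNormalizingConstant", "CostlyPdf",
          function(object, ...) {
            # Only do the expensive work if the constant is still unknown.
            if (is.na(object@nc)) {
              rate <- object@params[["rate"]]
              # Stand-in for the expensive step: integrate exp(-rate * x^2).
              object@nc <- integrate(function(x) exp(-rate * x^2),
                                     lower = -Inf, upper = Inf)$value
            }
            object
          })

a <- new("CostlyPdf", params = c(rate = 2))
a <- CalculateNormalizingConstant(a)   # computes and stores the constant
a <- CalculateNormalizingConstant(a)   # second call reuses the stored value
print(a@nc)
```

Because S4 objects have copy semantics, the result has to be assigned back to `a`, exactly as in the post; functions that need the constant (a mean, say) can check `is.na(object@nc)` and either compute it or stop with an informative message.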
Re: [R] Dendrogram for agglomerative hierarchical clustering result
Hi Risto,

You could try

    example(dendrogram)

best wishes
Wolfgang

noorpiilur scripsit:
> Hey group,
>
> I have a problem drawing a dendrogram from the result of my program written in C. My algorithm is an approximation algorithm for the single linkage method. As a result I get the following data:
>
>     [Average distance] [cluster A] [cluster B]
>
> For example:
>
>     42.593141    1   26
>     42.593141    4    6
>     42.593141  123  124
>     42.593141    4  113
>     74.244206    1  123
>     74.244206    4  133
>     74.244206    1   36
>
> So far I have used C to generate a bitmap output, but I would like to use the computed result as an input for R to just draw the dendrogram. As I'm new to R any help is appreciated.
>
> Thanks,
> Risto
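One way to feed externally computed merges into R, offered as a sketch rather than a recipe for the exact file above: assemble an object with the same components that hclust() returns (merge, height, order, labels) and convert it with as.dendrogram(). The four-point example is hypothetical; the C program's cluster IDs would first have to be translated into the hclust convention, where a negative entry is an original observation and a positive entry refers to the row of an earlier merge.

```r
# Hypothetical merge history for 4 points:
#   step 1 joins points 1 and 2, step 2 joins points 3 and 4,
#   step 3 joins the clusters formed in steps 1 and 2.
merge_steps <- rbind(c(-1L, -2L),
                     c(-3L, -4L),
                     c( 1L,  2L))

tree <- list(
  merge  = merge_steps,
  height = c(42.593141, 42.593141, 74.244206),  # distance at each merge
  order  = 1:4,                                 # left-to-right leaf order
  labels = c("p1", "p2", "p3", "p4")
)
class(tree) <- "hclust"

dend <- as.dendrogram(tree)
# plot(dend)   # uncomment to draw the tree on a graphics device
```

The `order` vector must list the leaves so that merged clusters sit next to each other, otherwise the branches of the drawn tree will cross.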
Re: [R] local variance estimation using gam or locfit
Hi,

I would appreciate it if anyone could give me clues about the following problem. I have map data x, y, z, and d, where (x, y) is the coordinate of a point, d is the distance from the urban center (0, 0), and z is population density. I would like to calculate local standard deviations at these points. Say, hypothetically,

    x <- rnorm(100)
    y <- rnorm(100)
    z <- runif(100)
    d <- sqrt(x^2+y^2)*runif(100,1,1.5)
    mod <- gam(z~s(x,y,by=d))
    std.res.loc <- residuals/loc.std

So I would like to calculate loc.std. Is there a function available for this, or should I compute it manually? I am reading Generalized Additive Models: An Introduction with R by Dr. Wood.

Thank you very much.
Tk
Re: [R] Help with write.csv
> I am new to R. I am using the impute package with data contained in a csv file. I have followed the example in the impute package as follows:
>
>     mydata = read.csv("sample_impute.csv", header = TRUE)
>     mydata.expr <- mydata[-1,-(1:2)]
>     mydata.imputed <- impute.knn(as.matrix(mydata.expr))
>
> The impute is successful. Then I try to write the imputation results (mydata.imputed) to a csv file as follows:
>
>     write.csv(mydata.imputed, file = "sample_imputed.csv")
>     Error in data.frame(data = c(-0.07, -1.22, -0.09, -0.6, 0.65, -0.36, 0.25, :
>       arguments imply differing number of rows: 18, 1, 0

When you use write.csv, the object that you are writing to a file must look something like a data frame or a matrix, i.e. a rectangle of data. The error message suggests that different columns of the thing you are trying to write have different numbers of rows. This means that mydata.imputed isn't the matrix it is supposed to be. You'll have to do some detective work to figure out what mydata.imputed really is. Try this:

    mydata.imputed
    class(mydata.imputed)
    dim(mydata.imputed)

Then you need to see why mydata.imputed isn't a matrix. Here there are two possibilities:

1. There are some lines of code that you didn't tell us about, where you overwrote mydata.imputed with another value.
2. The impute wasn't as successful as you thought.

Regards,
Richie.

Mathematical Sciences Unit
HSL
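The error text itself hints at a third possibility: `data.frame(data = ..., ...)` with row counts 18, 1 and 0 looks like a list whose first component is named `data`. If, as that suggests, impute.knn() returns a list with the imputed matrix in `$data`, then writing that component should work. The sketch below does not call the impute package; it mimics such a return value with a hypothetical list, and the component names other than `data` are assumptions.

```r
# Hypothetical stand-in for the object returned by the imputation step:
# a list whose 'data' component is the matrix, plus unrelated bookkeeping.
imputed <- list(
  data      = matrix(c(-0.07, -1.22, -0.09, -0.6, 0.65, -0.36), nrow = 3),
  rng.seed  = 362436069,
  rng.state = NULL
)

# The detective work: a list, not a matrix, so dim() is NULL.
print(class(imputed))
print(dim(imputed))
str(imputed)

# Writing only the rectangular component succeeds.
out_file <- tempfile(fileext = ".csv")
write.csv(imputed$data, file = out_file)

# Read it back to confirm the shape survived the round trip.
round_trip <- read.csv(out_file, row.names = 1)
print(dim(round_trip))
```

In the original session the corresponding call would be something like write.csv(mydata.imputed$data, file = "sample_imputed.csv"), provided str(mydata.imputed) confirms that a `data` component exists.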
[R] svm: is this right?
Hi,

I have a question on using svm{e1071} for a classification task. No matter how I split the data into training and test sets, I always end up with perfect accuracy in training but sensitivity = 0 in test. One example looks like this:

            1    2
       1  209    0
       2    0   67

    pred1   1    2
        1  47    0
        2  17    0

My question is: is there anything wrong with the following calls?

    m2 <- best.svm(class~., data=x1, gamma=2^(-3:3), cost=2^(0:5))  # x1 is training data
    pred1 <- predict(m2, x3)  # x3 is test data

Thanks!

--
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.

Did you always know? No, I did not. But I believed... ---Matrix III
Re: [R] learning S4
Christophe,

You might find the Brobdingnag package on CRAN helpful here. I wrote the package partly to teach myself S4; it includes a vignette that builds the various S4 components from scratch, in a step-by-step annotated cookbook.

HTH
rksh

On 8 Feb 2008, at 15:30, [EMAIL PROTECTED] wrote:
> Hi the list.
>
> I am trying to learn S4 programming. I found the wiki and several documents, but I still have a few questions...
>
> 1. To define 'representation', we can use two syntaxes:
>    - representation=list(temps = 'numeric',traj = 'matrix')
>    - representation(temps = 'numeric',traj = 'matrix')
>    Is there any difference?
> 2. 'validityMethod' checks the initialisation of a new object, but not later modifications. Is it possible to set up a validation that checks every modification?
> 3. When we use setMethod('initialize',...), does the validityMethod become unused?
> 4. Is it possible to set up several initialization processes? One that builds an object from a data.frame, one from a matrix...
>
> Thanks
> Christophe

--
Robin Hankin
Uncertainty Analyst and Neutral Theorist, National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
tel 023-8059-7743
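On question 2 in the quoted message, a small sketch may help show what actually happens. It reuses the slot names from the post (temps, traj); the class name `Trajectory` and the particular validity rule are assumptions. The validity function runs when new() builds the object, but a direct slot assignment only checks the class of the new value, so the full check has to be requested with validObject().

```r
library(methods)

setClass("Trajectory",
         representation(temps = "numeric", traj = "matrix"),
         validity = function(object) {
           # Hypothetical rule: one time point per column of 'traj'.
           if (length(object@temps) != ncol(object@traj)) {
             "length(temps) must equal ncol(traj)"
           } else {
             TRUE
           }
         })

# Validity is checked here, at construction time.
ok <- new("Trajectory", temps = c(1, 2, 3), traj = matrix(0, nrow = 2, ncol = 3))

# Slot assignment is accepted even though it breaks the rule ...
ok@temps <- c(1, 2)

# ... and the problem only surfaces when validObject() is called explicitly.
recheck <- tryCatch(validObject(ok), error = function(e) "invalid")
print(recheck)
```

A common pattern is therefore to modify objects only through replacement methods that call validObject() before returning, rather than assigning to slots directly.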
Re: [R] Tinn-R not working well with latest R
I can easily get R to open without an error. I simply removed the Tinn-R related lines from the Rprofile.site file

    C:\Program Files\R-2.6.2\etc\Rprofile.site

but then when I try to load the svIDE library manually by entering

    library(svIDE)

from the command line, I get a similar error. So when you say "Than paste in the command", what command are you referring to? What do you change it to?

Schmitt, Corinna [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]
> Hallo,
>
> I had the same problems before. I think the best solution is to copy the code you need out of Tinn-R with Ctrl+C. Then open R directly from your desktop, NOT from Tinn-R. Then paste in the command. As long as you have not pressed Enter, you can still change the command: use the arrow keys to put the cursor where you want in the command line and edit it. I hope that is what you want. I cannot reproduce your example.
>
> Corinna
Re: [R] R programming style
I second that, Code Complete is a great book! For anyone interested in improving their code, no matter what the language (it has a C++/Java-type focus but is definitely applicable to R), it would be a good place to start. I've read some negative reviews claiming that everything he writes is 'obvious' (use good variable names, short concise functions, limit nested conditionals, etc.), but on more than one occasion I've gone back over the book and thought of new places to improve my code. HTH, John
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Earl F. Glynn Sent: Monday, February 11, 2008 2:30 PM To: [EMAIL PROTECTED] Subject: Re: [R] R programming style
David Scott [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] Can anyone provide further pointers to good style?
While not written for R specifically, the book Code Complete: A Practical Handbook of Software Construction (2nd Edition) discusses a number of good concepts for writing good code in any language: http://www.amazon.com/Code-Complete-Practical-Handbook-Construction/dp/0735619670 In particular, Part IV, Statements, gives a number of useful suggestions by type of statement, e.g., straight-line code, conditionals, loops, ...
There are some practices used in R that I think should be improved. For example, many years ago I was taught in a software engineering class that the use of magic numbers was a bad practice, yet we find magic numbers used in R in many places. Instead of using 1 or 2 in an apply, I'll write something like this, trying for some sort of mnemonic: apply(x, BY.ROW<-1, sum) or apply(z, BY.COL<-2, mean). I find BY.ROW or BY.COL to be more mnemonic than the magic numbers 1 and 2. The sides 1, 2, 3, and 4 in an axis statement should have some sort of mnemonic definition, too, perhaps: axis(BOTTOM<-1, ...) But I believe I was ostracized in this E-mail list the last time I suggested such mnemonics instead of magic numbers.
efg Earl F. Glynn Bioinformatics Stowers Institute for Medical Research
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
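A minimal sketch of the mnemonic idea in the message above, with the constants defined once up front rather than assigned inside the call. BY.ROW, BY.COL and BOTTOM are the illustrative names from the message; LEFT, TOP and RIGHT are added here as an assumption, following the side numbering given in ?axis. None of these names is built into R.

```r
# Named constants in place of the "magic numbers" 1 and 2 for apply() margins
BY.ROW <- 1L
BY.COL <- 2L

# Named constants for the side argument of axis() (1 = below, 2 = left,
# 3 = above, 4 = right, as documented in ?axis)
BOTTOM <- 1L
LEFT   <- 2L
TOP    <- 3L
RIGHT  <- 4L

m <- matrix(1:6, nrow = 2)            # rows are (1, 3, 5) and (2, 4, 6)

row_sums  <- apply(m, BY.ROW, sum)    # 9 12
col_means <- apply(m, BY.COL, mean)   # 1.5 3.5 5.5

print(row_sums)
print(col_means)

# In a plot one could then write, for example:
#   plot(1:10, axes = FALSE); axis(BOTTOM); axis(LEFT)
```

Defining the constants separately also avoids the side effect of assigning a variable in the calling environment every time apply() is called.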
Re: [R] Viable Approach to Parallel R?
We've also had substantial success with the Condor project [http://www.cs.wisc.edu/condor/], not just with R, but as a generic computation grid. John
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Lewis, Daniel (IS Consultant) Sent: Monday, February 11, 2008 1:09 PM To: r-help@r-project.org Subject: [R] Viable Approach to Parallel R?
All, We are researching approaches to parallel R with the end goal of running R in a distributed manner on a Linux cluster. We expect of course to do some work decomposing our problems to be task-parallel or data-parallel, but wouldn't mind getting an initial boost working with embarrassingly parallel code sections and one of the approaches below. Incidentally, our environment includes R 2.6.1, RHEL 5.1, Solaris 10, SGE (Sun Grid Engine) and OpenMPI 1.2.4 (SunHPC 7.1). In researching previous work, the most promising approaches seem to be:
A. Snow (with Rmpi or Rpvm) (as described in http://www.r-project.org/useR-2006/Slides/Harrington+Salibian-Barrera.pdf from the 2006 R User Conference). It is my understanding that this approach is viable, and works with OpenMPI 1.2.4. Is anyone using this method with good results?
B. taskpR, RScaLAPACK, pMatrix. I read a paper, http://sdm.lbl.gov/sdmcenter/projects/SDM.center.parallel.r.2-pager.4.doc, coming out of ORNL, describing what they call parallel R, which included taskpR, RScaLAPACK and pMatrix. I notice that taskpR is no longer available in contrib, nor is pMatrix. An old link indicates the packages are available at http://www.ASPECT-SDM.org/Parallel-R but that site displays a notice that the server is migrating. Has this work been discontinued? Is anyone using this? I see RScaLAPACK is still available; from reading the above it seems that it was bundled with taskpR. Does it function without the other components? (Guess I'll try it and find out :)
C. Sleigh NetworkSpaces. I see that SCAI (Scientific Computing Associates) offers a parallel R package based on something they call NetworkSpaces and Sleigh (inspired by Snow). They sell services around the product but it is open source. They also have an enhanced version for which they sell support. http://www.lindaspaces.com/hp/BenchmarksWithCharts.pdf. Has anyone investigated this approach or its open source components?
TIA for any information, direction, suggestions, and if I've missed any other approaches please advise. Dan Lewis
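For the embarrassingly parallel case mentioned in the question, a minimal sketch of the snow-style workflow (approach A) might look like the following. It assumes a current R installation, where the snow interface is also available through the bundled parallel package, and it starts two local worker processes rather than an MPI cluster. The squaring function is only a stand-in for real work.

```r
library(parallel)   # ships with R; provides the snow-style cluster functions

# Start two local worker processes (an assumption for illustration; a real
# cluster would list remote hosts or use an MPI-backed cluster instead)
cl <- makeCluster(2)

# Farm out independent tasks: each element of 1:8 is handled by some worker
squares <- parLapply(cl, 1:8, function(i) i^2)

# Always release the workers when done
stopCluster(cl)

result <- unlist(squares)
print(result)
```

The same three steps (create the cluster, apply a function over independent pieces, stop the cluster) carry over when the workers live on other machines; only the cluster creation call changes.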
Re: [R] Using R in a university course: dealing with proposal comments
Neil Shephard wrote: (Most) of this problem isn't negated when using R. Start a new job and use the (excellent, extensible, and free) software that you've been using for years.
Apologies for the double negative, that should have read: (Most) of this problem _is_ negated when using R. Neil
Re: [R] how to generate a column based on other columns in a data frame
Henrique Dallazuanna wrote: Try this:
x2 <- merge(x, cbind(unique(x), Site=sprintf("S%d", seq_len(nrow(unique(x))))), by=c("X", "Y"))
x2[order(x2$Site), ]
That was (close to) my first thought as well. But what about
site <- with(x, interaction(X, Y, drop=TRUE))
levels(site) <- paste("S", seq_len(length(levels(site))), sep="")
-p
On 11/02/2008, Weidong Gu [EMAIL PROTECTED] wrote: Hi, I am working on a data set with multiple collections of mosquitoes at sampling sites. Each row represents a collection of individual samples with coordinates for each collection.
   ...  X       Y       ...
1       36.435  30.118
2       36.435  30.118
3       36.435  30.118
4       35.329  29.657
5       35.329  29.657
6       36.431  30.111
7       36.431  30.111
8       35.421  29.797
9       35.421  29.797
10      35.421  29.797
Unfortunately, there is no 'site' entry. I would like to add a 'site' column based on the coordinates of the samples, so that samples from the same site have the same site ID, like S1, S2, ... How can I do this the R way? Thanks. Weidong Gu, Department of Medicine, University of Alabama, Birmingham, 1900 University Blvd., Birmingham, Alabama 35294 Email: [EMAIL PROTECTED] PH: (205)-975-9053
-- Peter Dalgaard, Øster Farimagsgade 5, Entr.B, Dept. of Biostatistics, PO Box 2099, 1014 Cph. K, University of Copenhagen, Denmark. Ph: (+45) 35327918, FAX: (+45) 35327907 ([EMAIL PROTECTED])
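For reference, here is a self-contained sketch of the site-labelling task using the ten coordinate pairs from the question. It is a third variant, not either poster's code: it numbers sites in order of first appearance by matching each coordinate pair against the unique pairs. The column name Site and the helper variable key are assumptions made for the example.

```r
# The ten coordinate pairs quoted in the question
x <- data.frame(
  X = c(36.435, 36.435, 36.435, 35.329, 35.329,
        36.431, 36.431, 35.421, 35.421, 35.421),
  Y = c(30.118, 30.118, 30.118, 29.657, 29.657,
        30.111, 30.111, 29.797, 29.797, 29.797)
)

# One text key per row; rows with identical coordinates share a key
key <- paste(x$X, x$Y)

# match() gives the position of each key among the unique keys,
# so sites are numbered in order of first appearance
x$Site <- paste0("S", match(key, unique(key)))

print(x)
```

With the interaction() approach the site numbers follow the sorted factor levels instead, so the same four sites are found but S1 need not be the first site in the file.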
Re: [R] [OT] good reference for mixed models and EM algorithm
Hi Doug and Ted, The multivariate Aitken accelerator suggested by Ted is numerically ill-conditioned. I have written a globally-convergent, general-purpose EM accelerator that works well. It is quite simple to implement for any EM-type algorithm (e.g. ECM, ECME, which are all monotone in likelihood). My paper on that should be coming out soon in Scandinavian J of Stats. I would be interested in helping with its implementation for EM acceleration in large data sets with non-nested random effects. Best, Ravi.
--- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health, Division of Geriatric Medicine and Gerontology, Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ted Harding Sent: Monday, February 11, 2008 11:19 AM To: r-help@r-project.org Subject: Re: [R] [OT] good reference for mixed models and EM algorithm
On 11-Feb-08 15:07:37, Douglas Bates wrote: [...] Except that Doug Bates doesn't use the EM algorithm for fitting mixed models any more. The lme4 package previously had an option for starting with EM (actually ECME, which is a variant of EM) iterations but I have since removed it. For large data sets and especially for models with non-nested random effects, the EM iterations just slowed things down relative to direct optimisation of the log-likelihood. [...]
The raw EM Algorithm can be slow. I have had good success using Aitken Acceleration for it. The basic principle is that, once an iterative algorithm gets to a stage where (approximately)
[A] (X[n+1] - X) = k*(X[n] - X)
where X[n] is the result at the n-th iteration, -1 < k < 1, and X is the limit, then you can use recent results to predict the limit. Taking the above equation literally, along with its analogue for the next step:
[B] (X[n+2] - X) = k*(X[n+1] - X)
from which k = (X[n+2] - X[n+1])/(X[n+1] - X[n]) and then
[C] X = (X[n+1] - k*X[n])/(1 - k).
If X is multidimensional (say dimension = p), then k is a pxp matrix, and you want all its eigenvalues to be less than 1 in modulus. Then you use the matrix analogues of the above equations, based on the successive iterates X[n], X[n+1], ... , X[n+p+1], i.e. on the (p+1) successive differences X[n+1]-X[n], X[n+2]-X[n+1], ... , X[n+p+1]-X[n+p] (each a p-vector). I have had good experience with this too! The best method of proceeding is:
Stage 1: Monitor the sequence {X[n]} until it seems that equation [A] is beginning to be approximately true;
Stage 2: Apply equations [A], [B], [C] to estimate X.
Stage 3: Starting at this X, run a few more iterations so that you get a better (later) estimate of k, and then apply [A], [B], [C] again to re-estimate X. Repeat stage 3 until happy (or bored).
The EM Algorithm, in most cases, falls into the class of procedures to which Aitken Acceleration is applicable. Best wishes to all, Ted.
E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 Date: 11-Feb-08 Time: 16:18:45 -- XFMail --
Re: [R] Using R in a university course: dealing with proposal comments
Stas Kolenikov wrote: ... Training researchers of tomorrow might be great, but if your students get on the market at the end of the semester, they won't have the luxury of waiting until R becomes THE package of choice.
Not being a teacher, I usually follow these discussions with a bit of amusement and some befuddlement. We hire young people hoping they will bring in bright new ideas from academia, and academics are training the students based on what they think are the old things we use. Fortunately, R is already one of the packages of choice in many places. Another point that needs more emphasis is that R is actually a programming language, like Matlab and APL, so it really has more general usefulness than statistics packages that one might use in the narrower context of a statistics course. Paul Gilbert
Re: [R] [OT] good reference for mixed models and EM algorithm
On 11-Feb-08 15:07:37, Douglas Bates wrote: [...] Except that Doug Bates doesn't use the EM algorithm for fitting mixed models any more. The lme4 package previously had an option for starting with EM (actually ECME, which is a variant of EM) iterations but I have since removed it. For large data sets and especially for models with non-nested random effects, the EM iterations just slowed things down relative to direct optimisation of the log-likelihood. [...]
The raw EM Algorithm can be slow. I have had good success using Aitken Acceleration for it. The basic principle is that, once an iterative algorithm gets to a stage where (approximately)
[A] (X[n+1] - X) = k*(X[n] - X)
where X[n] is the result at the n-th iteration, -1 < k < 1, and X is the limit, then you can use recent results to predict the limit. Taking the above equation literally, along with its analogue for the next step:
[B] (X[n+2] - X) = k*(X[n+1] - X)
from which k = (X[n+2] - X[n+1])/(X[n+1] - X[n]) and then
[C] X = (X[n+1] - k*X[n])/(1 - k).
If X is multidimensional (say dimension = p), then k is a pxp matrix, and you want all its eigenvalues to be less than 1 in modulus. Then you use the matrix analogues of the above equations, based on the successive iterates X[n], X[n+1], ... , X[n+p+1], i.e. on the (p+1) successive differences X[n+1]-X[n], X[n+2]-X[n+1], ... , X[n+p+1]-X[n+p] (each a p-vector). I have had good experience with this too! The best method of proceeding is:
Stage 1: Monitor the sequence {X[n]} until it seems that equation [A] is beginning to be approximately true;
Stage 2: Apply equations [A], [B], [C] to estimate X.
Stage 3: Starting at this X, run a few more iterations so that you get a better (later) estimate of k, and then apply [A], [B], [C] again to re-estimate X. Repeat stage 3 until happy (or bored).
The EM Algorithm, in most cases, falls into the class of procedures to which Aitken Acceleration is applicable. Best wishes to all, Ted.
E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 Date: 11-Feb-08 Time: 16:18:45 -- XFMail --
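A minimal scalar sketch of the extrapolation step described in the message above. Given three successive iterates, it estimates k as the ratio of successive differences and then solves the geometric-convergence relation [B] for the limit, which gives X = (X[n+2] - k*X[n+1])/(1 - k). The function name aitken_limit and the two test sequences are assumptions chosen for illustration; this is not an EM implementation and does not cover the matrix case.

```r
# Extrapolate the limit of a linearly converging scalar sequence
# from three successive iterates x0, x1, x2.
aitken_limit <- function(x0, x1, x2) {
  k <- (x2 - x1) / (x1 - x0)   # estimated contraction factor
  (x2 - k * x1) / (1 - k)      # solve (x2 - X) = k * (x1 - X) for X
}

# Example 1: X[n] = 2 + 0.5^n satisfies relation [A] exactly (k = 0.5, X = 2),
# so three terms are enough to recover the limit.
geo <- 2 + 0.5^(0:2)
geo_limit <- aitken_limit(geo[1], geo[2], geo[3])

# Example 2: the fixed-point iteration x <- cos(x), which converges slowly.
x <- numeric(11)
x[1] <- 1
for (n in 1:10) x[n + 1] <- cos(x[n])
accelerated <- aitken_limit(x[9], x[10], x[11])

# Reference value for comparison, from a root finder
fixed_point <- uniroot(function(t) cos(t) - t, c(0, 1), tol = 1e-12)$root

print(c(last_iterate = x[11], accelerated = accelerated, reference = fixed_point))
```

In the second example the extrapolated value should sit noticeably closer to the reference than the last raw iterate does, which is the behaviour Stage 2 relies on. If the estimated k is close to 1 the division becomes unstable, which is one reason to wait for Stage 1 before extrapolating.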
Re: [R] Length problem
I think that coppie is a list, so try: length(coppie[[1]])
On 11/02/2008, Paolo Grillo [EMAIL PROTECTED] wrote: Hi all, I have this problem. My .dta database, called data, has five rows:
data <- read.dta("C:\\2_CO_mmobile_ALL_Rid.dta")
# From this database I would like to create another
coppie <- c(data[4:length(data)])
but I find this:
# Length of original data
length(data[,4])
5 RIGHT!!
# Length of new data
length(coppie[1])
1 WHY??
Thank you all for your help, Paolo Grillo
-- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40" S 49° 16' 22" O
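The difference between single and double brackets can be reproduced without the original file. The sketch below uses a made-up five-row data frame named dat (an assumption standing in for the poster's .dta data) and follows the same steps as the question.

```r
# A stand-in data frame with five rows and five columns
dat <- data.frame(a = 1:5,
                  b = letters[1:5],
                  c = 5:1,
                  d = c(2.5, 3, 1, 4, 0.5),
                  e = 11:15)

# Columns 4 onwards, wrapped in c() as in the question: the result is a list
coppie <- c(dat[4:length(dat)])

len_single <- length(coppie[1])    # a list holding one column, so length 1
len_double <- length(coppie[[1]])  # the column itself, so length 5

print(c(single = len_single, double = len_double))
```

Single brackets always return a sub-list (here a list with one element), while double brackets extract the element itself, which is why only the second call reports the number of rows.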
Re: [R] genetics package not working
Finally I found something that provides lower level examples. I was looking around the genetics package. I came across write.pop.file(genetics) and there I found that the format of 'pedigree' files is documented at http://www.sph.umich.edu/csg/abecasis/GOLD/docs/pedigree.html That reference lays out exactly the format that is being used.
Farrel Buchinsky [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] From crawling around the internet it appears to me as if genetics has given way to GeneticsBase and is part of Bioconductor. The basic data structure has changed to something called the geneSet class. There is a pdf document that promises to help me: http://www.bioconductor.org/packages/2.1/bioc/vignettes/GeneticsBase/inst/doc/SummaryTables.pdf. Unfortunately it does not. My dataset, which was created using the genetics package, does not seem to fit (or should I say does not seem to easily fit) the read-in formats demonstrated in the document: standard pedigree format, hapmap format, Pfizer format, Perlegen format. Can anyone point me to a resource with lower level instructions and examples? My format is as follows (rs numbers are not correct but do not worry about that detail):
str(ped.seq[,2:15])
'data.frame': 608 obs. of 14 variables:
 $ pedigree : int 1 1 2 3 3 4 4 5 6 6 ...
 $ id       : Factor w/ 30 levels "1","2","3","4",..: 3 2 3 3 2 3 2 3 3 2 ...
 $ id.father: int 1 0 1 1 0 1 0 1 1 0 ...
 $ id.mother: int 2 0 2 2 0 2 0 2 2 0 ...
 $ PtCode   : Factor w/ 608 levels "AJM16001FA","AJM16001MO",..: 74 73 77 117 116 80 79 83 86 85 ...
 $ HS.nr    : int 32940 32941 32960 32963 32964 32967 32968 32970 32972 32973 ...
 $ affected : int 2 1 2 2 1 2 1 2 2 1 ...
 $ sex      : int 2 2 1 1 2 1 2 2 2 2 ...
 $ rs11684  : Factor w/ 1 level "C/C": 1 1 1 1 1 1 1 1 1 1 ...
  ..- attr(*, "allele.names")= chr "C"
  ..- attr(*, "allele.map")= chr [1, 1:2] "C" "C"
 $ rs1144   : Factor w/ 3 levels "A/A","G/A","G/G": 3 3 3 3 3 2 3 3 3 3 ...
  ..- attr(*, "allele.names")= chr "G" "A"
  ..- attr(*, "allele.map")= chr [1:3, 1:2] "A" "G" "G" "A" ...
 $ rs120    : Factor w/ 2 levels "A/A","A/G": 1 1 1 1 1 1 1 1 1 1 ...
  ..- attr(*, "allele.names")= chr "A" "G"
  ..- attr(*, "allele.map")= chr [1:2, 1:2] "A" "A" "A" "G"
Farrel Buchinsky [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] Has something changed in R that requires an update in the genetics package by Gregory Warnes? I am using R version 2.5.0. This used to work:
summary(founders[,59])
To prove that it is a genotype class:
class(founders[,59])
[1] "genotype" "factor"
Now when I issue the command
summary(founders[,59])
I get:
Error in attr(retval, which) <- which : attempt to set an attribute on NULL
In addition: Warning message: $ operator is deprecated for atomic vectors, returning NULL in: x$allele.names
Clearly, I am missing something. What am I missing? -- Farrel Buchinsky
Re: [R] [OT] good reference for mixed models and EM algorithm
On Feb 10, 2008 2:32 PM, Spencer Graves [EMAIL PROTECTED] wrote: Hi, Erin: Have you looked at Pinheiro and Bates (2000) Mixed-Effects Models in S and S-Plus (Springer)? As far as I know, Doug Bates has been the leading innovator in this area for the past 20 years. Pinheiro was one of his graduate students. The 'nlme' package was developed by him or under his supervision, and 'lme4' is his current development platform. The ~R\library\scripts subdirectory contains ch01.R, ch02.R, etc. = script files to work the examples in the book (where ~R = your R installation directory). There are other good books, but I recommend you start with Pinheiro and Bates.
Except that Doug Bates doesn't use the EM algorithm for fitting mixed models any more. The lme4 package previously had an option for starting with EM (actually ECME, which is a variant of EM) iterations but I have since removed it. For large data sets and especially for models with non-nested random effects, the EM iterations just slowed things down relative to direct optimization of the log-likelihood.
Spencer Graves
Erin Hodgess wrote: Dear R People: Sorry for the off-topic. Could someone recommend a good reference for using the EM algorithm on mixed models, please? I've been looking and there are so many of them. Perhaps someone here can narrow things down a bit. Thanks in advance, Sincerely, Erin
Re: [R] building packages for Linux vs. Windows
Erin Hodgess wrote: Hi R People: I'm sure that this is a really easy question, but here goes: I'm trying to build a package that will run on both Linux and Windows. However, there are several commands in a section that will be different in Linux than they are in Windows.
Erin, Several people have indicated how to do this, but I encourage you to be sure you really need to do it. Many things can be made to work the same way on all OSs, and packages are much easier to maintain if you do not have several variants. You might consider posting a few examples of where you find this necessary, and ask if there is an OS-independent way to do it. Paul Gilbert
Would I be better off just to build two separate packages, please? If just one is needed, how could I determine which system is running in order to use the correct command, please? Thanks in advance, Erin
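Both halves of the exchange above can be sketched in a few lines: detecting the running system when a branch is unavoidable, and preferring OS-independent helpers where they exist. The function name list_dir_command, the two shell commands and the file path components are hypothetical and only there to make the example concrete.

```r
# .Platform$OS.type is "windows" on Windows and "unix" on Linux, macOS, etc.
on_windows <- .Platform$OS.type == "windows"

# Branch only where the two systems really need different commands.
# (Hypothetical helper: returns the shell command used to list a directory.)
list_dir_command <- function() {
  if (on_windows) "dir" else "ls"
}

# Where possible, avoid the branch altogether: file.path() builds a path
# that R accepts on every platform.
data_path <- file.path("data", "raw", "samples.csv")

print(list_dir_command())
print(data_path)
```

Keeping the branch inside one small function means the rest of the package stays identical on both systems, which is the maintenance point made in the reply.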
[R] Tinn-R not working well with latest R
Hello, I had the same problems before. I think the best solution is to copy the code you need out of Tinn-R with Ctrl+C. Then open R directly from your desktop, NOT from Tinn-R, and paste in the command. As long as you have not pressed Enter, you can still change the command: use the arrow keys to put the cursor where you want in the command line and edit it. I hope that is what you want. I cannot reproduce your example. Corinna
[R] nlme special case of corARMA?
Dear All: I am trying to fit a special case of a 2-banded Toeplitz correlation structure. A 2-banded Toeplitz has ones on the diagonal, a correlation, RHO1, on the first off-diagonal, and a correlation, RHO2, on the second off-diagonal, with zeros on all subsequent off-diagonals. After reading relevant sections in Mixed-Effects Models in S and S-PLUS (Pinheiro & Bates, 2000) and searching the R-help archives, I've figured out how to get the 2-banded Toeplitz, but not the desired special case. In the example below, the initial values are RHO1 = 0 and RHO2 = -0.3. The output matrix below is an example of the special case I'd like to fit: a 2-banded Toeplitz constraining RHO1 = 0. Fitting the 2-banded Toeplitz structure to the "Orthodont" example dataset provided in R-Help, we estimate RHO1 also (since the example matrix below contains INITIAL values).
----- Start R-code output -----
# This initializes a 2-banded Toeplitz structure
cs1ARMA <- corARMA(value = c(0, -.3), form = ~ 1 | Subject, p = 2, q = 0)
cs1ARMA <- Initialize(cs1ARMA, data = Orthodont)
corMatrix(cs1ARMA)$M01
     [,1] [,2] [,3] [,4]
[1,]  1.0  0.0 -0.3  0.0
[2,]  0.0  1.0  0.0 -0.3
[3,] -0.3  0.0  1.0  0.0
[4,]  0.0 -0.3  0.0  1.0
TOEP2 <- gls(distance ~ Sex * I(age - 11), Orthodont,
    correlation = corARMA(value = c(0, -.3),
        form = ~ 1 | Subject, p = 2, q = 0),
    weights = varIdent(form = ~ 1 | age))
# ----- Selected output follows -----
Correlation Structure: ARMA(2,0)
 Formula: ~1 | Subject
 Parameter estimate(s):
     Phi1      Phi2
0.3269544 0.4897645
----- End R-code output -----
I cannot figure out how to restrict RHO1 = 0 while allowing estimation of RHO2. Maybe an answer lies in specifying a "position vector" other than the default: corARMA(..., form = ~ 1 | Subject ...). (See p. 226 of Pinheiro & Bates, 2000 for an explanation of a position vector.) But I'm not sure I understand the position vector, and I don't know how it works in R. Then again, there is likely a completely different way to solve this problem.
Any help will be appreciated! Reid
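To make the target structure concrete, the constrained matrix from the question can be written down directly with base R's toeplitz(). This sketch only constructs and inspects the 4 x 4 correlation matrix with RHO1 fixed at 0 and RHO2 = -0.3; it does not fit anything and is not a solution to the corARMA constraint problem. The variable names rho1, rho2 and R_target are assumptions.

```r
rho1 <- 0      # first off-diagonal, fixed at zero in the special case
rho2 <- -0.3   # second off-diagonal, the parameter one would like to estimate

# toeplitz() builds a symmetric matrix from its first row:
# lag 0, lag 1, lag 2, lag 3
R_target <- toeplitz(c(1, rho1, rho2, 0))
print(R_target)

# A valid correlation matrix must be positive definite
min_eigen <- min(eigen(R_target, symmetric = TRUE)$values)
print(min_eigen)
```

Checking the smallest eigenvalue is a quick way to see which values of RHO2 keep the constrained matrix a valid correlation matrix for four time points.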
Re: [R] Using R in a university course: dealing with proposal comments
Arin Basu-3 wrote: Comment 2: Finally, on a minor point, why is R the statistical software being used? SPSS is probably more widely available in the workplace – certainly in areas of social policy etc. (Prof NB) What struck me in the above is the "probably". How probable is it? Is there anything to substantiate the claim?
Anyway, whether one package is more widely available in the workplace than another is somewhat of a moot point. If a student learns how to use one software package then they start to get pigeon-holed into using that particular software package. Many jobs are advertised with SPSS/SAS/Stata/S-Plus (add/subtract at will) skills/knowledge required (or at least desirable). The prospective job applicant may think "Well, I don't know how to use that so I shan't bother applying", or they may be unwilling to re-learn how to use a new stats package after months/years of investment in learning another package; alternatively they may well just lose out to someone who already has the experience/skills. (Most) of this problem isn't negated when using R. Start a new job and use the (excellent, extensible, and free) software that you've been using for years. I'd stick with using R to teach your statistics; in the long run any of them who continue to perform statistical analysis will be grateful. Neil
[R] Gini index of frequencies in a data frame
Dear All, I wish to calculate the Gini index (ineq from same package) and some other indices for the diameter distribution of each plot (df dgtot).
dgtot:
        IDPlot  Diameter(cm)
1       4       34.0
2       4       23.0
3       4       38.0
...
51      5       16.0
52      5       8.0
53      5       9.0
...
5301    140     25.0
5302    140     12.0
5303    140     7.0
I use:
aggregate(dgtot, by=list(dgtot$IDSupr), FUN=ineq(dsp))
where
dsp <- function(x)  # compute frequency distribution for each plot
{
  cd <- seq(5, max(x), by=2)
  Fi <- table(cut(x, br = seq(5, max(x)+1, 2), right = FALSE))
  K <- length(names(Fi))
}
but the result was:
Error in sort.int(x, na.last = na.last, decreasing = decreasing, ...) : 'x' must be atomic
I'm a beginner in R and I kindly request your experienced help. Thank you, Marius Teodosiu
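A likely cause of the error above is that FUN must be given a function, whereas ineq(dsp) is evaluated on the spot with the function dsp as its data argument. The sketch below shows the per-plot pattern with tapply(), using only the nine rows quoted in the question. The column names IDPlot and Diameter and the hand-rolled gini() helper are assumptions for illustration; gini() implements the usual uncorrected sample Gini coefficient and should be compared against the ineq package before being relied on.

```r
# Uncorrected sample Gini coefficient of a numeric vector (hypothetical helper)
gini <- function(x) {
  x <- sort(x)
  n <- length(x)
  sum((2 * seq_len(n) - n - 1) * x) / (n * sum(x))
}

# The nine rows shown in the question (three plots, three trees each)
dgtot <- data.frame(
  IDPlot   = rep(c(4, 5, 140), each = 3),
  Diameter = c(34, 23, 38, 16, 8, 9, 25, 12, 7)
)

# Apply the function to the diameters of each plot separately.
# Note that the function itself is passed, not the result of calling it.
gini_by_plot <- tapply(dgtot$Diameter, dgtot$IDPlot, gini)
print(gini_by_plot)
```

The same pattern works with aggregate(Diameter ~ IDPlot, data = dgtot, FUN = gini), and any other index function that takes a numeric vector can be dropped in place of gini.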
Re: [R] tree() producing NA's
Take a look at the levels of 'owner'.
On Mon, 11 Feb 2008, Amnon Melzer wrote: Hi, Hoping someone can help me (a newbie). I am trying to construct a tree using tree() in package tree. One of the fields is a factor field (owner), with many levels. In the resulting tree, I see many NA's (see below), yet in the actual data there are none.
You are misinterpreting this: those are level names. Using a tree with a factor with many levels is a very bad idea: it takes a long time to compute (unless the response is binary) and almost surely overfits.
rr200.tr <- tree(backprof ~ ., rr200)
rr200.tr
1) root 200 1826.00 -0.2332
... [snip] ...
5) owner: Cliveden Stud,NA,NA,NA,NA,NA,NA,NA,NA 10 14.25 1.5870 *
3) owner: B E T Partnership,Flaming Sambuca Syndicate,NA,NA,NA,NA,NA,NA,NA,NA 11 384.40 10.5900
6) decodds < 12 5 74.80 6.3000 *
7) decodds > 12 6 140.80 14.1700 *
Can anyone tell me why this happens and what I can do about it?
Well, you could follow the request at the footer of this and every R-help message.
Regards, Amnon
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
-- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
[R] Any one is porting or has ported prtools (http://www.prtools.org/) to R ?
Hi, Is anyone porting, or has anyone ported, prtools (http://www.prtools.org/) to R? Thanks, Stanley
Re: [R] scatterplot in CAR
Dear Aimin,
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Aimin Yan Sent: February-11-08 7:22 AM To: r-help@r-project.org Subject: [R] scatterplot in CAR
I am trying to use the scatterplot function in CAR like the following: scatterplot(X~Y) I want to label X points and Y points using different colors. Any idea for this? Aimin
I'm afraid that I don't understand the question: scatterplot(X~Y) will make a scatterplot with the variable X on the vertical axis and Y on the horizontal axis. Did you really want to do that? Moreover, as in any scatterplot, the variables Y and X will define the coordinates of the points -- there are not distinct X points and Y points. Regards, John
John Fox, Professor Department of Sociology McMaster University Hamilton, Ontario, Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox
[R] scatterplot in CAR
I am trying to use the scatterplot function in CAR like the following: scatterplot(X~Y) I want to label X points and Y points using different colors. Any idea for this? Aimin
Re: [R] The function predict
Carla Rebelo crebelo at liaad.up.pt writes: May you help me? I need to understand the function predict. I need to understand the algorithm implemented, the calculations associated. Where can I find this information?
In the documentation: "predict is a generic function for predictions from the results of various model fitting functions. The function invokes particular methods which depend on the class of the first argument."
There is no information available for the default predict function, but there is information for the predict.XXX implementations mentioned further below: "See Also: predict.glm, predict.lm, predict.loess, predict.nls, predict.poly, predict.princomp, predict.smooth.spline. For time-series prediction, predict.ar, predict.Arima, predict.arima0, predict.HoltWinters, predict.StructTS."
For details, you should look into the examples provided with predict.lm (as the simplest starter), and the code. Dieter
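As a starting point for the advice above, this sketch shows how predict() dispatches on the class of the fitted object and how the lm method can be checked by hand. It uses the cars dataset that ships with R; the object names fit, new_data, by_predict and by_hand are assumptions for the example.

```r
# Fit a simple linear model on a dataset bundled with R
fit <- lm(dist ~ speed, data = cars)
print(class(fit))                      # the class decides which method runs

# predict() hands the call over to the method for class "lm"
new_data   <- data.frame(speed = c(10, 20))
by_predict <- predict(fit, newdata = new_data)

# For a linear model the calculation is intercept + slope * x
b       <- coef(fit)
by_hand <- b[["(Intercept)"]] + b[["speed"]] * new_data$speed

print(cbind(by_predict, by_hand))

# The method's source code can be retrieved and read like any other function
predict_lm <- getS3method("predict", "lm")
print(is.function(predict_lm))
```

Typing methods(predict) at the prompt lists every method available in the current session, and the same getS3method() call works for any of them, for example the glm or loess method.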