[R] rpart.object help
Hi,

Suppose I have generated an object using the following:

fit <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis)

And when I print fit, I get the following:

n= 81

node), split, n, loss, yval, (yprob)
      * denotes terminal node

 1) root 81 17 absent (0.7901235 0.2098765)
   2) Start>=8.5 62 6 absent (0.9032258 0.0967742)
     4) Start>=14.5 29 0 absent (1.000 0.000) *
     5) Start< 14.5 33 6 absent (0.8181818 0.1818182)
      10) Age< 55 12 0 absent (1.000 0.000) *
      11) Age>=55 21 6 absent (0.7142857 0.2857143)
        22) Age>=111 14 2 absent (0.8571429 0.1428571) *
        23) Age< 111 7 3 present (0.4285714 0.5714286) *
   3) Start< 8.5 19 8 present (0.4210526 0.5789474) *

Is it possible to extract the splits alone as a matrix using rpart.object? If so, how?

Regards,
Jagdeesh

--
View this message in context: http://r.789695.n4.nabble.com/rpart-object-help-tp3085054p3085054.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] rpart.object help
On Sun, 12 Dec 2010, jagdeesh_mn wrote:

> Is it possible to extract the splits alone as a matrix using rpart.object? If so, how?

What do you think 'rpart.object' is? There is no such function in R. If you read help(rpart.object) it describes the returned object.

You are probably looking for fit$frame, but if you want something else, study rpart:::print.rpart to see how that output is computed.

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
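The pointers in the reply above can be tried directly. This is only a minimal sketch, assuming the rpart package (which also supplies the kyphosis data) is installed. fit$frame has one row per node, fit$splits is a numeric matrix of splits, and labels() returns the split text that print() shows. Note that fit$splits also lists competitor and surrogate splits, so it usually has more rows than the printed tree has nodes.

```r
library(rpart)

# Same model as in the original question
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)

# One row per node: split variable, node size, deviance (loss), fitted class
node_info <- fit$frame[, c("var", "n", "dev", "yval")]
print(node_info)

# Numeric matrix of splits (primary, competitor and surrogate);
# row names are the split variables, "index" holds the cut point
split_matrix <- fit$splits
print(head(split_matrix))

# The split text used by print(fit), one entry per node
split_labels <- labels(fit)
print(split_labels)
```

Which of the three is the right starting point depends on whether the cut points (fit$splits) or the per-node summaries (fit$frame) are wanted.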
[R] descriptive statistics
Hi.

In a data set I have a variable that takes values from 1 to 14. For each subgroup of values of this variable, I would like to obtain some descriptive statistics of the other variables present in the data set. I've been trying with a for loop but I couldn't get anything. Could you please suggest some lines of code?

--
View this message in context: http://r.789695.n4.nabble.com/descriptive-statistics-tp3085197p3085197.html
Re: [R] descriptive statistics
?aggregate
?doBy::summaryBy

On 12/13/2010 11:04, effeesse wrote:
> In a data set I have a variable that takes values from 1 to 14. For each subgroup of values of this variable, I would like to obtain some descriptive statistics of other variables present in the data set.

--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php
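The first of the two pointers above can be sketched with base R alone. The data are simulated and the column names group, x and y are made up for illustration; they are not from the original poster's data set.

```r
set.seed(1)

# Simulated data: a grouping variable with values 1..14 and two numeric columns
dat <- data.frame(
  group = sample(1:14, 50, replace = TRUE),
  x     = rnorm(50),
  y     = runif(50)
)

# Mean of x and y within each value of group
group_means <- aggregate(cbind(x, y) ~ group, data = dat, FUN = mean)
print(group_means)

# Any summary function can be swapped in, for example the standard deviation
group_sds <- aggregate(cbind(x, y) ~ group, data = dat, FUN = sd)
```

Groups with a single observation get NA from sd(), which is expected.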
Re: [R] package sampling
Which version of R and sampling? Where is the reproducible code?

Uwe Ligges

On 11.12.2010 16:16, andrija djurovic wrote:
> Hi R users.
>
> I have a problem with the function strata in the sampling package.
>
> st0 = strata(dom, stratanames=stratas, size=sample.size, method="systematic", pik, FALSE)
> Error in sort.list(y) : 'x' must be atomic for 'sort.list'
> Have you called 'sort' on a list?
>
> With the previous version of R (2.9.1) and the previous version of package sampling this code worked well, and now I don't know what the problem is. Any ideas on how to solve this problem would be very useful.
>
> Thanks in advance
> Andrija
Re: [R] Plot's aspect ratio and pty
It does for me under R-2.12.0 32-bit on Windows with the windows() device, so: Which version of R, which OS, which device do you use?

Uwe Ligges

On 13.12.2010 07:30, Marcin Kozak wrote:
> Dear All,
>
> I've been playing with pty, and it seems it does not produce square plots as it is expected to (or at least as I expect it to). Consider this simple example:
>
> par(pty="s"); plot(1:10, 1:10)
>
> This should produce a square plot, right? Well, if you have a look at the graph, it is not square! So, maybe the limits?
>
> par(pty="s"); plot(1:10, 1:10, xlim = c(0,11), ylim=c(0,11))
>
> No, again not. So let's try and help to equalize everything, just to be sure:
>
> windows(6, 6); par(mar=c(3, 3, 3, 3), pty="s"); plot(1:10, 1:10, xlim = c(0, 11), ylim = c(0, 11))
>
> Again not! pty = "s" is to generate a square plotting region, and it does not seem to do that. Where is my mistake?
>
> Thanks in advance,
> Marcin
Re: [R] descriptive statistics
On 12/13/2010 09:04 PM, effeesse wrote:
> For each subgroup of values of this variable, I would like to obtain some descriptive statistics of other variables present in the data set. I've been trying with a for loop but I couldn't get nothing. Could you please suggest me some lines?

Hi effeesse,
Sure:

testmat <- data.frame(sample(1:14,50,TRUE),rnorm(50),runif(50))
by(testmat[,-1],testmat[,1],mean)

Jim
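A note for readers trying the by() call above on a recent R: mean() no longer has a data frame method there, so the call returns NA with warnings instead of per-column means. The following is a minimal sketch of the same idea with colMeans() swapped in; the column names group, a and b are made up for illustration.

```r
set.seed(42)

# Same shape as the example above: one grouping column, two numeric columns
testmat <- data.frame(
  group = sample(1:14, 50, TRUE),
  a     = rnorm(50),
  b     = runif(50)
)

# colMeans() returns one mean per column within each group
res <- by(testmat[, -1], testmat[, 1], colMeans)
print(res)
```

To get several statistics at once, a small function such as function(d) sapply(d, range) can be passed in place of colMeans.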
Re: [R] Plot's aspect ratio and pty
R-11.1 on both 32-bit and 64-bit on Windows with the windows() device.

Best
Marcin

2010/12/13 Uwe Ligges lig...@statistik.tu-dortmund.de:
> It does for me under R-2.12.0 32-bit on Windows with the windows() device, so: Which version of R, which OS, which device do you use?
[R] Re : descriptive statistics
A nice way to obtain summaries of data is to use summary.formula in the Hmisc package.

Justin BEM
BP 1917 Yaoundé
Tel (237) 76043774

From: Jim Lemon j...@bitwrit.com.au
To: effeesse scarpin...@gmail.com
Cc: r-help@r-project.org
Sent: Mon, 13 December 2010, 11:23:15
Subject: Re: [R] descriptive statistics

> Hi effeesse,
> Sure:
>
> testmat <- data.frame(sample(1:14,50,TRUE),rnorm(50),runif(50))
> by(testmat[,-1],testmat[,1],mean)
>
> Jim
Re: [R] Summary (Re: (S|odf)weave : how to intersperse (\LaTeX{}|odf) comments in source code ? Delayed R evaluation ?)
Dear Emmanuel and dear list,

> Therefore, I let this problem to sleep. However, I Cc this answer (with the original question below) to Max Kuhn and Friedrich Leisch, in the (faint) hope that this feature, which does not seem to have been missed by anybody in 8 years,

I've been missing it every once in a while, but till now I could always rephrase the problem with expand = FALSE or functions, and the chunk that does the actual calculation at the end.

Most often, however, I'm just lazy and use R comments. If math should go in there, I use listings instead of fancyvrb with the modified Sweave.sty that hopefully is attached (if not, see below).

Here's an example chunk:

<<keep.source=TRUE>>=
1 / 2 # $\frac{1}{x}$
4 + 4 # Here may come lots of explanations, that are in a \LaTeX\ paragraph\footnote{blabla}: even long lines are properly broken.\\ Though the new lines start at the beginning of the line. \\[6pt] And a line break in the chunk source will of course be interpreted as R again: so no new paragraphs inside the same comment.
# But there can be new commented lines.
3 + 6
# Note that comment only lines at the end of a code chunk seem to be lost.
# Not only one but all that aren't followed by R code
@

(the second line should be very long, I somehow can't keep thunderbird from inserting line breaks)

Hope that helps a bit,

Claudia

=== modified Sweave.sty ===
\NeedsTeXFormat{LaTeX2e}
\ProvidesPackage{Sweave}{}

\RequirePackage{ifthen}
\newboolean{Sweave@gin}
\setboolean{Sweave@gin}{true}
\newboolean{Sweave@ae}
\setboolean{Sweave@ae}{true}

\DeclareOption{nogin}{\setboolean{Sweave@gin}{false}}
\DeclareOption{noae}{\setboolean{Sweave@ae}{false}}
\ProcessOptions

\RequirePackage{graphicx,listings}
\IfFileExists{upquote.sty}{\RequirePackage{upquote}}{}

\ifthenelse{\boolean{Sweave@gin}}{\setkeys{Gin}{width=0.8\textwidth}}{}%
\ifthenelse{\boolean{Sweave@ae}}{%
  \RequirePackage[T1]{fontenc}
  \RequirePackage{ae}
}{}%

\lstnewenvironment{Sinput}{\lstset{language=R,basicstyle=\sl,texcl,commentstyle=\upshape}}{}
\lstnewenvironment{Soutput}{\lstset{language=R}}{}
\lstnewenvironment{Scode}{\lstset{language=R,basicstyle=\sl}}{}

\newenvironment{Schunk}{}{}

\newcommand{\Sconcordance}[1]{%
  \ifx\pdfoutput\undefined%
  \csname newcount\endcsname\pdfoutput\fi%
  \ifcase\pdfoutput\special{#1}%
  \else\immediate\pdfobj{#1}\fi}

--
Claudia Beleites
Dipartimento dei Materiali e delle Risorse Naturali
Università degli Studi di Trieste
Via Alfonso Valerio 6/a
I-34127 Trieste
phone: +39 0 40 5 58-37 68
email: cbelei...@units.it
[R] How to compare two square matrices
Hi,

I have two matrices containing probability scores obtained from two different prediction programs. Now I want to compare these two matrices to measure the difference between them. Could you please suggest a method to do this in R?

Thanks
Sabari
Re: [R] descriptive statistics
Another way is the remix function of the remix package.

On Monday, December 13, 2010, justin bem justin_...@yahoo.fr wrote:
> A nice way to obtain summary for data is to use summary.formula in Hmisc package.
Re: [R] Plot's aspect ratio and pty
On 13.12.2010 11:29, Marcin Kozak wrote:
> R-11.1

This one does not exist. Please try R-2.12.0 (but it also worked with R-2.11.1 if you meant that).

My guess is that you are confusing plotting region with device region. In order to get a square device region, you have to ask the device function for a square region.

Best,
Uwe Ligges

> on both 32-bit and 64-bit on Windows with the windows() device.
>
> Best
> Marcin
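The distinction between plotting region and device region can be checked numerically. This is only a sketch, using a null pdf() device so that it runs on any platform rather than the windows() device from the thread; par("pin") reports the plot region size and par("din") the device size, both in inches.

```r
# Deliberately non-square device; file = NULL means nothing is written to disk
pdf(file = NULL, width = 8, height = 5)

par(pty = "s")        # ask for a square plotting region
plot(1:10, 1:10)

pin <- par("pin")     # plot region: width, height in inches
din <- par("din")     # device region: width, height in inches

invisible(dev.off())

print(pin)            # the two values should be equal
print(din)            # the device itself stays 8 x 5
```

If the plot region is square in inches but still looks stretched on screen, the device's idea of pixel size is the next thing to check.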
[R] Browsing through a dataframe page by page (like with shell command more)
Hello,

I'm looking for an easy way to display a data.frame (or other variables) page by page, similarly to what is possible on a file using the more command in a standard UNIX shell.

Any help would be greatly appreciated.

Thanks
Alexandre
Re: [R] Plot's aspect ratio and pty
On Mon, 13 Dec 2010, Uwe Ligges wrote:

> It does for me under R-2.12.0 32-bit on Windows with the windows() device, so: Which version of R, which OS, which device do you use?

Since (s)he used windows() we know the OS. But I think one possible explanation is on the help page, arguments 'xpinch' and 'ypinch': most likely this is one of those devices with a mendacious Windows driver.

Of course, we simply do not know what (s)he saw, and it is possible that this is simply a misunderstanding of 'plot region'.

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] overlap different line in a xyplot (lattice)
> From: fe...@nfrac.org
> Date: Sun, 12 Dec 2010 11:47:55 +1100
> Subject: Re: [R] overlap different line in a xyplot (lattice)
> To: ehl...@ucalgary.ca
> CC: nutini.france...@gmail.com; r-help@r-project.org
>
> On 12 December 2010 00:08, Peter Ehlers ehl...@ucalgary.ca wrote:
>> On 2010-12-11 03:12, Francesco Nutini wrote:
>>> mmmh, yes this method works... but I have to overlap these two graphs:
>>>
>>> xyplot(a ~ b | sites, data=dataset, col="red")
>>> xyplot(c ~ b | sites, data=dataset, col="blue")
>>>
>>> a, b and c are columns in the same dataset. Sites is also a column in the dataset, but it's a factor variable.
>>> How can I use your method?
>>
>> The idea is the same: you need to get your data into long format with a grouping variable and then use the 'groups' argument to xyplot. Here's a fake data frame (you should have provided one):
>>
>> DF <- data.frame(y1 = rnorm(30), y2 = rnorm(30) + 2,
>>                  x = rep(1:10, 3), sites = gl(3, 10, labels=LETTERS[1:3]))
>>
>> ## Use the reshape2 package to melt the data:
>> ## (or use reshape() in base R)
>> require(reshape2)
>> DF1 <- melt(DF, measure.vars = c('y1', 'y2'),
>>             variable.name = 'grp', value.name = 'y')
>>
>> ## and plot:
>> require(lattice)
>> p <- xyplot(y ~ x | sites, data = DF1, groups = grp,
>>             col = c("red", "blue"), type = "b")
>> print(p)
>>
>> Peter Ehlers
>
> By the way, in this particular case there is a shortcut which does the reshaping internally:
>
> xyplot(y1 + y2 ~ x | sites, DF, type = "b")

Great Felix! This is what I was looking for!
But what if y1 and y2 have different scales? Can I plot, for example, y2 on a secondary axis?

Thanks for your help,
Francesco Nutini

>>> sorry for my ignorance!
>>>
>>> Francesco Nutini
>>>
>>>> Date: Fri, 10 Dec 2010 10:13:00 -0800
>>>> From: ehl...@ucalgary.ca
>>>> To: nutini.france...@gmail.com
>>>> CC: r-help@r-project.org
>>>> Subject: Re: [R] [r] overlap different line in a xyplot (lattice)
>>>>
>>>> On 2010-12-10 07:04, Francesco Nutini wrote:
>>>>> dear [R] users,
>>>>> is there a way to plot different data (but with the same x-variables) in the same xyplot window?
>>>>> There is already a similar question, but the answer is not explanatory enough...
>>>>
>>>> Something like this?
>>>>
>>>> x <- rep(1:10, 2)
>>>> y1 <- rnorm(10); y2 <- rnorm(10) + 2
>>>> y <- c(y1, y2)
>>>> g <- gl(2, 10)
>>>> xyplot(y ~ x, groups = g, type = 'b')
>>>>
>>>> Peter Ehlers
>>>>
>>>>> Thanks a lot,
>>>>> Francesco
>
> --
> Felix Andrews / 安福立
> http://www.neurofractal.org/felix/
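For readers who want to run the two approaches from this thread side by side, here is a self-contained sketch. It assumes only the lattice package; the long-format data frame is built by hand instead of with reshape2, which is a substitution made here to avoid the extra dependency, not what the posters used.

```r
library(lattice)
set.seed(1)

# Fake data in wide format: two responses, one x, three sites
DF <- data.frame(
  y1    = rnorm(30),
  y2    = rnorm(30) + 2,
  x     = rep(1:10, 3),
  sites = gl(3, 10, labels = LETTERS[1:3])
)

# Shortcut: the extended formula y1 + y2 ~ x overlays both series per panel
p_short <- xyplot(y1 + y2 ~ x | sites, data = DF,
                  type = "b", col = c("red", "blue"))

# Long format: stack y1 and y2 and mark their origin in a grouping factor
DF_long <- data.frame(
  x     = rep(DF$x, 2),
  sites = rep(DF$sites, 2),
  grp   = gl(2, nrow(DF), labels = c("y1", "y2")),
  y     = c(DF$y1, DF$y2)
)
p_long <- xyplot(y ~ x | sites, data = DF_long, groups = grp,
                 type = "b", col = c("red", "blue"))

# print(p_short) or print(p_long) draws the plot on the active device
```

Both objects should describe the same picture; the long format is the more general route when the number of series is not fixed in advance.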
Re: [R] rpart.object help
Prof Brian Ripley wrote:
> You are probably looking for fit$frame, but if you want something else, study rpart:::print.rpart to see how that output is computed.

Thanks Mr. Brian. That kind of answers my query. On the same note I would like to ask a few other questions. Sorry if you find them naive, I am a novice in this subject and am trying to get a grip on things.

1. I am using the R package from my code and the fitted object looks like this:

The Model representation :
n= 60

node), split, n, deviance, yval
      * denotes terminal node

1) root 60 983551500 12615.670
  2) dataFrame[, 6]='Small' 13 21804710 7682.385 *
  3) dataFrame[, 6]='Compact','Large','Medium','Sporty','Van' 47 557851600 13980.190
    6) dataFrame[, 3]='Japan/USA','Korea','USA' 29 13105 12673.030
      12) dataFrame[, 6]='Compact','Sporty' 14 11426050 11055.570 *
      13) dataFrame[, 6]='Large','Medium','Van' 15 48812470 14182.670 *
    7) dataFrame[, 3]='France','Germany','Japan','Sweden' 18 297418200 16086.170 *

What does the term deviance here stand for?

2. Could you also suggest some reading on the topic of C&R (classification and regression) trees specific to R, with case studies?

Regards,
Jagdeesh

--
View this message in context: http://r.789695.n4.nabble.com/rpart-object-help-tp3085054p3085183.html
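On the deviance question above: for a regression tree (rpart's "anova" method), the deviance of a node is the sum of squared differences between the responses in that node and the node mean. The sketch below checks this for the root node. It assumes the model was fitted to the car.test.frame data shipped with rpart, which matches n= 60 and the root mean in the output above but is a guess about the poster's data; the formula is also illustrative.

```r
library(rpart)

# Regression tree on a numeric response, so method = "anova" is chosen
fit <- rpart(Price ~ Type + Country, data = car.test.frame)

# Deviance reported for the root node (first row of the frame)
root_dev <- fit$frame$dev[1]

# Total sum of squares of the response around its mean
price    <- car.test.frame$Price
ss_total <- sum((price - mean(price))^2)

# The two quantities should agree
print(all.equal(root_dev, ss_total))
```

The same relation holds within every node, using only the observations that fall into it.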
[R] Problem with retrieve.nc of clim.pact
Good morning everyone,

I'm new here, so sorry for my bad English and for any poor wording. To the point: for months I have been using the retrieve.nc function from clim.pact to extract data from netCDF files. I never had a problem, but some days ago I updated both R and the package to the latest version (don't ask me what the previous R version was, because I don't remember). Now I am in serious trouble. When I open the netCDF file, I use

a <- retrieve.nc(filename="/Users/P4blo/Desktop/DATA/Raw/Z500/Z500-1979-2010.nc", v.nam="hgt", t.rng=c(1,31))

just as I did in the past. But now I no longer obtain a 3D matrix but a 4D one!

[1] ordinary
[1] Attribute time_origin not found
[1] The chronology is not straight forward: dt= 24 interval span= 275376 data points= 11475
[1] ncid$dim$'time$units'
[1] Time, units: hours since 1-1-1 00:00:0.0
[1] torg= 1-1-1 00:00:0.0
[1] Time origin: (year-month-day) 1 - 1 - 1
[1] Time unit: hours
[1] Latitudes: 0 - 90 degrees_north
[1] Longitudes: 0 - 357.5 degrees_east
[1] Reading hgt
[1] read the data from EASTERN hemisphere
     start count varsize
[1,]     1   144     144
[2,]     1    37      37
[3,]     1     1       1
[4,]     1    31   11475
[1] dim dat:
[1]  31   1  37 144
[1] 144   0  37   1  31
[1]   0.0 357.5
[1] 144  37   1  31
[1] 4D:
[1]  31   1  37 144
[1] Sort longs and lats
[1] First last records:
1979 1 1 1979 1 31
[1] BEFORE scale adjustment weeding
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   4684    5172    5491    5473    5818    5911
[1] AFTER scale adjustment weeding
[1] Scaling: dat <- dat * 1
[1] Offset: dat <- dat + 32066
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  36750   37240   37560   37540   37880   37980
[1] dimensions 31 1 37 144

But that is not all: the biggest problem is that if I try to plot the matrix, I obtain a completely fuzzy field, I suppose because something went wrong somewhere while reading the file. For instance, I use

image.plot(a$dat[1,1,,])

Has anyone had a similar problem? Thank you to everyone who answers.

Paolo Davini

--
View this message in context: http://r.789695.n4.nabble.com/Problem-with-retrieve-nc-of-clim-pact-tp3085201p3085201.html
[R] post-hoc test for ANCOVA method
Dear [R] Users,

I have implemented a linear model with this syntax:

model <- lm(var_dependent ~ var_indipendent + factor + var_indipendent:factor, dataframe)
anova(model)

Response: var_dependent
                       Df  Sum Sq Mean Sq F value    Pr(>F)
var_indipendent         1 20.5522 20.5522 87.8701 1.167e-14 ***
factor                  1  0.1060  0.1060  0.4530   0.50277
var_indipendent:factor  1  1.3861  1.3861  5.9261   0.01706 *
Residuals              83 19.4132  0.2339
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The factor variable significantly influences the regression. Which test do I have to use to find out which levels of the factor (in my dataset the levels are the different sampling sites) influence the relationship? Any suggestions on how to perform post-hoc comparisons?

Thanks a lot!

Francesco Nutini

P.S. the numbers have no significance, it's just an example
[R] Wrong contrast matrix for nested factors in lm(), rlm(), and lmRob()
This message also reports wrong estimates produced by lmRob.fit.compute() for nested factors when using the correct contrast matrix. In these respects, I have found that S-Plus behaves the same way as R.

Using small examples, I generated contrast matrices for four models involving nested factors with fixed effects. I used the three available contrast types (sum, treatment, helmert) with lm() or lm.fit(), but just contr.sum with rlm() and lmRob(). For three of the models the matrices were incorrect.

- Details -

For lm() and rlm() I used two data frames. In same.df the nested factor, D, has the same number of levels for each level of the nesting factor, G:

       G  D  T1
    1 g1 d1 -60
    2 g1 d2 -50
    3 g1 d3 -40
    4 g2 d1  30
    5 g2 d2  50
    6 g2 d3  70

In diff.df the nested factor, D, has a different number of levels for the two levels of the nesting factor, G:

       G  D  T2
    1 g1 d1 -60
    2 g1 d2 -50
    3 g1 d3 -40
    4 g2 d1  20
    5 g2 d2  40
    6 g2 d3  60
    7 g2 d4  80

(G, D are factors; T1, T2 are numeric.) For lmRob() I expanded the data frames to 600 or 700 rows by replicating them 100 times and adding error to the observations.

For lm() the models and commands were

    (1) model.matrix(lm(T1 ~ G + D%in%G, same.df))
    (2) model.matrix(lm(T1 ~ D%in%G, same.df))
    (3) model.matrix(lm(T2 ~ G + D%in%G, diff.df))
    (4) model.matrix(lm(T2 ~ D%in%G, diff.df))

Using (1), all three types of contrast matrix were correctly generated.
Using (2), the same incorrect contrast matrix was generated for all three contrast types.
Using (3), an incorrect contrast matrix was generated for each of the three contrast types. For contr.treatment the error was an extra column of zeros; for the others the error was more serious.
Using (4), the same incorrect contrast matrix was generated for all three contrast types.

I used only contr.sum with rlm() and lmRob(). For model (1) the programs worked correctly, but for models (2), (3), and (4) with the formulas above, rlm() and lmRob() both reported that x was singular.

When x was the correct contrast matrix and T was the observation vector, rlm(x,T) worked correctly for (2), (3), and (4), as did lm.fit(x,T). However, whereas lmRob.fit.compute(x2=NULL,y=T,x1=x) worked correctly for (3), the estimates it produced for (2) and (4) were radically wrong (and were the same for different random seeds and initial algorithms).

- Questions -

(1) If there is a way to use lm(), rlm(), and lmRob() in such cases so that they generate the correct contrast matrices (and the desired parameter estimates), what is it?

(2) If there is no way to do this, is the best alternative for the user to create the desired model matrices by hand and provide them as arguments to lm.fit(), rlm(), and lmRob.fit.compute()? This would also require that lmRob.fit.compute() generate the correct estimates.

(3) If one uses lm.fit() and lmRob.fit.compute() directly in this way, then, given that one is warned against doing so, what are the dangers?

(4) According to cran.r-project.org/web/views/Robust.html, lmRob() "makes use of the M-S algorithm of Maronna and Yohai (2000), automatically when there are factors among the predictors (where S-estimators (and hence MM-estimators) based on resampling typically badly fail)". Is there an alternative program that uses the M-S algorithm, if lmRob() or lmRob.fit.compute() cannot be made to work?

    > sessionInfo()
    R version 2.12.0 (2010-10-15)
    Platform: x86_64-redhat-linux-gnu (64-bit)

I'll be very grateful for any help.

Saul Sternberg, Psychology
University of Pennsylvania

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
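One possible workaround for question (2), sketched here under assumptions rather than offered as the intended fix: build the model matrix from the formula, then keep only a linearly independent set of columns before handing it to lm.fit(). The data frame reproduces diff.df from the post; the names X_red, qx and keep are made up for this illustration, and whether the resulting parameterisation is the "correct" contrast matrix for a given purpose is not something this sketch can decide.

```r
# diff.df from the post: D has 3 levels under g1 and 4 levels under g2.
diff.df <- data.frame(
  G  = factor(c("g1", "g1", "g1", "g2", "g2", "g2", "g2")),
  D  = factor(c("d1", "d2", "d3", "d1", "d2", "d3", "d4")),
  T2 = c(-60, -50, -40, 20, 40, 60, 80)
)

# Model (3): the formula interface may generate redundant columns
# (for example a column for the empty cell g1/d4).
X <- model.matrix(T2 ~ G + D %in% G, data = diff.df)

# Use the pivoted QR decomposition to pick a linearly independent
# subset of columns, keeping them in their original order.
qx    <- qr(X)
keep  <- sort(qx$pivot[seq_len(qx$rank)])
X_red <- X[, keep, drop = FALSE]

# X_red has full column rank, so lm.fit() no longer sees a singular x.
fit <- lm.fit(X_red, diff.df$T2)
colnames(X_red)
fit$coefficients
```

The same reduced matrix could in principle be passed to other fitters that take a matrix argument; that part is untested here.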
Re: [R] simple plotting question
To add to Michael's response: http://www.statmethods.net/advgraphs/parameters.html

On Mon, Dec 13, 2010 at 2:23 AM, Michael Bedward michael.bedw...@gmail.com wrote:

Hello Erin, Try this...

    plot(x, y, type="b", pch=16)

Michael

On 13 December 2010 18:11, Erin Hodgess erinm.hodg...@gmail.com wrote:

Dear R People: When I plot using type="b", I have circles and lines, which is as it should be. Is there a way to have filled in circles using the type argument, please? Or do I need to call the points function also, please? Thanks, Erin

-- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com
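A self-contained version of Michael's suggestion, with made-up data (the x and y values are assumptions for illustration only):

```r
x <- 1:10
y <- c(2, 4, 3, 6, 5, 8, 7, 9, 8, 10)

# type = "b" draws both points and lines; pch = 16 selects a filled circle.
plot(x, y, type = "b", pch = 16)
```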
Re: [R] Why do we have to turn factors into characters for various functions?
Hello Petr,

I don't want to convince you. If you like the following:

    x <- factor(1:4, labels=c("one", "two", "three", "four"))
    y <- factor(3:5, labels=c("three", "four", "five"))
    data.frame(character=c(as.character(x), as.character(y)), numeric=c(x, y))
      character numeric
    1       one       1
    2       two       2
    3     three       3
    4      four       4
    5     three       1
    6      four       2
    7      five       3

For me the behaviour of character vectors is easier to follow and less error prone.

    cx <- c("one", "two", "three", "four")
    cy <- c("three", "four", "five")
    c(cx, cy)
    [1] "one"   "two"   "three" "four"  "three" "four"  "five"

Anyway it is maybe more about personal habits than about bad factor features

I agree with you regarding personal habits. It's not the features of factors. For me it's the rather inconsistent use in functions like c() or print(). If you print a factor, you see its levels, but if you combine it using c(), you combine the famous implementation-specific underlying integer vector.

Best regards, Heinz

At 13.12.2010 08:50 +0100, Petr PIKAL wrote:

Hi

r-help-boun...@r-project.org wrote on 12.12.2010 21:00:37:

At 12.12.2010 00:48 +0200, Tal Galili wrote:

Hello dear R-help mailing list,

My question is *not* about how factors are implemented in R (which is, if I understand correctly, that factors keep numbers and assign levels to them). My question *is* about why so many functions that work on factors don't treat them as characters by default? Here are two simple examples.

Example one, turning the characters inside a factor into numeric:

    x <- factor(4:6)
    as.numeric(x)               # output: 1 2 3
    as.numeric(as.character(x)) # output: 4 5 6 # isn't this what we wanted?

Example two, using strsplit on a factor:

    x <- factor(paste(letters[4:6], 4:6, sep="A"))
    strsplit(x, "A")               # will result in an error:
                                   # Error in strsplit(x, "A") : non-character argument
    strsplit(as.character(x), "A") # will work and split

So what is the reason this is the case? Is it that implementing a switch of factors to characters as the default in some of the basic functions will cause old code to break?
Is it a better design in some other way? I am curious to know the reason for this.

In my view the answer can be found implicitly in the language definition: "Factors are currently implemented using an integer array to specify the actual levels and a second array of names that are mapped to the integers. Rather unfortunately users often make use of the implementation in order to make some calculations easier." It is the "unfortunate use" of factors that seems generally accepted, even if the language definition continues: "This, however, is an implementation issue and is not guaranteed to hold in all implementations of R." Personally, like some others, I avoid factors, except in cases where they represent a statistical concept.

On the contrary, I find factors quite useful. Consider the possibility of changing their levels:

    set.seed(111)
    x <- factor(sample(1:4, 20, replace=T), labels=c("one", "two", "three", "four"))
    x
     [1] three three two   three two   two   one   three two   one   three three
    [13] one   one   one   two   one   four  two   three
    Levels: one two three four
    levels(x)[3:4] <- "more"
    x
     [1] more more two  more two  two  one  more two  one  more more one  one  one
    [16] two  one  more two  more
    Levels: one two more

I believe that if x is character, it can also be done, but the factor way seems more convenient to me. I also quite often distinguish points in plots by pch=as.numeric(some.factor). Anyway it is maybe more about personal habits than about bad factor features.

Regards, Petr

Certainly I would agree with you that, if only reading the R Language Definition and not the documentation of the function factor, one would rather expect functions like as.numeric or strsplit to operate on the levels of a factor and not on the underlying, implementation-specific, integer array.
Heinz

Thank you for your reading, Tal

Contact Details: Contact me: tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)
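The conversions discussed in this thread can be collected into one small runnable sketch. Variable names such as codes, values and combined are chosen here for illustration only. Combining via as.character() avoids depending on how c() treats factors, which, as far as I know, changed in later R releases.

```r
x <- factor(4:6)

codes  <- as.numeric(x)                # internal integer codes, not the labels
values <- as.numeric(as.character(x))  # the numbers shown as labels
alt    <- as.numeric(levels(x))[x]     # same result, converting each level only once

# strsplit() needs a character vector, so convert the factor explicitly.
f     <- factor(paste(letters[4:6], 4:6, sep = "A"))
parts <- strsplit(as.character(f), "A")

# Combine two factors by their labels rather than by their codes.
fx <- factor(1:4, labels = c("one", "two", "three", "four"))
fy <- factor(3:5, labels = c("three", "four", "five"))
combined <- factor(c(as.character(fx), as.character(fy)))
```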
Re: [R] Need help on nnet
On 10/12/10 02:56:13, jothy wrote:

Am working on neural network. Below is the coding and the output [...]

    summary(uplift.nn)
    a 3-3-1 network with 16 weights
    options were -
      b->h1  i1->h1  i2->h1  i3->h1
      16.64    6.62  149.93    2.24
      b->h2  i1->h2  i2->h2  i3->h2
     -42.79  -17.40 -507.50   -5.14
      b->h3  i1->h3  i2->h3  i3->h3
       3.45    1.87   18.89    0.61
       b->o   h1->o   h2->o   h3->o
     402.81   41.29  236.76    6.06

Q1: How to interpret the above output

The summary above is the list of internal weights that were learnt during the neural network training in nnet(). From my point of view I wouldn't really try to interpret any meaning into those weights, especially if you have multiple predictor variables.

Q2: My objective is to know the contribution of each independent variable.

You may try something like variable importance approaches (VI) or feature selection approaches.

1) In VI you have a training and test set as in normal cross-validation. You train your network on the training set. You use the trained network for predicting the test values. The clue in VI then is to pick one variable at a time, permute its values in the test set only (!) and see how much the prediction error deviates from the original prediction error on the unpermuted test set. Repeat this a lot of times to get a meaningful output and also be sure to use a lot of cross-validation permutations. The more the prediction error rises, the more important the respective variable was/is. This approach includes interactions between variables.

2) Feature selection is essentially an exhaustive approach which tries every possible subset of your predictors, trains a network and sees what the prediction error is. The subset which is best (lowest error) is then chosen in the end. It normally (as a side-effect) also gives you something like an importance ranking of the variables when using backward or forward feature selection. But be careful of interactions between variables.
Q3: Which package of neural network provides the AIC or BIC values

You may try training with the multinom() function, as pointed out in msg09297: http://www.mail-archive.com/r-help@stat.math.ethz.ch/msg09297.html

I hope I could point out some keywords and places to look at.

Regards, Georg.

-- Research Assistant Otto-von-Guericke-Universität Magdeburg resea...@georgruss.de http://research.georgruss.de
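The permutation idea from point 1) could look roughly like the following. This is only a sketch under assumptions: the data are simulated, a single train/test split stands in for full cross-validation, and the names mse and perm_importance are invented for the example.

```r
library(nnet)

set.seed(1)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$y <- 2 * d$x1 + 0.5 * d$x2 + rnorm(n, sd = 0.1)  # x3 is pure noise

train <- d[1:150, ]
test  <- d[151:200, ]

# Train once on the training set.
fit <- nnet(y ~ x1 + x2 + x3, data = train, size = 3,
            linout = TRUE, trace = FALSE, maxit = 500)

# Mean squared prediction error of a model on a data set.
mse <- function(model, data) mean((data$y - predict(model, data))^2)
base_err <- mse(fit, test)

# For each predictor: shuffle it in the test set only, re-predict,
# and record the average increase in error over many shuffles.
perm_importance <- sapply(c("x1", "x2", "x3"), function(v) {
  errs <- replicate(50, {
    shuffled <- test
    shuffled[[v]] <- sample(shuffled[[v]])
    mse(fit, shuffled)
  })
  mean(errs) - base_err
})

perm_importance
```

A larger increase in error suggests a more important variable. Repeating the whole procedure over several train/test splits, as the reply recommends, would make the ranking more stable.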
Re: [R] Help..Neural Network
On 10/12/10 03:45:46, sadanandan wrote:

I am trying to develop a neural network with a single target variable and 5 input variables to predict the importance of input variables using R. I used the packages nnet and RSNNS. But unfortunately I could not interpret the output properly, and the documentation of those packages does not give proper direction either. Please help me to find a good package with proper documentation for neural networks.

Hi, please see post http://r.789695.n4.nabble.com/Need-help-on-nnet-td3081744.html (title "Need help on nnet" by jothy) and see if that helps solve your problem. Otherwise you may try to provide some more input about what you're trying to do and ask again.

Regards, Georg.

-- Research Assistant Otto-von-Guericke-Universität Magdeburg resea...@georgruss.de http://research.georgruss.de
[R] help on SAS Macro in R
Dear Researchers, I am looking for a way to read a SAS macro in R. Although I searched for it on the web, I couldn't find anything. Can you help me or direct me? Thank you for your interest and patience. Best. Ozgur
Re: [R] spatial clusters
On 10/12/10 23:26:28, dorina.lazar wrote:

I am looking for a clustering method useful to classify the countries in some clusters taking account of: a) the geographical distance (in km) between countries and b) of some macroeconomic indicators (gdp, life expectancy...).

Hi Dorina, before choosing R packages useful for this task, the task itself must be clarified. What does the data you're working with look like? I'm asking because it looks as if you're trying to mix spatial (spatial distances) and non-spatial information in a clustering algorithm. I've done a lot of research in this area because I needed something similar (combining spatial and non-spatial information) and the existing approaches weren't really useful in my case because I had equidistant spatial points with equal spatial density (management zone delineation in precision agriculture).

There are a few algorithms which may be suitable for your work, maybe check out the references below (you should find those using only the title, otherwise please let me know):

    MOSAIC: A Proximity Graph Approach for Agglomerative Clustering
    ICEAGE: Interactive Clustering and Exploration of Large and High-Dimensional Geodata
    Efficient regionalization techniques for socio-economic geographical units using minimum spanning trees (SKATER)

I haven't seen too many R implementations yet, though. You may also try the R-sig-geo mailing list, because your data look geo :-) https://stat.ethz.ch/mailman/listinfo/r-sig-geo

Regards, Georg.

-- Research Assistant Otto-von-Guericke-Universität Magdeburg resea...@georgruss.de http://research.georgruss.de
Re: [R] Browsing through a dataframe page by page (like with shell command more)
On 2010-12-13 02:49, Alexandre CESARI wrote:

Hello, I'm looking for an easy way to display a data.frame (or other variables) page by page, similarly to what is possible on a file using the more command in a standard UNIX shell. Any help would be greatly appreciated.

How about View(mydata)? See help('View').

Peter Ehlers

Thanks Alexandre
Re: [R] help on SAS Macro in R
On 12/13/2010 07:14 AM, Özgür Asar wrote:

Dear Researchers, I am looking for to read a SAS macro in R. Although I searched it on web, I couldn’t find anything.

Are you hoping just to read it in, or to actually have it execute the macro as SAS would? What gives you the idea the latter is even possible? If you're just wanting to treat the macro as text for textual analysis (e.g., how many unique macro variables are there, or lines of code, etc.) you can use ?readLines
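A small sketch of the "treat it as text" route with readLines(). The macro text, the temporary file and the regular expression are all assumptions made up for this illustration; a real macro may use constructs this pattern does not catch.

```r
# A made-up SAS macro, written to a temporary file for the example.
sas_code <- c(
  "%macro summarize(ds=, var=);",
  "  proc means data=&ds; var &var; run;",
  "%mend summarize;"
)
path <- tempfile(fileext = ".sas")
writeLines(sas_code, path)

# Read the macro as plain text, one element per line.
macro_lines <- readLines(path)
n_lines <- length(macro_lines)

# Find references to macro variables (an ampersand followed by a name).
refs <- unlist(regmatches(macro_lines,
                          gregexpr("&[A-Za-z_][A-Za-z0-9_]*", macro_lines)))
macro_vars <- unique(refs)
```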
Re: [R] help on SAS Macro in R
On Dec 13, 2010, at 8:14 AM, Özgür Asar wrote:

Dear Researchers, I am looking for to read a SAS macro in R. Although I searched it on web, I couldn’t find anything. Can you help me or direct me?

    fortune("SAS")

For almost 40 years SAS has been the primary tool for statisticians worldwide and its easy-to-learn syntax, unsurpassed graphical system, powerful macro language and recent graphical user interfaces have made SAS the number one statistical software choice for both beginners and advanced users.
   -- Rolf Poalis, Biostatistics Denmark (announcement of the SAS to R parser sas2R)
      R-help (April 1, 2004)

(I admit that for a few moments I considered searching for this precious relic. The New Copenhagen Sarcastic font on my device seems to have gone missing.)

-- David Winsemius, MD West Hartford, CT
Re: [R] overlap different line in a xyplot (lattice)
On 2010-12-13 03:13, Francesco Nutini wrote:

From: fe...@nfrac.org [...snip...]

    xyplot(y1 + y2 ~ x | sites, DF, type = "b")

Great Felix! This is what I was looking for! But what if y1 and y2 have different scales? Can I plot, for example, y2 on a secondary axis?

There are probably good reasons why you should not do this, but one way is to use doubleYScale() in the latticeExtra package.

Peter Ehlers

Thanks for your help, Francesco Nutini [...snip...]
Re: [R] Re : descriptive statistics
I am sorry, but I cannot understand how to use the summary function. Maybe, if I describe my needs, you could sketch a line that could work. In the data set variable V can take values 1 to 14. For the subgroup of individuals where V takes value =1 I want the mean and variance of a certain set of other variables (V1, V2, V3, V4, V5). And this for all the other subgroups for values 2 to 14. What do you suggest?

-- View this message in context: http://r.789695.n4.nabble.com/descriptive-statistics-tp3085197p3085462.html
[R] How does R compute sums of squares?
Consider the following missing data problem:

    y = c(1, 2, 2, 2, 3)
    a = factor(c(1, 1, 1, 2, 2))
    b = factor(c(1, 2, 3, 1, 2))
    fit = lm(y ~ a + b)
    anova(fit)

    Analysis of Variance Table

    Response: y
              Df  Sum Sq Mean Sq    F value    Pr(>F)
    a          1 0.8     0.8     1.3637e+33 < 2.2e-16 ***
    b          2 1.16667 0.58333 9.5461e+32 < 2.2e-16 ***
    Residuals  1 0.0     0.0
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    Warning message:
    In anova.lm(fit) : ANOVA F-tests on an essentially perfect fit are unreliable

I am trying to understand how R computes sums of squares. I know that R makes a FORTRAN call to dqrls to make a QR decomposition of the design matrix, which returns (among other things),

    fit$effects
      (Intercept)            a2            b2            b3
    -4.472136e+00  9.128709e-01  7.715167e-01  7.559289e-01  2.471981e-17

Can anyone elaborate on how R computes these effects? I am not satisfied with the explanation that R provides with the help(effects) command.

Thanks in advance. Ethan
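One way to see where the numbers come from, offered as a sketch rather than as a description of R's internal code path: the effects are the observations rotated by the orthogonal factor of the QR decomposition, t(Q) %*% y, and the sequential sums of squares are the squared effects summed within each term. The names eff, term, ss and rss are chosen for this illustration.

```r
y <- c(1, 2, 2, 2, 3)
a <- factor(c(1, 1, 1, 2, 2))
b <- factor(c(1, 2, 3, 1, 2))
fit <- lm(y ~ a + b)

# effects = t(Q) %*% y, using the QR decomposition stored in the fit.
eff <- qr.qty(fit$qr, y)

# The first `rank` effects correspond to model-matrix columns;
# fit$assign maps each column to its term (0 = intercept, 1 = a, 2 = b).
p    <- fit$rank
term <- fit$assign[fit$qr$pivot[seq_len(p)]]

# Sequential sum of squares per term = sum of squared effects in that term.
ss  <- tapply(eff[seq_len(p)]^2, term, sum)

# The remaining effects carry the residual sum of squares.
rss <- sum(eff[-seq_len(p)]^2)

ss
rss
```

Because each effect is the projection of y onto one orthonormal direction, added in the order of the model-matrix columns, the resulting sums of squares are sequential (Type I): each term is adjusted only for the terms before it.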
Re: [R] Re : descriptive statistics
I would suggest what we already suggested to you:

    ?aggregate
    ?by
    ?doBy::summaryBy

We could help you more precisely if you could provide a reproducible example, as explained in the posting guide (see link at the end of every email from this list).

Ivan

On 12/13/2010 15:14, effeesse wrote:

I am sorry, but I cannot understand how to use the summary function. Maybe, if I describe my needs, you could sketch a line that could work. In the data set variable V can take values 1 to 14. For the subgroup of individuals where V takes value =1 I want the mean and variance of a certain set of other variables (V1, V2, V3, V4, V5). And this for all the other subgroups for values 2 to 14. What do you suggest?

-- Ivan CALANDRA PhD Student University of Hamburg Biozentrum Grindel und Zoologisches Museum Abt. Säugetiere Martin-Luther-King-Platz 3 D-20146 Hamburg, GERMANY +49(0)40 42838 6231 ivan.calan...@uni-hamburg.de
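A minimal sketch of the aggregate() route for the problem as described (V with values 1 to 14, variables V1 to V5). The data are simulated here because the original data set was not posted.

```r
set.seed(42)
df <- data.frame(
  V  = rep(1:14, each = 5),
  V1 = rnorm(70), V2 = rnorm(70), V3 = rnorm(70),
  V4 = rnorm(70), V5 = rnorm(70)
)

# One row per value of V, one column per variable.
group_means <- aggregate(cbind(V1, V2, V3, V4, V5) ~ V, data = df, FUN = mean)
group_vars  <- aggregate(cbind(V1, V2, V3, V4, V5) ~ V, data = df, FUN = var)

head(group_means, 3)
head(group_vars, 3)
```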
Re: [R] How to compare two square matrices
Dear Sabari, if you need a single number for comparison then there could be many options. You can calculate the smallest absolute eigenvalue, or maybe the determinant (i.e. a measure of the volume of the matrix), or maybe the smallest element in absolute terms, depending on your research need. Thanks,

-- View this message in context: http://r.789695.n4.nabble.com/How-to-compare-two-square-matrices-tp3085236p3085481.html
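The three summaries mentioned can be computed in a few lines. The matrices A and B and the helper name summarise_matrix are assumptions for this illustration; which summary is meaningful depends on the research question.

```r
A <- matrix(c(2, 1, 1, 3), nrow = 2)
B <- matrix(c(4, 0, 0, 1), nrow = 2)

# Single-number summaries of a square matrix.
summarise_matrix <- function(M) {
  c(min_abs_eigen = min(abs(eigen(M)$values)),  # abs() also handles complex eigenvalues
    determinant   = det(M),
    min_abs_elem  = min(abs(M)))
}

comparison <- rbind(A = summarise_matrix(A), B = summarise_matrix(B))
comparison
```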
Re: [R] help requested
hi thanks for your suggestion and reply. let me try it out.

With Warm Wishes and Regards A. Abdul Rasheed, M.C.A., M.E., Ph.D., Assistant Professor, Department of Computer Applications, Valliammai Engineering College, SRM Nagar, Kattankulathur - 603 203. Kancheepuram District. Tamil Nadu. INDIA. Contact: 91 - 44 - 27454784 Ext: 451 (O) / 996 23 000 55

Date: Sun, 12 Dec 2010 05:06:21 -0800
From: ml-node+3084270-787879815-204...@n4.nabble.com
To: prof...@live.com
Subject: Re: help requested

On Sat, Dec 11, 2010 at 05:11:37AM -0800, profaar wrote:

hi thanks for your reply. there are around 2 nodes in my dataset. will it work for conversion from edge list format to node list format? I am using R under Windows XP.

Under Linux, with 20'000 nodes and 10 random edges from each of them, this took about 108 sec (CPU 2.4 GHz). The advantage of this solution is that there may be further functions in the package graph (see also class?graphNEL), which could be used in your application. If not, then the conversion itself may be done more efficiently, for example

    edges <- read.table(file=stdin())
    1 2
    1 3
    1 4
    1 5
    2 3
    2 4
    3 2
    4 1
    4 3
    4 5
    5 2
    5 4

    out1 <- split(edges$V2, edges$V1)
    out1
    $`1`
    [1] 2 3 4 5

    $`2`
    [1] 3 4

    $`3`
    [1] 2

    $`4`
    [1] 1 3 5

    $`5`
    [1] 2 4

For the example with 20'000 nodes and 10 random edges from each, this took about 0.2 sec. The output out1 is a list of vectors. This may be transformed to a vector of strings, for example

    out2 <- sapply(out1, paste, collapse=" ")
    cbind(out2) # cbind() is only for a column output
      out2
    1 "2 3 4 5"
    2 "3 4"
    3 "2"
    4 "1 3 5"
    5 "2 4"

and to a text (with a possible file= argument)

    cat(paste(names(out2), out2), sep="\n")
    1 2 3 4 5
    2 3 4
    3 2
    4 1 3 5
    5 2 4

Petr Savicky.

-- View this message in context: http://r.789695.n4.nabble.com/help-requested-tp3082147p3085414.html
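A non-interactive variant of Petr Savicky's conversion, which can be pasted into a script. It reads the same example edge list from a string via read.table(text = ...) instead of stdin(); everything else follows the message above.

```r
edge_text <- "1 2
1 3
1 4
1 5
2 3
2 4
3 2
4 1
4 3
4 5
5 2
5 4"

# Two unnamed columns are read as V1 (from node) and V2 (to node).
edges <- read.table(text = edge_text)

# Node list: for each source node, the vector of its target nodes.
out1 <- split(edges$V2, edges$V1)

# One string per node, then one text line per node.
out2 <- sapply(out1, paste, collapse = " ")
node_lines <- paste(names(out2), out2)
cat(node_lines, sep = "\n")
```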
[R] Re : Re : descriptive statistics
With summary, do this:

    my.summary <- function(x) c(mean(x), var(x))
    summary(v1~V, fun=my.summary, data=df)
    summary(v2~V, fun=my.summary, data=df)
    summary(v3~V, fun=my.summary, data=df)
    summary(v4~V, fun=my.summary, data=df)
    summary(v5~V, fun=my.summary, data=df)

If you want the means of all the variables together in one table:

    my.summary <- function(x) c(mean(x[,1]), mean(x[,2]), mean(x[,3]), mean(x[,4]), mean(x[,5]))
    summary(cbind(v1,v2,v3,v4,v5)~v, data=df)

Justin BEM BP 1917 Yaoundé Tel (237) 76043774

From: effeesse scarpin...@gmail.com
To: r-help@r-project.org
Sent: Mon 13 December 2010, 15:14:22
Subject: Re: [R] Re : descriptive statistics

I am sorry, but I cannot understand how to use the summary function. Maybe, if I describe my needs, you could sketch a line that could work. In the data set variable V can take values 1 to 14. For the subgroup of individuals where V takes value =1 I want the mean and variance of a certain set of other variables (V1, V2, V3, V4, V5). And this for all the other subgroups for values 2 to 14. What do you suggest?

-- View this message in context: http://r.789695.n4.nabble.com/descriptive-statistics-tp3085197p3085462.html
[R] check for item in vector
Dear R users,

Suppose I have a vector like this:

    animal <- c("Tiger","Panda")

I would like to know whether there is any function that checks for the existence of a certain item in a vector, e.g.

    func("Tiger",animal)  # check for the existence of Tiger
    TRUE
    func("Acacia",animal) # Acacia is not an item of the animal vector
    FALSE

I know that it can be done with a for loop, but I would like to know whether there is a built-in function for that. Thank you very much.

CH -- CH Chan
Re: [R] check for item in vector
Hi, See ?"%in%" or ?match

    animal <- c("Tiger","Panda")
    "Tiger" %in% animal
    [1] TRUE
    "Acacia" %in% animal
    [1] FALSE
    "Panda" %in% animal
    [1] TRUE

HTH, Ivan

On 12/13/2010 15:48, C.H. wrote:

Dear R users, Suppose I have an vector like this: animal <- c("Tiger","Panda") I would like to know is there any function that check for the existence of certain item in a vector. e.g. func("Tiger",animal) # check for the existence of Tiger TRUE func("Acacia",animal) # Acacia is not an item of the animal vector FALSE I know that it can be done by for loop. But I would like to know is there any built-in function for that. Thank you very much. CH

-- Ivan CALANDRA PhD Student University of Hamburg Biozentrum Grindel und Zoologisches Museum Abt. Säugetiere Martin-Luther-King-Platz 3 D-20146 Hamburg, GERMANY +49(0)40 42838 6231 ivan.calan...@uni-hamburg.de ** http://www.for771.uni-bonn.de http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php
[R] Reg : null values in kmeans
Hi, I am using the k-means algorithm for clustering. My data contains a few null/NA values, and kmeans doesn't cluster with those values. Is there any option like na.omit which can skip these null values and cluster the remaining values? Thanks, Raji

-- View this message in context: http://r.789695.n4.nabble.com/Reg-null-values-in-kmeans-tp3085518p3085518.html
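One possible approach, sketched with simulated data: cluster only the complete rows and write the cluster labels back into a vector that keeps NA for the rows that were left out. The data frame dat and the names ok and cluster are assumptions for this illustration. Dropping rows is the simplest option but not the only one; imputing the missing values first is an alternative.

```r
set.seed(123)
dat <- data.frame(
  x = c(rnorm(10, mean = 0), rnorm(10, mean = 5)),
  y = c(rnorm(10, mean = 0), rnorm(10, mean = 5))
)
dat$x[c(3, 15)] <- NA   # introduce a few missing values

# Rows without any NA.
ok <- complete.cases(dat)

# Cluster only the complete rows.
km <- kmeans(dat[ok, ], centers = 2)

# Map the labels back to the original row order; incomplete rows stay NA.
cluster <- rep(NA_integer_, nrow(dat))
cluster[ok] <- km$cluster
cluster
```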
Re: [R] check for item in vector
Hi CH, Check

    ?is.element
    ?"%in%"

HTH, Jorge

On Mon, Dec 13, 2010 at 9:48 AM, C.H. wrote:

Dear R users, Suppose I have an vector like this: animal <- c("Tiger","Panda") I would like to know is there any function that check for the existence of certain item in a vector. e.g. func("Tiger",animal) # check for the existence of Tiger TRUE func("Acacia",animal) # Acacia is not an item of the animal vector FALSE I know that it can be done by for loop. But I would like to know is there any built-in function for that. Thank you very much. CH -- CH Chan
Re: [R] check for item in vector
CH, How about any():

    any("Tiger" == animal)

The function which() will tell you the index of any match:

    which("Tiger" == animal)

You should also look at the match function.

Dave

From: C.H. chainsawti...@gmail.com
To: R-help r-help@r-project.org
Date: 12/13/2010 08:50 AM
Subject: [R] check for item in vector
Sent by: r-help-boun...@r-project.org

Dear R users, Suppose I have an vector like this: animal <- c("Tiger","Panda") I would like to know is there any function that check for the existence of certain item in a vector. e.g. func("Tiger",animal) # check for the existence of Tiger TRUE func("Acacia",animal) # Acacia is not an item of the animal vector FALSE I know that it can be done by for loop. But I would like to know is there any built-in function for that. Thank you very much. CH -- CH Chan
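The suggestions from this thread side by side, as a small runnable sketch (the result variable names are invented for the example):

```r
animal <- c("Tiger", "Panda")

has_tiger  <- "Tiger" %in% animal            # membership test
has_acacia <- is.element("Acacia", animal)   # same test, function form
any_tiger  <- any(animal == "Tiger")         # TRUE if at least one element matches
idx_panda  <- which(animal == "Panda")       # positions of all matches
pos_panda  <- match("Panda", animal)         # position of the first match
pos_acacia <- match("Acacia", animal)        # NA when there is no match
```

%in% and is.element() answer the yes/no question directly, while which() and match() are more useful when the position of the item is needed.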
[R] booklet on using R for biomedical statistics
Dear all, I've written a small booklet on using R for biomedical statistics (mostly focussed on cohort and case-control studies), available here under a Creative Commons Attribution 3.0 License: A Little Book of R for Biomedical Statistics http://a-little-book-of-r-for-biomedical-statistics.readthedocs.org/ I would be grateful for feedback and comments. Kind Regards, Avril

Avril Coghlan University College Cork, Ireland
[R] [R-pkgs] batchfiles 0.6-0
batchfiles is a set of batch, javascript and HTML Application files that are useful for running R and associated programs on Windows. Version 0.6-0 updates them for the new architecture-specific directory structure in R 2.12.0. A few of the lesser used utilities have been dropped.

Each batchfile is self contained. To install, just place all of them, or just those that you wish to use, anywhere on your path. The batch command: path will show you which folders are on your path.

DOWNLOAD

They can be downloaded individually from the svn repository available via the home page, or they can be downloaded all at once in a zip file from CRAN here: http://cran.r-project.org/contrib/extra/batchfiles/

MORE INFO

More info is available from the home page: http://batchfiles.googlecode.com

LIST OF PROGRAMS

Legend:
    h  = no args gives help
    0  = common usage is to enter command name without arguments
    d  = in development
    *  = all files marked with one star are the same. Program checks name by which it is called to determine action.
    ** = all files marked with two stars are the same. Program checks name by which it is called to determine action.

    #Rscript.bat - put at top of R file to make it a batch file (h) (*)
    clip2r.js - pastes clipboard into Rgui. See comments in file for use from vim. (0)(d)
    copydir.bat - copy a library from one version of R to another (h)
    el.js - run elevated - Vista and up, e.g. el Rgui runs R elevated
    find-miktex.hta - GUI to find MiKTeX (0)
    kopy.bat - copy Rcmd to other batch files (h)(d)
    movedir.bat - move library from one version of R to another (h)
    R.bat - like R.exe but finds R from registry (0) (*)
    Rcmd.bat - like Rcmd.exe but finds R from registry (h) (*)
    Rgui.bat - like Rgui.exe but finds R from registry (0) (*)
    RguiStart.bat - like Rgui.bat but arg1 defines folder to start R in (*)
    Rscript.bat - run .R script (h) (*)
    Rterm.bat - like rterm.exe but finds R from registry (h) (*)
    Rtidy.bat - reformat a .R file, e.g. Rtidy myfile.R outfile.R (d)
    Rtools.bat - place Rtools on path for remainder of console session (0) (*)
    Rversions.bat - list R and set R version in registry, e.g. on Vista: el cmd/c Rversions R-2.12.0 (0)
    show-svn-info.hta - show svn info if current folder is an svn checkout (0)
    Stangle.bat - run arg1 through Stangle (h) (**)
    Sweave.bat - run arg1 through Sweave (h) (**)

-- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com

___ R-packages mailing list r-packa...@r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages
[R] How to plot Ellipsoid like function
Dear R-Users,

I am currently trying to fit a tensorial function in its principal coordinate system. The function is given by:

1 ~ (x1^2 + x2^2 + x3^2 - chi0*(x1*x2 + x1*x3 + x2*x3))/eps0^2 + (x1 + x2 + x3)/xi0

where eps0 = 0.0066, chi0 = -0.66 and xi0 = 0.011 are obtained from experimental data using nls(). I am able to plot the experimental points that delivered the parameters of the function. For my thesis, however, I need to overlay the fitted surface. So far I am using the following code, which wonderfully plots the experimental points in 3D:

===
# from demo(bivar)
require(rgl)
require(misc3d)
require(MASS);
# New window
open3d()
# clear scene:
clear3d(all)
# setup env. That is, background, light and so on:
bg3d(color=#88)
light3d()
# spheres at points in principal strain space
#spheres3d(e1,e2,e3,radius=0.00025,color=#FF)
# draws points alternatively
plot3d(e1,e2,e3, col=#FF)
===

Following the examples on http://rgl.neoscientists.org/gallery.shtml I tried to overlay the point plot using surface3d. However, these were only functions of type y ~ f(x1, x2). I think that the surface could be plotted if I could provide the grid points correctly. Using

xyz.coords(1~(x1^2 + x2^2 + x3^2 - chi0*(x1*x2 + x1*x3 + x2*x3))/eps0^2 + (x1 + x2 + x3)/xi0, y = NULL, z = NULL)

unfortunately did not solve the problem. Is there any function that can generate the surface for the given function, such as ContourPlot3D in Mathematica?

Thanks a million!
Uwe

--
Uwe Wolfram
Dipl.-Ing. (Ph.D Student)
Institute of Orthopaedic Research and Biomechanics
Director and Chair: Prof. Dr. Anita Ignatius
Center of Musculoskeletal Research Ulm
University Hospital Ulm
Helmholtzstr. 14
89081 Ulm, Germany
Phone: +49 731 500-55301
Fax: +49 731 500-55302
http://www.biomechanics.de
Re: [R] Rapache on windows
I asked Jeff Horner that question a while back and he said it was Linux only. He doesn't have time to create a Windows version.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Santosh Srinivas
Sent: Saturday, December 11, 2010 8:37 AM
To: r-help@r-project.org
Subject: [R] Rapache on windows

Hello all,

I searched on the www but could not find installation instructions for rapache on Windows. The page says that the release runs on UNIX/Linux and Mac OS X operating systems. Has anyone been able to configure it on Windows? Any idea how to go about it?

Thank you.
Re: [R] How to plot Ellipsoid like function
On 13/12/2010 10:13 AM, Uwe Wolfram wrote:
> Dear R-Users,
>
> I am currently trying to fit a tensorial function in its principal coordinate system. The function is given by:
>
> 1 ~ (x1^2 + x2^2 + x3^2 - chi0*(x1*x2 + x1*x3 + x2*x3))/eps0^2 + (x1 + x2 + x3)/xi0
>
> where eps0 = 0.0066, chi0 = -0.66 and xi0 = 0.011 are obtained from experimental data using nls(). I am able to plot the experimental points that delivered the parameters of the function. For my thesis, however, I need to overlay the fitted surface. So far I am using the following code, which wonderfully plots the experimental points in 3D:
>
> ===
> # from demo(bivar)
> require(rgl)
> require(misc3d)
> require(MASS);
> # New window
> open3d()
> # clear scene:
> clear3d(all)
> # setup env. That is, background, light and so on:
> bg3d(color=#88)
> light3d()
> # spheres at points in principal strain space
> #spheres3d(e1,e2,e3,radius=0.00025,color=#FF)
> # draws points alternatively
> plot3d(e1,e2,e3, col=#FF)
> ===
>
> Following the examples on http://rgl.neoscientists.org/gallery.shtml I tried to overlay the point plot using surface3d. However, these were only functions of type y ~ f(x1, x2). I think that the surface could be plotted if I could provide the grid points correctly. Using
>
> xyz.coords(1~(x1^2 + x2^2 + x3^2 - chi0*(x1*x2 + x1*x3 + x2*x3))/eps0^2 + (x1 + x2 + x3)/xi0, y = NULL, z = NULL)
>
> unfortunately did not solve the problem. Is there any function that can generate the surface for the given function, such as ContourPlot3D in Mathematica?

See ?misc3d::contour3d
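The pointer to contour3d can be sketched as follows. This is only an illustration under stated assumptions, not the poster's final code: the grid range of +/-0.02 is a guess chosen so that the level-1 surface fits inside it, the names f, g and vals are made up for the example, and the drawing step is only attempted in an interactive session where misc3d and rgl are installed. The idea is to evaluate the fitted expression on a regular 3D grid and ask contour3d for the isosurface at level 1.

```r
# Fitted parameters quoted in the original question
eps0 <- 0.0066
chi0 <- -0.66
xi0  <- 0.011

# Right-hand side of the fitted model; the surface of interest is f(x1, x2, x3) == 1
f <- function(x1, x2, x3) {
  (x1^2 + x2^2 + x3^2 - chi0 * (x1 * x2 + x1 * x3 + x2 * x3)) / eps0^2 +
    (x1 + x2 + x3) / xi0
}

# Regular grid (range is an assumption; widen it if the surface is clipped)
g <- seq(-0.02, 0.02, length.out = 41)
grid <- expand.grid(x1 = g, x2 = g, x3 = g)

# expand.grid varies x1 fastest, which matches R's column-major array layout,
# so vals[i, j, k] equals f(g[i], g[j], g[k])
vals <- array(f(grid$x1, grid$x2, grid$x3), dim = rep(length(g), 3))

# Draw the level-1 isosurface on top of an existing rgl scene, if possible
if (interactive() &&
    requireNamespace("misc3d", quietly = TRUE) &&
    requireNamespace("rgl", quietly = TRUE)) {
  misc3d::contour3d(vals, level = 1, x = g, y = g, z = g,
                    color = "lightblue", alpha = 0.5, add = TRUE)
}
```

With add = TRUE the surface is added to the current rgl window, so calling this after plot3d(e1, e2, e3) should overlay the fitted surface on the experimental points.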
Re: [R] Tukey HSD not working
> TukeyHSD(fit)
Error in TukeyHSD(fit) : object 'fit' not found
> TukeyHSD(anova)
Error in UseMethod("TukeyHSD") : no applicable method for 'TukeyHSD' applied to an object of class "function"
> TukeyHSD(group)
Error in UseMethod("TukeyHSD") : no applicable method for 'TukeyHSD' applied to an object of class "c('integer', 'numeric')"
> TukeyHSD(y)
Error in UseMethod("TukeyHSD") : no applicable method for 'TukeyHSD' applied to an object of class "c('double', 'numeric')"
> TukeyHSD(data)
Error in UseMethod("TukeyHSD") : no applicable method for 'TukeyHSD' applied to an object of class "function"
> TukeyHSD(summary)
Error in UseMethod("TukeyHSD") : no applicable method for 'TukeyHSD' applied to an object of class "function"

--
View this message in context: http://r.789695.n4.nabble.com/Tukey-HSD-not-working-tp3084505p3085572.html
Sent from the R help mailing list archive at Nabble.com.
[R] Integration with LaTex and LyX
Hello,

Are there any packages which allow for a good integration between R and LaTeX / LyX? I'm interested mainly in automatic (automagic?) imports of plots/graphics.

Thanks in advance and best regards,

Eduardo de Oliveira Horta
Re: [R] Tukey HSD not working
> TukeyHSD(fit)
Error in UseMethod("TukeyHSD") : no applicable method for 'TukeyHSD' applied to an object of class "lm"
> TukeyHSD(anova)
Error in UseMethod("TukeyHSD") : no applicable method for 'TukeyHSD' applied to an object of class "function"
> TukeyHSD(group)
Error in UseMethod("TukeyHSD") : no applicable method for 'TukeyHSD' applied to an object of class "c('integer', 'numeric')"
> TukeyHSD(y)
Error in UseMethod("TukeyHSD") : no applicable method for 'TukeyHSD' applied to an object of class "c('double', 'numeric')"
> TukeyHSD(data)
Error in UseMethod("TukeyHSD") : no applicable method for 'TukeyHSD' applied to an object of class "data.frame"
> TukeyHSD(summary)
Error in UseMethod("TukeyHSD") : no applicable method for 'TukeyHSD' applied to an object of class "function"

--
View this message in context: http://r.789695.n4.nabble.com/Tukey-HSD-not-working-tp3084505p3085578.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] ggplot2 histograms
Hi Sandy,

The way I'd describe it is that you expected the width parameter of the position adjustment to be relative to the binwidth of the histogram - but it's actually absolute, and it has to be this way because there's currently no way for the position adjustment to know about the parameters of the geom.

Hadley

On Wed, Dec 1, 2010 at 10:07 AM, Small Sandy (NHS Greater Glasgow Clyde) sandy.sm...@nhs.net wrote:
> Sorry, this should have been to the whole list:
>
> Hadley
>
> I think I've sorted it out in my head but for the record, and just to be sure...
>
> I guess what I was expecting was that the width parameter would be independent of binwidth. Thus a width parameter of 0.5 would always indicate an overlap of half the bar. In fact the width is determined as a fraction of the binwidth, so if width is greater than binwidth the overlap will be with adjacent bins, not the bin it actually corresponds to. So in my example you can completely separate the data by putting
>
> ggplot(data=dafr, aes(x = d1, fill=d2)) + geom_histogram(binwidth = 2, position = position_dodge(width=7))
>
> Obviously this isn't helpful.
>
> I think the rules are:
> 1. the width of each bar equals binwidth divided by number of fill factors (in my case two)
> 2. total width of the visible bars would be centred on the centre of the bin
> 3. overlap of the visible bars is governed by the width parameter of position_dodge, with 0 being complete overlap and binwidth being complete (but touching) separation
>
> (More than binwidth would then mean space between the bars - and presumably overlap with adjacent bars - I don't think this would ever be useful.)
>
> Hope this makes sense.
> Sandy
>
> Sandy Small
> Clinical Physicist
> NHS Forth Valley (Tel: 01324567002)
> and NHS Greater Glasgow and Clyde (Tel: 01412114592)
>
> From: h.wick...@gmail.com [h.wick...@gmail.com] On Behalf Of Hadley Wickham [had...@rice.edu]
> Sent: 01 December 2010 14:27
> To: Small Sandy (NHS Greater Glasgow Clyde)
> Cc: ONKELINX, Thierry; r-help@r-project.org
> Subject: Re: [R] ggplot2 histograms
>
>> However if you do:
>>
>> ggplot(data=dafr, aes(x = d1, fill=d2)) + geom_histogram(binwidth = 1, position = position_dodge(width=0.99))
>>
>> The position of first bin which goes from 0-2 appears to start at about 0.2 (I accept that there is some white space to the left of this) while the position of the last bin (16-18) appears to start at about 15.8, so the whole histogram seems to be wrongly compressed into the scale. In my real data which has potentially 250 bins the problem becomes much more pronounced. Has anyone else noticed this? Is there a workaround?
>
> What do you expect this to do? The bars are one unit wide, but you've told position_dodge to treat them like they're only 0.99 units wide.
>
> Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
[R] How to ignore data
Dear list,

I have quite a small data set in which I need to have certain values ignored - not used when performing an analysis - but they need to be included later in the report that I write. Can anyone help with a suggestion as to how this can be accomplished?

Values to be ignored: 0 (zero) and 1. This is in addition to NA (null).

The reason is that I need to use the log10 of the values when performing the calculation. Currently I hand-massage the data set, about 100 values, of which fewer than 5 to 10 are in this category. The NA values are NOT the problem.

What I was hoping was that I did not have to use a series of if and ifelse statements. Perhaps there is a more elegant solution. Any ideas would be welcomed.

Regards
Steve
[R] Testing an interaction with a random effect in lmer
Hi,

I was hoping to get some advice regarding the testing of interactions when one factor is modelled as a random effect...

I have a model with binomial error structure where the response variable is the proportion of time spent at the main sett (animals were tracked for 28 consecutive days in each season, and were recorded either at the main sett or an outlier sett, so the response variable is a number out of 28). Animals from 9 social groups were tracked for 28 days in each of the four seasons of the year. Thus, in my model, 'individual' nested within 'social group' are my random error terms.

model <- lmer(binom~season+(1|group/individual),binomial,data=data1)

Group explains some variation in the sett use patterns, and what I was wanting to test and display was an interaction between season and group, as the raw data suggests that different groups may behave differently in different seasons. Is there a way to do this in the lmer package? When I put it in directly:

model <- lmer(binom~season*group+(1|group/individual),binomial,data=data1)

I get an error message:

Warning message:
In mer_finalize(ans) : gr cannot be computed at initial par (65)

The model runs with the following:

modelb <- lmer(binom~season*(1|group)+(1|group/badger),binomial,data=data1)

or

modelb <- lmer(binom~season+(1|group/badger)+(1|group:season),binomial,data=data1)

but here I guess I am modelling it as part of the random effect, so I can't pull out coefficients for the different groups in different seasons.

Thanks for any advice,
Nicola
Re: [R] Integration with LaTex and LyX
?Sweave

LyX is a bit harder, although you can probably export LyX docs to a *.tex and Sweave those fairly painlessly.

--
Jonathan P. Daily
Technician - USGS Leetown Science Center
11649 Leetown Road
Kearneysville WV, 25430
(304) 724-4480
Is the room still a room when its empty? Does the room, the thing itself have purpose? Or do we, what's the word... imbue it. - Jubal Early, Firefly

r-help-boun...@r-project.org wrote on 12/13/2010 10:27:43 AM:
> Hello,
>
> Are there any packages which allow for a good integration between R and LaTeX / LyX? I'm interested mainly in automatic (automagic?) imports of plots/graphics.
>
> Thanks in advance and best regards,
>
> Eduardo de Oliveira Horta
[R] odot symbol as a subscript in axes labels
Dear R users,

Do you know how to print the LaTeX \odot symbol as a subscript in axis labels?

Thank you in advance
Gaetano
Re: [R] Re : Re : descriptive statistics
What am I supposed to put into function(x)? The indicator for extracting the subgroups? data is the df. cluster={1,...,14}. This is how I was compiling:

for (i in 1:14) {
  my.summary <- data$cluster==i
  c(mean(?),var(?))
  summary(var_A~cluster, fun=my.summary,data=data)
  summary(var_B~cluster, fun=my.summary,data=data)
  summary(var_C~cluster, fun=my.summary,data=data)
  summary(var_D~cluster, fun=my.summary,data=data)
  summary(var_E~cluster, fun=my.summary,data=data)
  summary(var_F~cluster, fun=my.summary,data=data)
  summary(var_G~cluster, fun=my.summary,data=data)
}

Thanks for your patience.

--
View this message in context: http://r.789695.n4.nabble.com/descriptive-statistics-tp3085197p3085651.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Integration with LaTex and LyX
On Mon, Dec 13, 2010 at 4:55 PM, Jonathan P Daily jda...@usgs.gov wrote:
> ?Sweave
>
> LyX is a bit harder, although you can probably export LyX docs to a *.tex and Sweave those fairly painlessly.

LyX can play very nicely with Sweave and R. For the 1.6.x series, you could get started here [1]. If that's not enough, search the LyX wiki and ML archives.

For the 2.0.x series, LyX will provide out-of-the-box support for Sweave. This is routinely discussed on the ML and in the bug tracker (if you're looking for examples). At the moment 2.0 is not yet stable, but beta2 is already out and you are free to experiment.

Regards
Liviu

[1] http://cran.r-project.org/contrib/extra/lyx/
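For readers who have not used Sweave, the following is a minimal sketch of the workflow the replies refer to, driven entirely from R. The file name report.Rnw, the chunk label scatter and the use of the built-in cars data are assumptions made for the illustration. A chunk marked fig=TRUE is evaluated by Sweave(), the plot is saved to a file, and a matching \includegraphics line is written into the generated .tex file.

```r
# A tiny Sweave source document: LaTeX with one embedded R chunk.
# fig=TRUE asks Sweave to save the plot and include it in the output.
rnw <- c(
  "\\documentclass{article}",
  "\\begin{document}",
  "A scatterplot produced by R:",
  "<<scatter, fig=TRUE, echo=FALSE>>=",
  "plot(cars)",
  "@",
  "\\end{document}"
)

# Work in a scratch directory so no files are left in the current one
workdir <- tempfile("sweave-demo")
dir.create(workdir)
old <- setwd(workdir)
writeLines(rnw, "report.Rnw")

# Sweave() (package utils) writes report.tex plus the figure file(s)
Sweave("report.Rnw")
setwd(old)

tex <- readLines(file.path(workdir, "report.tex"))
```

The resulting report.tex can then be compiled with LaTeX in the usual way; the Sweave style file that the generated document loads ships with R.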
Re: [R] Tukey HSD not working
On 2010-12-13 07:29, PGZC wrote:
> > TukeyHSD(fit)
> Error in UseMethod("TukeyHSD") : no applicable method for 'TukeyHSD' applied to an object of class "lm"
> > TukeyHSD(anova)
> Error in UseMethod("TukeyHSD") : no applicable method for 'TukeyHSD' applied to an object of class "function"
> > TukeyHSD(group)
> Error in UseMethod("TukeyHSD") : no applicable method for 'TukeyHSD' applied to an object of class "c('integer', 'numeric')"
> > TukeyHSD(y)
> Error in UseMethod("TukeyHSD") : no applicable method for 'TukeyHSD' applied to an object of class "c('double', 'numeric')"
> > TukeyHSD(data)
> Error in UseMethod("TukeyHSD") : no applicable method for 'TukeyHSD' applied to an object of class "data.frame"
> > TukeyHSD(summary)
> Error in UseMethod("TukeyHSD") : no applicable method for 'TukeyHSD' applied to an object of class "function"

Okay, the basic problem appears to be that you know nothing about how to use R. Rather than take a scattergun approach and throw anything you can think of at the poor TukeyHSD function, why not take the time to work your way through 'An Introduction to R'? Or one of the many intro documents found on the web.

(Normally, I would advise a careful reading of the help page and liberal use of the str() function, but it seems that your needs are more basic.)

Peter Ehlers
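The error messages above all say the same thing: TukeyHSD() has a method for objects of class "aov" and nothing else, so it fails for an lm fit, a data frame, a vector or a function. A minimal sketch of the intended usage, using the built-in warpbreaks data as a stand-in for the poster's data (which is not shown in the thread):

```r
# TukeyHSD() needs a model fitted with aov(), not lm()
fit_lm  <- lm(breaks ~ tension, data = warpbreaks)
fit_aov <- aov(breaks ~ tension, data = warpbreaks)

# The lm fit reproduces the error reported in the thread
lm_attempt <- try(TukeyHSD(fit_lm), silent = TRUE)

# The aov fit works: pairwise differences between the tension levels
tk <- TukeyHSD(fit_aov)
print(tk)
```

The grouping variable must be a factor; tension already is one in warpbreaks, but an integer group code like the one in the thread would need factor(group) in the model formula.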
Re: [R] How does R compute sums of squares?
On Mon, Dec 13, 2010 at 8:20 AM, Ethan Arenson ethan.a.aren...@gmail.com wrote:
> Consider the following missing data problem:
>
> y = c(1, 2, 2, 2, 3)
> a = factor(c(1, 1, 1, 2, 2))
> b = factor(c(1, 2, 3, 1, 2))
> fit = lm(y ~ a + b)
> anova(fit)
>
> Analysis of Variance Table
>
> Response: y
>           Df  Sum Sq Mean Sq    F value    Pr(>F)
> a          1     0.8     0.8 1.3637e+33 < 2.2e-16 ***
> b          2 1.16667 0.58333 9.5461e+32 < 2.2e-16 ***
> Residuals  1     0.0     0.0
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> Warning message:
> In anova.lm(fit) : ANOVA F-tests on an essentially perfect fit are unreliable
>
> I am trying to understand how R computes sums of squares. I know that R makes a FORTRAN call to dqrls to make a QR decomposition of the design matrix, which returns (among other things),
>
> fit$effects
>   (Intercept)            a2            b2            b3
> -4.472136e+00  9.128709e-01  7.715167e-01  7.559289e-01  2.471981e-17
>
> Can anyone elaborate on how R computes these effects? I am not satisfied with the explanation that R provides with the help(effects) command.

Q'y

> Thanks in advance.
> Ethan
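The two-character answer "Q'y" can be unpacked with a short sketch using the data from the question. The variable names qty, ss_by_term and ss_resid are introduced here only for illustration. The effects are the response vector rotated by the transpose of Q from the QR decomposition of the design matrix; the sequential sum of squares for each term is then the sum of the squared effects belonging to that term's columns, and the remaining squared effects give the residual sum of squares.

```r
y <- c(1, 2, 2, 2, 3)
a <- factor(c(1, 1, 1, 2, 2))
b <- factor(c(1, 2, 3, 1, 2))
fit <- lm(y ~ a + b)

# effects = Q'y, using the QR decomposition stored in the fitted model
qty <- qr.qty(fit$qr, y)
effects_match <- isTRUE(all.equal(as.numeric(qty), unname(fit$effects)))

# fit$assign maps each design-matrix column to its term:
# 0 = intercept, 1 = a, 2 = b
p <- fit$rank
ss_by_term <- tapply(fit$effects[seq_len(p)]^2, fit$assign, sum)

# Whatever is left over after the first p effects is the residual SS
ss_resid <- sum(fit$effects[-seq_len(p)]^2)

print(ss_by_term)
print(ss_resid)
```

For these data the entries for terms 1 and 2 agree with the Sum Sq column that anova(fit) reports for a and b, and the residual sum of squares is zero up to rounding error, which is why anova() warns about an essentially perfect fit.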
Re: [R] How to ignore data
Steve Sidney sbsidney at mweb.co.za writes:
> Dear list,
>
> I have quite a small data set in which I need to have certain values ignored - not used when performing an analysis - but they need to be included later in the report that I write. Can anyone help with a suggestion as to how this can be accomplished?
>
> Values to be ignored: 0 (zero) and 1. This is in addition to NA (null).
>
> The reason is that I need to use the log10 of the values when performing the calculation. Currently I hand-massage the data set, about 100 values, of which fewer than 5 to 10 are in this category. The NA values are NOT the problem.
>
> What I was hoping was that I did not have to use a series of if and ifelse statements. Perhaps there is a more elegant solution.

It would help to have a more precise/reproducible example, but if your data set (a data frame) is d, and you want to ignore cases where the response variable x is either 0 or 1, you could say

ds <- subset(d,!x %in% c(0,1))

Some modeling functions (such as lm()), but not all of them, have a 'subset' argument so you can provide this criterion on the fly:

lm(...,subset=(!x %in% c(0,1)))
[R] Data permutation
Hello R Users,

I am new to R and trying to migrate from SAS. I have to convert a table that looks like this:

YEAR  FIRM       ID_NAME     VALUE
1994  Microsoft  John Doe    5
1994  Microsoft  Mark Smith  3
1994  Microsoft  David Ring  2

into this:

YEAR  FIRM       ID1         VALUE  ID2         VALUE
1994  Microsoft  John Doe    5      Mark Smith  3
1994  Microsoft  John Doe    5      David Ring  2
1994  Microsoft  Mark Smith  3      David Ring  2

I have to do it for all the possible pair combinations of ID_NAME linked to the same firm for any given year in my sample. Do you have any suggestions?

Thank you very much,
Matteo

--
View this message in context: http://r.789695.n4.nabble.com/Data-permutation-tp3085707p3085707.html
Sent from the R help mailing list archive at Nabble.com.
[R] library(Defaults) throws evaluation nested too deeply
Can anybody tell me why this is happening?

> library(Defaults)
> ls()
 [1] "class"              "foo"                "mlct"
 [4] "mlctTheme"          "overlayFunction"    "pct"
 [7] "pri"                "primaryFraction"    "reclassFractions"
[10] "reclassMatrix"      "reclassNames"       "sec"
[13] "thumb"              "thumbCheck"         "thumbForest"
[16] "thumbFractions"     "thumbFractions5min" "thumbStack"
> ls(all.names=TRUE)
 [1] ".Random.seed"       ".help.ESS"          "class"
 [4] "foo"                "mlct"               "mlctTheme"
 [7] "overlayFunction"    "pct"                "pri"
[10] "primaryFraction"    "reclassFractions"   "reclassMatrix"
[13] "reclassNames"       "sec"                "thumb"
[16] "thumbCheck"         "thumbForest"        "thumbFractions"
[19] "thumbFractions5min" "thumbStack"
> setDefaults(ls, all.names=TRUE)
> ls()
Error: evaluation nested too deeply: infinite recursion / options(expressions=)?

I got similar results trying to set defaults for unlist() and threw up my hands, but keep thinking about how useful it would be.

--
View this message in context: http://r.789695.n4.nabble.com/library-Defaults-throws-evaluation-nested-too-deeply-tp3085708p3085708.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Re : Re : descriptive statistics
An alternative way of getting summary statistics by a grouping variable is to use describe.by in the psych package. Using Jim Lemon's example:

library(psych)
testmat <- data.frame(sample(1:14,50,TRUE),rnorm(50),runif(50)) # make up the data
describe.by(testmat,testmat[1]) # get descriptive statistics

At 8:17 AM -0800 12/13/10, effeesse wrote:
> What am I supposed to put into function(x)? The indicator for extracting the subgroups? data is the df. cluster={1,...,14}. This is how I was compiling:
>
> for (i in 1:14) {
>   my.summary <- data$cluster==i
>   c(mean(?),var(?))
>   summary(var_A~cluster, fun=my.summary,data=data)
>   summary(var_B~cluster, fun=my.summary,data=data)
>   summary(var_C~cluster, fun=my.summary,data=data)
>   summary(var_D~cluster, fun=my.summary,data=data)
>   summary(var_E~cluster, fun=my.summary,data=data)
>   summary(var_F~cluster, fun=my.summary,data=data)
>   summary(var_G~cluster, fun=my.summary,data=data)
> }
>
> Thanks for your patience.
Re: [R] Re : Re : descriptive statistics
Do it with aggregate(), something like this should do:

aggregate(.~cluster, FUN=summary, data=data)

Now if you don't want to run summary(), replace it with the function you'd like.

HTH,
Ivan

On 12/13/2010 17:17, effeesse wrote:
> What am I supposed to put into function(x)? The indicator for extracting the subgroups? data is the df. cluster={1,...,14}. This is how I was compiling:
>
> for (i in 1:14) {
>   my.summary <- data$cluster==i
>   c(mean(?),var(?))
>   summary(var_A~cluster, fun=my.summary,data=data)
>   summary(var_B~cluster, fun=my.summary,data=data)
>   summary(var_C~cluster, fun=my.summary,data=data)
>   summary(var_D~cluster, fun=my.summary,data=data)
>   summary(var_E~cluster, fun=my.summary,data=data)
>   summary(var_F~cluster, fun=my.summary,data=data)
>   summary(var_G~cluster, fun=my.summary,data=data)
> }
>
> Thanks for your patience.

--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php
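To make the aggregate() suggestion concrete, here is a small sketch on made-up data. The data frame below is invented for the illustration (70 rows, 14 clusters, two of the var_* columns from the question); my.summary is the kind of function the original poster was asking about, taking the values of one variable within one cluster and returning the statistics of interest.

```r
# Made-up data: 14 clusters with 5 observations each
set.seed(42)
data <- data.frame(
  cluster = rep(1:14, each = 5),
  var_A   = rnorm(70),
  var_B   = runif(70)
)

# x is the vector of values of one variable within one cluster
my.summary <- function(x) c(mean = mean(x), var = var(x))

# One row per cluster; every other column is summarised by my.summary
res <- aggregate(. ~ cluster, data = data, FUN = my.summary)
print(res)
```

No explicit loop over the 14 clusters is needed. Because my.summary returns two values, each summarised column in res is a two-column matrix, so the mean of var_A in each cluster is res$var_A[, "mean"].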
[R] 15' lag of an irregular Time Series
Hi everyone,

I am new to R and I have a beginner's question on time series. I have an irregular time series that goes like this:

TIMESTAMP            PRICE
2010-11-29 12:29:28  25.255
2010-11-29 12:30:47  25.255
2010-11-29 12:36:58  25.230
2010-11-29 12:43:14  25.235
2010-11-29 12:44:18  25.235

The first column is the xts date-time and the second is the actual series. I want to lag the PRICE by 15 minutes or more, i.e. for each data point to take the first observation after 15' have passed. Is there a neat way of doing that?

I can obviously treat it as a regular TS with a very small delta (1'' in this case) and back-fill all the missing observations - then I can take 15*60=900 lags... But I think it would be very inefficient, especially in cases where one needs to consider milliseconds.

Thanks,
Vassili
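One way to do the lookup described above without padding the series to a regular grid is a sorted search with findInterval() from base R. This is a sketch rather than an xts-specific solution, and it makes two assumptions: the timestamps are sorted, and the last two rows of the example data are hypothetical additions (they are not in the original post) so that some observations actually have a match 15 minutes later. The left.open argument used here needs a reasonably recent version of R.

```r
# First five rows are from the post; the last two are hypothetical
ts <- as.POSIXct(c("2010-11-29 12:29:28", "2010-11-29 12:30:47",
                   "2010-11-29 12:36:58", "2010-11-29 12:43:14",
                   "2010-11-29 12:44:18", "2010-11-29 12:46:02",
                   "2010-11-29 12:52:30"), tz = "UTC")
price <- c(25.255, 25.255, 25.230, 25.235, 25.235, 25.240, 25.250)

# For each observation, the earliest time we are allowed to pick from
target <- ts + 15 * 60

# With left.open = TRUE, findInterval() counts the timestamps strictly
# before each target; adding 1 gives the first timestamp at or after it
idx <- findInterval(as.numeric(target), as.numeric(ts), left.open = TRUE) + 1L
idx[idx > length(ts)] <- NA   # no observation that late: leave as NA

result <- data.frame(TIMESTAMP = ts, PRICE = price,
                     TIMESTAMP_AFTER_15 = ts[idx],
                     PRICE_AFTER_15 = price[idx])
print(result)
```

Because findInterval() works on the raw numeric times, the same approach applies unchanged to sub-second timestamps, which avoids the back-filling the poster was worried about.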
Re: [R] How to ignore data
> Values to be ignored: 0 (zero) and 1. This is in addition to NA (null).
>
> The reason is that I need to use the log10 of the values when performing the calculation. Currently I hand-massage the data set, about 100 values, of which fewer than 5 to 10 are in this category.

This is probably a bad idea, perhaps even a VERY bad idea, though without knowing the details of what you are doing, one cannot be sure. The reason is that by removing these values you may be biasing the analysis. For example, if they are values that are below some threshold LOD (limit of detection), they are censored, and this censoring needs to be explicitly accounted for (e.g. with the survival package). If they represent in some sense unusual values (some call these outliers, a pejorative label that I believe should be avoided given all the scientific and statistical BS associated with the term), one is then bound to ask: How unusual? Why unusual? What do they tell us about the scientific questions of concern? If they are just errors of some sort (like negative incomes or volumes), well then, you're probably OK removing them.

The reason I mention this is that I have seen scientists too often use poor strategies for analyzing censored data, and this can end up producing baloney conclusions that don't replicate. It's a somewhat subtle, but surprisingly common issue (due to measurement limitations) that most scientists are trained neither to recognize nor to deal with. So their problematic approaches are understandable, but unfortunate. Therefore take care ... and, if necessary, consult your local statistician for help.

-- Bert

> The NA values are NOT the problem.
>
> What I was hoping was that I did not have to use a series of if and ifelse statements. Perhaps there is a more elegant solution.

>> It would help to have a more precise/reproducible example, but if your data set (a data frame) is d, and you want to ignore cases where the response variable x is either 0 or 1, you could say
>>
>> ds <- subset(d,!x %in% c(0,1))
>>
>> Some modeling functions (such as lm()), but not all of them, have a 'subset' argument so you can provide this criterion on the fly:
>>
>> lm(...,subset=(!x %in% c(0,1)))

--
Bert Gunter
Genentech Nonclinical Biostatistics
467-7374
http://devo.gene.com/groups/devo/depts/ncb/home.shtml
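A small sketch of the mechanics discussed in this thread, on invented data (the data frame d and its columns are made up for the illustration). It keeps every row so the excluded values can still appear in a report, and computes log10 only where the value is neither NA, 0 nor 1. Whether excluding those values is statistically defensible is a separate question, as the caution about censoring above makes clear; this only shows how to do it without a chain of if/ifelse calls.

```r
# Invented example data: 0, 1 and NA are the values to be set aside
d <- data.frame(id = 1:8,
                x  = c(0, 1, 5, 10, NA, 100, 1, 250))

# Logical flag marking the rows that take part in the analysis
keep <- !is.na(d$x) & !(d$x %in% c(0, 1))

# log10 only for the kept rows; all other rows stay NA but remain in d
d$logx <- NA_real_
d$logx[keep] <- log10(d$x[keep])

# Analysis functions can then skip the NA entries
mean_log <- mean(d$logx, na.rm = TRUE)

# The subset() form from the reply drops the 0/1 rows instead of flagging
# them; note that it keeps the NA row, since NA %in% c(0, 1) is FALSE
ds <- subset(d, !x %in% c(0, 1))
print(d)
```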
Re: [R] Data permutation
On Mon, Dec 13, 2010 at 11:37 AM, matteop mpr...@iese.edu wrote:

> Hello R User, I am new to R and trying to migrate from SAS. I have to convert a table that looks like this:
>
>     YEAR  FIRM       ID_NAME     VALUE
>     1994  Microsoft  John Doe    5
>     1994  Microsoft  Mark Smith  3
>     1994  Microsoft  David Ring  2
>
> into this:
>
>     YEAR  FIRM       ID1         VALUE  ID2         VALUE
>     1994  Microsoft  John Doe    5      Mark Smith  3
>     1994  Microsoft  John Doe    5      David Ring  2
>     1994  Microsoft  Mark Smith  3      David Ring  2
>
> I have to do it for all the possible pair combinations of ID_NAME linked to the same firm for any given year in my sample. Do you have any suggestions?

Here are a few possibilities:

1. merge/subset

    subset(merge(DF, DF, by = 1:2), as.character(ID_NAME.x) < as.character(ID_NAME.y))

2. sqldf with default names

    library(sqldf)
    sqldf("select * from DF a join DF b using(YEAR, FIRM) where a.ID_NAME < b.ID_NAME", method = "raw")

It's important that you use method = "raw" to override the automatic class assignment heuristic, which in this case tries to assign factors to the ID_NAME columns but gets the factor levels wrong. If you use method = "raw" it should work OK here.

3. sqldf with new names

This also works and does not need method = "raw":

    sqldf("select YEAR, FIRM, a.ID_NAME ID_NAME1, a.VALUE VALUE1, b.ID_NAME ID_NAME2, b.VALUE VALUE2 from DF a join DF b using(YEAR, FIRM) where a.ID_NAME < b.ID_NAME")

-- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
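For readers who want to try the merge/subset idea from the message above without sqldf, here is a minimal base-R sketch. DF is rebuilt from the three rows in the question; the suffixes "1" and "2" and the name pair_df are choices made for this illustration, not part of the original answer.

```r
# The three example rows from the question.
DF <- data.frame(YEAR    = 1994,
                 FIRM    = "Microsoft",
                 ID_NAME = c("John Doe", "Mark Smith", "David Ring"),
                 VALUE   = c(5, 3, 2),
                 stringsAsFactors = FALSE)

# Self-join on YEAR and FIRM: every ordered pair within a firm-year (3 x 3 rows here).
pair_df <- merge(DF, DF, by = c("YEAR", "FIRM"), suffixes = c("1", "2"))

# Keep each unordered pair once and drop the self-pairs.
pair_df <- pair_df[pair_df$ID_NAME1 < pair_df$ID_NAME2, ]
```

The comparison on names is only a device for keeping one of each pair; the row order will follow alphabetical order of the names rather than the order shown in the question.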
[R] Qs re writing/reading arrays, dataframes
Hi! I'm just getting started with R (and with the analysis of large datasets in general). I have several beginner-level questions whose answers I have not been able to find, and was hoping one of you would be kind enough to throw me a cluebrick or two.

I have a 6-dimensional numeric array (which I'll call myarray) that is fully named. By this I mean that non-NULL dimnames are assigned to all dimensions and, furthermore, the dimensions themselves are named. In fact, I created the dimnames attribute with an expression of the form:

    dimnames(myarray) <- list(line=c(...), "compound:name"=c(...), "compound:concentration"=c(...), time=c(...), replicate=c(...), "antibody:name"=c(...))

...where the values passed for line, "compound:name", ..., "antibody:name" are all vectors with mode character.

Question 1: I'd like to save this array in a file having an ASCII (i.e. non-binary) format that can be easily read by R. How can I format this file so that not only the dimnames are specified, but also the names of the dimensions (line, "compound:name", ..., "antibody:name") themselves? I thought that the output of write.table would give me a clue, but in fact this output does not mention the dimension names at all.

Question 2: In fact, I don't think that write.table is the right function to use in this case, because it seems to be designed for data frames rather than arrays. When write.table coerces myarray into a data.frame, dimensions 2-6 get collapsed into one. Hence, when the data is read back into R, it has the wrong dimensions. What's the best way to convert a fully named array like myarray into a data.frame, so as to preserve all the array's dimnames and dimension names?

Question 3: I've several times come across the advice that data.frames are usually the best choice of representation for such data. In my case, however, I don't see what I would be gaining by casting my array into a data.frame. In what kind of situation is it advantageous to work with a data.frame rather than an array?

Thanks in advance!

Roy
Re: [R] Rapache on windows
Thanks to R ... I just got myself a new Ubuntu setup just an hour back! It feels good! :-)

On Mon, Dec 13, 2010 at 8:46 PM, Bos, Roger roger@rothschild.com wrote:

> I asked Jeff Horner that question a while back and he said it was Linux only. He doesn't have time to create a Windows version.
>
> -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Santosh Srinivas Sent: Saturday, December 11, 2010 8:37 AM To: r-help@r-project.org Subject: [R] Rapache on windows
>
> > Hello all, I searched the web but could not find installation instructions for rapache on Windows. The page says that the release runs on UNIX/Linux and Mac OS X operating systems. Has anyone been able to configure it on Windows? Any idea how to go about it? Thank you.
Re: [R] How to ignore data
Thanks for the questions.

1) The data represents micro-organism counts, and a count of zero in this case is highly unlikely given the info we have, including the other participants.

2) The data is submitted in duplicate and then a standardised sum and difference is established and is used to calculate a Z-score, which is used as a measure of performance.

Given both 1) and 2) it is necessary to exclude a raw count of zero (since the log of 0 is meaningless) and a count of one (since the log of 1 of course is zero). I guess one can think of these values as outliers and that is what I am trying to exclude. There is ample evidence that such an approach is acceptable.

Thanks for the interest

Steve

On 2010/12/13 06:47 PM, Stavros Macrakis wrote:

> If you need to take the log of the values for your calculation, then what does it mean that you have 0 values in the input? And why do you need to exclude the 1 values? Are you sure that a) you are doing the correct kind of analysis and b) the analysis is correct if you exclude 0 and 1? -s
>
> On Mon, Dec 13, 2010 at 10:38, Steve Sidney sbsid...@mweb.co.za wrote:
>
> > Dear list, I have quite a small data set in which I need to have the following values ignored - not used when performing an analysis, but they need to be included later in the report that I write. Can anyone help with a suggestion as to how this can be accomplished?
> >
> > Values to be ignored: 0 (zero) and 1, in addition to NA (null). The reason is that I need to use the log10 of the values when performing the calculation. Currently I hand-massage the data set, about 100 values, of which fewer than 5 to 10 are in this category.
> >
> > The NA values are NOT the problem. What I was hoping was that I did not have to use a series of if and ifelse statements. Perhaps there is a more elegant solution. Any ideas would be welcomed.
> >
> > Regards Steve
Re: [R] Projecting data on a world map using long/lat
Hi Patrick, Thanks! That worked perfectly! M

-- View this message in context: http://r.789695.n4.nabble.com/Projecting-data-on-a-world-map-using-long-lat-tp3081298p3085834.html
[R] How to change leaf color by group in hclust plot or how to install A2R package in windows?
I want to change leaf color by group in an hclust plot. I've seen several answers about the A2R package, but I cannot install A2R and Rtools in Windows. Do you know how to install the A2R package in Windows, or how to change leaf color by group in an hclust plot? Thank you in advance, Soyeon
[R] Complicated nls formula giving singular gradient message
I'm attempting to calculate a regression in R that I normally use Prism for, because the formula isn't pretty by any means. Prism presents the formula (which is in the Prism equation library as "Heterologous competition with depletion", if anyone is curious) in these segments:

    KdCPM = KdnM*SpAct*Vol*1000
    R = NS+1
    S = (1+10^(X-LogKi))*KdCPM+Hot
    a = -1*R
    b = R*S+NS*Hot+BMax
    c = -1*Hot*(S*MS+BMax)
    Y = (-1*b+sqrt(b*b-4*a*c))/(2*a)

I'm only trying to solve for NS, LogKi, and BMax. I have everything else (KdnM, SpAct, Vol, Hot). I would use the simple formula at the bottom and then back-solve for the terms I'm looking for, but the simple formula at the bottom takes out the X term, which is contained within S, which is itself contained in both b and c. So I tried to substitute all the terms back into Y and got the following:

    formula <- as.formula(Y ~ (-1*(((NS+1)*((1+10^(X-LogKi))*(KdnM*SpAct*Vol*1000)+Hot))+NS*Hot+BMax)+sqrt((((NS+1)*((1+10^(X-LogKi))*(KdnM*SpAct*Vol*1000)+Hot))+NS*Hot+BMax)*(((NS+1)*((1+10^(X-LogKi))*(KdnM*SpAct*Vol*1000)+Hot))+NS*Hot+BMax)-4*(-1*(NS+1))*(-1*Hot*(((1+10^(X-LogKi))*(KdnM*SpAct*Vol*1000)+Hot)*NS+BMax))))/(2*-1*(NS+1)))

But trying to use that formula gives me the singular gradient message, which I wasn't entirely surprised by.

    fit <- nls(formula=formula, data=data, start=list(NS=.01, LogKi=-7, BMax=33000))
    Error in nls(formula = formula, data = all_no_outliers, start = list(NS = 0.01, : singular gradient

I've never worked with a formula this complicated in R. Is it even possible to do something like this? Any ideas or pointers in the right direction would be greatly appreciated.

Thanks, Jared
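One way to avoid substituting everything into a single expression is to write the Prism segments as an R function and let the formula call it. The sketch below is an illustration under assumptions, not a fix for the singular gradient: the function name het_comp_depletion is made up, "MS" in the expression for c is assumed to be a typo for NS (the expanded formula uses NS there), and the constants in the example call are hypothetical values chosen only to show that the function evaluates.

```r
# Sketch only: argument names follow the Prism segments quoted in the message.
# Assumption: "MS" in the expression for c is a typo for NS.
het_comp_depletion <- function(X, NS, LogKi, BMax, KdnM, SpAct, Vol, Hot) {
  KdCPM <- KdnM * SpAct * Vol * 1000
  R  <- NS + 1
  S  <- (1 + 10^(X - LogKi)) * KdCPM + Hot
  a  <- -1 * R
  b  <- R * S + NS * Hot + BMax
  cc <- -1 * Hot * (S * NS + BMax)   # named cc to avoid masking base::c
  (-1 * b + sqrt(b * b - 4 * a * cc)) / (2 * a)
}

# Hypothetical constants and X values, purely to show that the function evaluates.
y <- het_comp_depletion(X = seq(-10, -4, by = 1),
                        NS = 0.01, LogKi = -7, BMax = 33000,
                        KdnM = 1, SpAct = 1, Vol = 1, Hot = 10000)
```

A call of the form nls(Y ~ het_comp_depletion(X, NS, LogKi, BMax, KdnM, SpAct, Vol, Hot), data = ..., start = list(NS = ..., LogKi = ..., BMax = ...)) could then be tried, with the known constants supplied as columns of the data or as variables in the workspace. A singular gradient can still occur if the starting values are far from the solution or the parameters are badly scaled relative to each other.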
Re: [R] How to ignore data
Thanks for the comments.

Please see my reply to Stavros - the counts represent organisms, and by the way both the mean and the median are virtually unaffected by the removal of these values. Furthermore, experience rather than statistics indicates that these values are in fact gross errors and, as you of course mention, I think one can quite safely remove them.

I totally agree about the question of what is an outlier, but since these results are obtained from a Proficiency Testing programme, we are pretty sure what the anticipated results are, or at least their range, and in this case these values are considered errors.

Steve

On 2010/12/13 07:09 PM, Bert Gunter wrote:

> > Values to be ignored: 0 (zero) and 1, in addition to NA (null). The reason is that I need to use the log10 of the values when performing the calculation. Currently I hand-massage the data set, about 100 values, of which fewer than 5 to 10 are in this category.
>
> This is probably a bad idea, perhaps even a VERY bad idea, though without knowing the details of what you are doing, one cannot be sure. The reason is that by removing these values you may be biasing the analysis. For example, if they are values that are below some threshold LOD (limit of detection), they are censored, and this censoring needs to be explicitly accounted for (e.g. with the survival package). If they represent in some sense unusual values (some call these outliers, a pejorative label that I believe should be avoided given all the scientific and statistical BS associated with the term), one is then bound to ask: How unusual? Why unusual? What do they tell us about the scientific questions of concern? If they are just errors of some sort (like negative incomes or volumes), well then, you're probably OK removing them.
>
> The reason I mention this is that I have seen scientists too often use poor strategies for analyzing censored data, and this can end up producing baloney conclusions that don't replicate. It's a somewhat subtle but surprisingly common issue (due to measurement limitations) that most scientists are trained neither to recognize nor to deal with. So their problematic approaches are understandable, but unfortunate. Therefore take care ... and, if necessary, consult your local statistician for help.
>
> -- Bert
>
> > The NA values are NOT the problem. What I was hoping was that I did not have to use a series of if and ifelse statements. Perhaps there is a more elegant solution.
>
> > It would help to have a more precise/reproducible example, but if your data set (a data frame) is d, and you want to ignore cases where the response variable x is either 0 or 1, you could say
> >
> >     ds <- subset(d, !x %in% c(0,1))
> >
> > Some modeling functions (such as lm()), but not all of them, have a 'subset' argument so you can provide this criterion on the fly:
> >
> >     lm(..., subset=(!x %in% c(0,1)))
Re: [R] How to change leaf color by group in hclust plot or how to install A2R package in windows?
What error do you get when using:

    install.packages("A2R")

?

Contact me: tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)

On Mon, Dec 13, 2010 at 7:54 PM, Soyeon Kim yunni0...@gmail.com wrote:

> I want to change leaf color by group in an hclust plot. I've seen several answers about the A2R package, but I cannot install A2R and Rtools in Windows. Do you know how to install the A2R package in Windows, or how to change leaf color by group in an hclust plot? Thank you in advance, Soyeon
Re: [R] How to print colorful R output??
One possibility, though not as simple as what you ask for, is to use etxtStart and friends from the TeachingDemos package. Other possibilities include using GUI interfaces to R (though they may do more than you ask, and the colors might be different): emacs/ESS, vim, JGR, and others.

-- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111

> -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of casperyc Sent: Sunday, December 12, 2010 12:59 PM To: r-help@r-project.org Subject: Re: [R] How to print colorful R output??
>
> Hi All, My aim is actually not as complicated as you guys understood. What I want is this: when I print by clicking File -> Print..., it gives me a black-and-white output. But what I want is red for all the code I typed in and blue for the R output, just like the console. Thanks! (I am using Windows XP) casper
>
> -- View this message in context: http://r.789695.n4.nabble.com/How-to-print-colorful-R-output-tp3082750p3084578.html
[R] predict.lm[e] with formula passed as a variable
Dear all,

In a function I paste a string and convert it to a formula which I pass to lm[e]. The idea is to write a function which takes the name of the response variable, the name of the explanatory variable and the data frame as arguments and calculates an lm[e] (see example below). This works fine, but if I want to make a prediction on this model, R complains that the object holding the formula (form) cannot be found.

How can I circumvent this problem? I think I have to somehow provide predict with an environment holding the binding for the variable form, such that predict can resolve the variable, but I've no clue how to do this. Help is very much appreciated.

BR + thanks, Thorn

8<------

    df <- data.frame(x=factor(rep(1:2, each=10)), y=c(rnorm(10), rnorm(10, 10)), z=rep(1:10,2))
    test <- function(df, resp, x, rf, LM = FALSE) {
      form <- paste(resp, x, sep = " ~ ")
      form <- as.formula(form)
      if (LM) {
        mod <- lm(form, data=df)
      } else {
        rand <- as.formula(paste("~1", rf, sep = " | "))
        mod <- lme(form, data = df, random = rand)
      }
      x.new <- data.frame(levels(df[[x]]))
      names(x.new) <- x
      if (LM) predict(mod, x.new) else predict(mod, x.new, level=0)
    }
    test(df, "y", "x", "z")
    Error in eval(expr, envir, enclos) : object 'form' not found

8<------
Re: [R] How to ignore data
Inline below. -- Bert

On Mon, Dec 13, 2010 at 9:42 AM, Steve Sidney sbsid...@mweb.co.za wrote:

> Thanks for the questions. 1) The data represents micro-organism counts, and a count of zero in this case is highly unlikely given the info we have, including the other participants.

?? Censoring or an experimental failure? Big difference.

> 2) The data is submitted in duplicate and then a standardised sum and difference is established and is used to calculate a Z-score, which is used as a measure of performance.

Z scores are usually inappropriate for count data, which are discrete and tend to be skewed.

> Given both 1) and 2) it is necessary to exclude a raw count of zero (since the log of 0 is meaningless) and a count of one (since the log of 1 of course is zero).

False. The correct statement is: "Because I do not know the statistical methodology necessary to handle such discrete data with 0 counts, I exclude them." You are confusing your ignorance of statistical methodology with the need for spurious ad hoc treatments. 0 counts can and should be handled by appropriate statistical methods (e.g. possibly zero-inflated Poisson models via glm() or otherwise).

> I guess one can think of these values as outliers and that is what I am trying to exclude.

This is a wholly unscientific statement, I'm afraid.

> There is ample evidence that such an approach is acceptable.

What evidence, pray tell? A prior culture of inappropriate analyses, perhaps? I do not wish to engage in a debate about this, but, again, all I can say is that the above statement is not scientific. If I were consulting with you, I would say "Please show me your 'evidence.'" But, of course, I am not, and won't.

None of this is to say that you aren't correct in all respects. It is just that you have raised all my usual warning flags, so that I am somewhat skeptical. But that's MY problem. This is the last I will say on the matter, so feel free to get in the final word, as I will not respond. And I wish you success in your efforts.

-- Bert

> Thanks for the interest
>
> Steve
>
> On 2010/12/13 06:47 PM, Stavros Macrakis wrote:
>
> > If you need to take the log of the values for your calculation, then what does it mean that you have 0 values in the input? And why do you need to exclude the 1 values? Are you sure that a) you are doing the correct kind of analysis and b) the analysis is correct if you exclude 0 and 1? -s
> >
> > On Mon, Dec 13, 2010 at 10:38, Steve Sidney sbsid...@mweb.co.za wrote:
> >
> > > Dear list, I have quite a small data set in which I need to have the following values ignored - not used when performing an analysis, but they need to be included later in the report that I write. Can anyone help with a suggestion as to how this can be accomplished?
> > >
> > > Values to be ignored: 0 (zero) and 1, in addition to NA (null). The reason is that I need to use the log10 of the values when performing the calculation. Currently I hand-massage the data set, about 100 values, of which fewer than 5 to 10 are in this category.
> > >
> > > The NA values are NOT the problem. What I was hoping was that I did not have to use a series of if and ifelse statements. Perhaps there is a more elegant solution. Any ideas would be welcomed.
> > >
> > > Regards Steve

-- Bert Gunter Genentech Nonclinical Biostatistics
Re: [R] How to change leaf color by group in hclust plot or how to install A2R package in windows?
An example is described here that you can adapt:

http://r.789695.n4.nabble.com/coloring-leaves-in-a-hclust-or-dendrogram-plot-tt795496.html#a795497

HTH. Bryan

Bryan Hanson Professor of Chemistry & Biochemistry DePauw University, Greencastle IN USA

On 12/13/10 12:54 PM, Soyeon Kim yunni0...@gmail.com wrote:

> I want to change leaf color by group in an hclust plot. I've seen several answers about the A2R package, but I cannot install A2R and Rtools in Windows. Do you know how to install the A2R package in Windows, or how to change leaf color by group in an hclust plot? Thank you in advance, Soyeon
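For readers who cannot follow the link, leaf labels can also be coloured with base R alone, without A2R. The sketch below is an assumption about what is wanted, not a copy of the linked example: it uses the built-in USArrests data, takes the groups from cutree(), and the names leaf_cols and colour_leaf are made up for this illustration.

```r
# Cluster ten rows of a built-in data set and cut the tree into three groups.
hc <- hclust(dist(USArrests[1:10, ]))
groups <- cutree(hc, k = 3)                 # named integer vector: label -> group
leaf_cols <- c("red", "blue", "darkgreen")  # one colour per group

# Set the label colour on each leaf according to its group.
colour_leaf <- function(node) {
  if (is.leaf(node)) {
    lab <- attr(node, "label")
    attr(node, "nodePar") <- list(lab.col = leaf_cols[groups[[lab]]], pch = NA)
  }
  node
}

dend <- dendrapply(as.dendrogram(hc), colour_leaf)
plot(dend)
```

Any other grouping can be substituted for cutree(), as long as it is a vector named by the leaf labels.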
Re: [R] Qs re writing/reading arrays, dataframes
On Dec 13, 2010, at 12:20 PM, Roy Shimizu wrote:

> Hi! I'm just getting started with R (and with the analysis of large datasets in general). I have several beginner-level questions whose answers I have not been able to find, and was hoping one of you would be kind enough to throw me a cluebrick or two.
>
> I have a 6-dimensional numeric array (which I'll call myarray) that is fully named. By this I mean that non-NULL dimnames are assigned to all dimensions and, furthermore, the dimensions themselves are named. In fact, I created the dimnames attribute with an expression of the form:
>
>     dimnames(myarray) <- list(line=c(...), "compound:name"=c(...), "compound:concentration"=c(...), time=c(...), replicate=c(...), "antibody:name"=c(...))
>
> ...where the values passed for line, "compound:name", ..., "antibody:name" are all vectors with mode character.
>
> Question 1: I'd like to save this array in a file having an ASCII (i.e. non-binary) format that can be easily read by R.

?dump

> How can I format this file so that not only the dimnames are specified, but also the names of the dimensions (line, "compound:name", ..., "antibody:name") themselves? I thought that the output of write.table would give me a clue, but in fact this output does not mention the dimension names at all.
>
> Question 2: In fact, I don't think that write.table is the right function to use in this case, because it seems to be designed for data frames rather than arrays. When write.table coerces myarray into a data.frame, dimensions 2-6 get collapsed into one. Hence, when the data is read back into R, it has the wrong dimensions. What's the best way to convert a fully named array like myarray into a data.frame, so as to preserve all the array's dimnames and dimension names?
>
> Question 3: I've several times come across the advice that data.frames are usually the best choice of representation for such data. In my case, however, I don't see what I would be gaining by casting my array into a data.frame. In what kind of situation is it advantageous to work with a data.frame rather than an array?

When you are going to do regression.

> Thanks in advance!

David Winsemius, MD West Hartford, CT
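The ?dump pointer above can be made concrete with a small sketch. The 3-dimensional array below is a hypothetical stand-in for the 6-dimensional myarray, and the as.table() route for Question 2 is an addition of this illustration rather than something suggested in the thread.

```r
# A small stand-in for the fully named array described in the question.
myarray <- array(1:24, dim = c(2, 3, 4),
                 dimnames = list(line = c("a", "b"),
                                 "compound:name" = c("c1", "c2", "c3"),
                                 time = c("0", "1", "2", "4")))

# Question 1: dump() writes R source text that recreates the object,
# including the dimnames and the names of the dimensions.
f <- tempfile(fileext = ".R")
dump("myarray", file = f)
e <- new.env()
source(f, local = e)              # read it back into a separate environment
same <- identical(e$myarray, myarray)

# Question 2: one row per cell, one column per dimension plus the value.
long <- as.data.frame(as.table(myarray), responseName = "value")
```

In the long data frame, column names that are not syntactic R names (such as "compound:name") may be altered, so it is worth checking names(long) before relying on them.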
[R] curve
Hi All, I generated 5000 samples using the following script:

    test <- rnorm(5000,1000,100)
    test1 <- subset(test, subset=(test > 1100))
    d <- density(test)
    plot(d, main="Density of production")
    abline(v=mean(test1))

I wanted to do the following but faced difficulties:

1. to shade or color (blue) the curve using the criterion that values are greater than 1,100
2. I drew a vertical line, but I wanted the v-line to stay within the curve and not stick out above it
3. to suppress the output produced at the bottom of the curve (N=5000 and bandwidth=16.22)

Thanks in advance, Val
Re: [R] Browsing through a dataframe page by page (like with shell command more)
For data frames the best is probably the View function (note the capital V), which opens the data frame in a spreadsheet-like window that you can scroll through. For more complicated list or list-like objects, look at TkListView in the TeachingDemos package. For more general investigation of data objects, look at ?page and ?options, specifically the pager section.

-- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111

> -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Alexandre CESARI Sent: Monday, December 13, 2010 3:49 AM To: r-help@r-project.org Subject: [R] Browsing through a dataframe page by page (like with shell command more)
>
> Hello, I'm looking for an easy way to display a data.frame (or other variables) page by page, similarly to what is possible on a file using the more command in a standard UNIX shell. Any help would be greatly appreciated. Thanks Alexandre
Re: [R] survival package - calculating probability to survive a given time
--- begin included message ---

> I am trying to calculate the probability of surviving a given time by using the survival curve estimated by Kaplan-Meier. What is the right way to do that? As far as I can see, I cannot use the predict methods from the survival package?

--- end inclusion ---

The survfit function directly estimates the probability you want; no predict method is needed. It is similar to quantile(x, 1:9/10), in that you would not use predict on that either. As to getting the value at a specific time, use

    summary(fit, time=20)

This correctly handles the step function when 20 is exactly one of the event times.

Terry Therneau
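A tiny worked example may help make the summary() call above concrete. It assumes the survival package is installed (it ships with standard R distributions); the six observations are made up so that the Kaplan-Meier estimate can be checked by hand. The full argument name is times; the shorter time= in the message works through partial argument matching.

```r
library(survival)

# Six hypothetical subjects; status 1 = event, 0 = censored.
time   <- c(5, 10, 20, 20, 30, 40)
status <- c(1,  0,  1,  1,  0,  1)

fit <- survfit(Surv(time, status) ~ 1)

# Estimated probability of surviving beyond time 20.
# By hand: (5/6) at t = 5, then (2/4) at t = 20, so 5/12.
s20 <- summary(fit, times = 20)$surv
```

Because 20 is itself an event time, the estimate already includes the drop at 20, which is the behaviour described in the message.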
Re: [R] curve
Here's one way to do what I think you want:

    test <- rnorm(5000,1000,100)
    test1 <- subset(test, subset=(test > 1100))
    d <- density(test)
    plot(d, main="Density of production", xlab="")
    lines(d$x[d$x > 1100], d$y[d$x > 1100], col="blue", lwd=2)
    curveheight <- d$y[abs(d$x - mean(test1)) == min(abs(d$x - mean(test1)))]
    segments(x0=mean(test1), y0=0, y1=curveheight)

Sarah

On Mon, Dec 13, 2010 at 1:44 PM, Val valkr...@gmail.com wrote:

> Hi All, I generated 5000 samples using the following script:
>
>     test <- rnorm(5000,1000,100)
>     test1 <- subset(test, subset=(test > 1100))
>     d <- density(test)
>     plot(d, main="Density of production")
>     abline(v=mean(test1))
>
> I wanted to do the following but faced difficulties:
>
> 1. to shade or color (blue) the curve using the criterion that values are greater than 1,100
> 2. I drew a vertical line, but I wanted the v-line to stay within the curve and not stick out above it
> 3. to suppress the output produced at the bottom of the curve (N=5000 and bandwidth=16.22)
>
> Thanks in advance, Val

-- Sarah Goslee http://www.functionaldiversity.org
Re: [R] predict.lm[e] with formula passed as a variable
Thorn, Here's how I do it: retval - list(as.name('lm'), formula=as.formula(paste(Response, ~, Explan, sep='')), data=as.name(Data)) #... optionally add other arguments retval - eval(as.call(retval)) Dave From: Thaler, Thorn, LAUSANNE, Applied Mathematics thorn.tha...@rdls.nestle.com To: r-help@r-project.org Date: 12/13/2010 12:16 PM Subject: [R] predict.lm[e] with formula passed as a variable Sent by: r-help-boun...@r-project.org Dear all, In a function I paste a string and convert it to a formula which I pass to lm[e]. The idea is to write a function which takes the name of the response variable and the explanatory variable and the data frame as an argument and calculates an lm[e]. (see example below) This works fine, but if I want to make a prediction on this model, R complains that the object holding the formula (form) cannot be found. How can I circumvent this problem? I think I've to provide somehow an environment to predict holding the binding for the variable form, such that predict can resolve the variable, but I've no clue how to do this. Help is very much appreciated. BR + thanks, Thorn 8 df - data.frame(x=factor(rep(1:2, each=10)), y=c(rnorm(10), rnorm(10, 10)), z=rep(1:10,2)) test - function(df, resp, x, rf, LM = FALSE) { form - paste(resp, x, sep = ~ ) form - as.formula(form) if (LM) { mod - lm(form, data=df) } else { rand - as.formula(paste(~1, rf, sep = | )) mod - lme(form, data = df, random = rand) } x.new - data.frame(levels(df[[x]])) names(x.new) - x if (LM) predict(mod, x.new) else predict(mod, x.new, level=0) } test(df, y, x, z) Error in eval(expr, envir, enclos) : object 'form' not found 8 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
Re: [R] curve
Thanks Sarah,

> 1. to shade or color (blue) the curve using the criterion that any values greater than 11,000

I think I was not clear on the above point. I want to shade not the line but the area under the curve.

Also, your last line of code,

    segments(x0 = mean(test1), y0 = 0, y1 = curveheight)

gave me the following error message:

    Error in segments(x0 = mean(test1), y0 = 0, y1 = curveheight) :
      element 3 is empty;
      the part of the args list of '.Internal' being evaluated was:
      (x0, y0, x1, y1, col = col, lty = lty, lwd = lwd, ...)

Could you check it, please?
Re: [R] Re : Re : descriptive statistics
You could also use aggstat() of package tdisplay (available at http://forums.cirad.fr/logiciel-R/viewtopic.php?t=3367). See the help page.

> mydata <- data.frame(
+   y1 = c(NA, rnorm(n = 8, mean = 10, sd = 5), NA),
+   y2 = c(rep(NA, 2), rnorm(n = 6, mean = 10, sd = 5), rep(NA, 2)),
+   y3 = rnorm(n = 10, mean = 10, sd = 5),
+   y4 = rnorm(n = 10, mean = 10, sd = 5),
+   f1 = rep(c("a", NA, "b"), times = c(3, 1, 6)),
+   f2 = rep(c("c", "d", NA), times = c(5, 3, 2)),
+   f3 = rep(c("e", "f", "g"), times = c(3, 3, 4))
+ )
> mydata
           y1        y2        y3        y4   f1   f2 f3
1          NA        NA 11.277582 13.120160    a    c  e
2  -0.7843488        NA 18.633881  9.095533    a    c  e
3  11.626     15.563409  9.433654 16.062916    a    c  e
4  12.2523768  5.567119 19.381132 13.734706 <NA>    c  f
5  11.4456084  8.170626  5.039419  7.135086    b    c  f
6  16.1444098  2.518970  7.468279  5.441936    b    d  f
7   9.4774380  5.114297 14.777489  8.884707    b    d  g
8  13.9189684 13.090211 17.060803 12.467241    b    d  g
9  12.0196222        NA  4.551620  9.506194    b <NA>  g
10         NA        NA  8.377446  6.572499    b <NA>  g

> aggstat(formula = cbind(y1, y2, y3) ~ f1 + f2, data = mydata, FUN = mean)
  f1 f2        y1        y2        y3
1  a  c  5.435602 15.563409 13.115039
2  b  c 11.445608  8.170626  5.039419
3  b  d 13.180272  6.907826 13.102191

See also the function univar():

> mydata <- data.frame(
+   f1 = c(NA, rep("a", 2), rep("b", 5), NA, "a", "a"),
+   f2 = rep(c("c", "d"), times = c(5, 6)),
+   f3 = rep(c("e", NA, "f"), times = c(4, 1, 6)),
+   y1 = c(rnorm(n = 9, mean = 10, sd = 5), NA, 2.1)
+ )
> mydata
     f1 f2   f3        y1
1  <NA>  c    e  6.948897
2     a  c    e 20.115954
3     a  c    e 13.569935
4     b  c    e 12.159732
5     b  c <NA> 11.862606
6     b  d    f 21.610803
7     b  d    f 10.820413
8     b  d    f 13.200561
9  <NA>  d    f  9.694245
10    a  d    f        NA
11    a  d    f  2.10

> univar(formula = y1 ~ f1, data = mydata)
  f1 NA's n   min    q25 median   mean    q75    max    sd   iqr  range    cv
1  a    1 3  2.10  7.835  13.57 11.929 16.843 20.116 9.119 9.008 18.016 0.764
2  b    0 5 10.82 11.863  12.16 13.931 13.201 21.611 4.376 1.338 10.790 0.314

> univar(formula = y1 ~ f1 + f2, data = mydata)
  f1 f2 NA's n    min    q25 median   mean    q75    max    sd   iqr  range    cv
1  a  c    0 2 13.570 15.206 16.843 16.843 18.479 20.116 4.629 3.273  6.546 0.275
3  a  d    1 1  2.100  2.100  2.100  2.100  2.100  2.100    NA 0.000  0.000    NA
2  b  c    0 2 11.863 11.937 12.011 12.011 12.085 12.160 0.210 0.149  0.297 0.017
4  b  d    0 3 10.820 12.010 13.201 15.211 17.406 21.611 5.669 5.395 10.790 0.373

--
Matthieu Lesnoff
CIRAD
Bamako, Mali

On 13 December 2010 17:04, Ivan Calandra ivan.calan...@uni-hamburg.de wrote:
> Do it with aggregate(), something like this should do:
>
>     aggregate(. ~ cluster, FUN = summary, data = data)
>
> Now if you don't want to run summary(), replace it with the function you'd like.
>
> HTH,
> Ivan
>
> Le 12/13/2010 17:17, effeesse a écrit :
> > What am I supposed to put into function(x)? The indicator for extracting the subgroups?
> > data is the df. cluster = {1,...,14}.
> > This is how I was compiling:
> >
> >     for (i in 1:14) {
> >       my.summary <- data$cluster == i
> >       c(mean(?), var(?))
> >       summary(var_A ~ cluster, fun = my.summary, data = data)
> >       summary(var_B ~ cluster, fun = my.summary, data = data)
> >       summary(var_C ~ cluster, fun = my.summary, data = data)
> >       summary(var_D ~ cluster, fun = my.summary, data = data)
> >       summary(var_E ~ cluster, fun = my.summary, data = data)
> >       summary(var_F ~ cluster, fun = my.summary, data = data)
> >       summary(var_G ~ cluster, fun = my.summary, data = data)
> >     }
> >
> > Thanks for your patience.
>
> --
> Ivan CALANDRA
> PhD Student
> University of Hamburg
> Biozentrum Grindel und Zoologisches Museum
> Abt. Säugetiere
> Martin-Luther-King-Platz 3
> D-20146 Hamburg, GERMANY
> +49(0)40 42838 6231
> ivan.calan...@uni-hamburg.de
> http://www.for771.uni-bonn.de
> http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php
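Ivan's aggregate() suggestion can be sketched with base R alone, replacing summary() by a function that returns the mean and variance the original poster asked about. The data below are simulated and the names `dat` and `my.summary` are hypothetical stand-ins; only `cluster` and the `var_A`-style column names echo the thread.

```r
set.seed(42)

# Hypothetical data: 14 clusters with 5 observations each, three numeric variables
dat <- data.frame(cluster = rep(1:14, each = 5),
                  var_A = rnorm(70, mean = 10, sd = 5),
                  var_B = rnorm(70, mean = 10, sd = 5),
                  var_C = rnorm(70, mean = 10, sd = 5))

# Summary function returning a named vector: mean and variance
my.summary <- function(x) c(mean = mean(x), var = var(x))

# One row per cluster; each variable becomes a two-column matrix (mean, var)
res <- aggregate(cbind(var_A, var_B, var_C) ~ cluster,
                 data = dat, FUN = my.summary)

res$var_A[, "mean"]   # per-cluster means of var_A
```

No loop over the 14 clusters is needed, since aggregate() does the splitting. Note that the formula interface drops rows containing NA by default (na.action = na.omit), which matters if the variables are missing in different rows.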
[R] stepAIC: plot predicted versus observed
Hi,

The generic plot function applied to a stepAIC result creates useful graphics for diagnosing multiple regressions. To create predicted-versus-observed plots, I usually look up the coefficients, copy them by hand, calculate R², then plot. Is there a more automated way to plot predicted versus observed values, with the associated R², using stepAIC or another function?

Kind regards,

S.-É. Parent
Université Laval, Québec
Canada

--
View this message in context: http://r.789695.n4.nabble.com/stepAIC-plot-predicted-versus-observed-tp3085991p3085991.html
Sent from the R help mailing list archive at Nabble.com.
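One possible way to automate this, sketched under assumptions: the selected model is an ordinary lm object, so fitted() and summary() give the predictions and R² directly, without copying coefficients. Base R's step() stands in here for MASS::stepAIC only to keep the sketch free of package dependencies, and the mtcars data set and variable choice are purely illustrative.

```r
# step() is used as a stand-in for MASS::stepAIC (an assumption made to keep
# this sketch dependency-free); the result is a fitted lm object.
full <- lm(mpg ~ wt + hp + disp + qsec, data = mtcars)
sel <- step(full, trace = 0)

obs <- model.response(model.frame(sel))   # observed response used in the fit
pred <- fitted(sel)                       # predicted values from the selected model
r2 <- summary(sel)$r.squared              # R-squared of the selected model

plot(pred, obs, xlab = "Predicted", ylab = "Observed",
     main = sprintf("R-squared = %.3f", r2))
abline(0, 1, lty = 2)                     # 1:1 reference line
```

For a least-squares fit with an intercept, this R² equals the squared correlation between observed and fitted values, which is a convenient cross-check on the plot.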
Re: [R] How to ignore data
Oh dear oh dear!!! another arrogant statistician/scientist

One asks for help and instead one gets an ear full!!! So much for the much vaunted helpful R community.

But thanks anyway, I guess you were trying

Steve

On 2010/12/13 08:17 PM, Bert Gunter wrote:
> Inline below.
> -- Bert
>
> On Mon, Dec 13, 2010 at 9:42 AM, Steve Sidney sbsid...@mweb.co.za wrote:
> > Thanks for the questions.
> >
> > 1) The data represents micro-organism counts and a count of zero in this case is highly unlikely given the info we have; including the other participants.
>
> ?? Censoring or an experimental failure? Big difference.
>
> > 2) The data is submitted in duplicate and then a standardised sum and difference is established and is used to calculate a Z-score which is used as a measure of performance.
>
> Z scores are usually inappropriate for count data, which are discrete and tend to be skew.
>
> > Given both 1) and 2) it is necessary to exclude a raw count of zero (since the log of 0 is meaningless) and a count of one (since the log of 1 of course is zero).
>
> False. Correct statement is: "Because I do not know the statistical methodology necessary to handle such discrete data with 0 counts, I exclude them." You are confusing your ignorance of statistical methodology with the need for spurious ad hoc treatments. 0 counts can and should be handled by appropriate statistical methods (e.g. possibly 0 inflated Poisson models via glm() or otherwise).
>
> > I guess one can think of these values as outliers and that is what I am trying to exclude.
>
> This is a wholly unscientific statement, I'm afraid.
>
> > There is ample evidence that such an approach is acceptable.
>
> What evidence, pray tell? -- a prior culture of inappropriate analyses, perhaps? I do not wish to engage in a debate about this, but, again, all I can say is that the above statement is not scientific. If I were consulting with you, I would say "Please show me your 'evidence.'" But, of course, I am not, and won't.
>
> None of this is to say that you aren't correct in all respects. It is just that you have raised all my usual warning flags, so that I am somewhat skeptical. But that's MY problem.
>
> This is the last I will say on the matter, so feel free to get in the final word, as I will not respond. And I wish you success in your efforts.
>
> -- Bert
>
> > Thanks for the interest
> > Steve
> >
> > On 2010/12/13 06:47 PM, Stavros Macrakis wrote:
> > > If you need to take the log of the values for your calculation, then what does it mean that you have 0 values in the input? And why do you need to exclude the 1 values? Are you sure that a) you are doing the correct kind of analysis and b) the analysis is correct if you exclude 0 and 1?
> > >
> > > -s
> > >
> > > On Mon, Dec 13, 2010 at 10:38, Steve Sidney sbsid...@mweb.co.za wrote:
> > > > Dear list
> > > >
> > > > I have quite a small data set in which I need to have the following values ignored - not used when performing an analysis but they need to be included later in the report that I write.
> > > >
> > > > Can anyone help with a suggestion as to how this can be accomplished
> > > >
> > > > Values to be ignored: 0 - zero and 1; this is in addition to NA (null)
> > > >
> > > > The reason is that I need to use the log10 of the values when performing the calculation.
> > > >
> > > > Currently I hand massage the data set, about a 100 values, of which less than 5 to 10 are in this category.
> > > >
> > > > The NA values are NOT the problem
> > > >
> > > > What I was hoping was that I did not have to use a series of if and ifelse statements. Perhaps there is a more elegant solution.
> > > >
> > > > Any ideas would be welcomed.
> > > >
> > > > Regards
> > > > Steve
Re: [R] curve
On Mon, Dec 13, 2010 at 2:12 PM, Ashta sewa...@gmail.com wrote:
> Thanks Sarah,
>
> > 1. to shade or color (blue) the curve using the criterion that any values greater than 11,000
>
> I think I was not clear in the above point. I want shade not the line but the area under the curve,

Here's an example of how to do that using polygon:
http://addictedtor.free.fr/graphiques/RGraphGallery.php?graph=7

> and Your last line of code,
>
>     segments(x0 = mean(test1), y0 = 0, y1 = curveheight)
>
> gave me the following error message
>
>     Error in segments(x0 = mean(test1), y0 = 0, y1 = curveheight) :
>       element 3 is empty;
>       the part of the args list of '.Internal' being evaluated was:
>       (x0, y0, x1, y1, col = col, lty = lty, lwd = lwd, ...)
>
> could you check it please

I checked it before I sent it to you. The code I provided works correctly on my computer (R 2.12.0, Linux). You could try this statement instead:

    segments(x0 = mean(test1), y0 = 0, x1 = mean(test1), y1 = curveheight)

Sarah

--
Sarah Goslee
http://www.functionaldiversity.org
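The polygon approach Sarah points to, combined with her corrected segments() call, might look roughly like the sketch below. It reuses the variable names from the thread; the seed, the logical vector `right`, and the use of approx() to interpolate the curve height (instead of picking the nearest grid point, as in the original code) are choices made for this illustration only.

```r
set.seed(123)
test <- rnorm(5000, 1000, 100)
test1 <- test[test > 1100]
d <- density(test)

# xlab = "" suppresses the default "N = ...  Bandwidth = ..." label
plot(d, main = "Density of production", xlab = "")

# Shade the area under the curve to the right of 1100: trace the curve,
# then close the shape along the baseline y = 0
right <- d$x > 1100
polygon(x = c(min(d$x[right]), d$x[right], max(d$x[right])),
        y = c(0, d$y[right], 0),
        col = "blue", border = NA)

# Vertical line at mean(test1) that stops at the curve; x1 is given
# explicitly, as in the corrected segments() call
curveheight <- approx(d$x, d$y, xout = mean(test1))$y
segments(x0 = mean(test1), y0 = 0, x1 = mean(test1), y1 = curveheight)
```

Supplying both x1 and y1 avoids relying on how a particular R version fills in missing segment coordinates, which appears to be what differed between the two posters' machines.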
Re: [R] How to ignore data
Steve Sidney <sbsidney at mweb.co.za> writes:

> Oh dear oh dear!!! another arrogant statistician/scientist
>
> One asks for help and instead one gets an ear full!!! So much for the much vaunted helpful R community.
>
> But thanks anyway, I guess you were trying
>
> Steve

I know I shouldn't bite, but didn't I give you a helpful answer? Shouldn't that be "One asks for help and gets an earful in addition to the help"?

There are a variety of possible responses to a question that suggests that the questioner wants to do something that one thinks is a bad idea: they represent various combinations of (1) (requested) help to perform the task asked and (2) (unrequested) advice. Personality, philosophy, and the degree of perceived unwisdom/danger in the specified activity determine the mix.

cheers
Ben Bolker
Re: [R] How to ignore data
On Mon, Dec 13, 2010 at 7:36 PM, Steve Sidney sbsid...@mweb.co.za wrote:
> Oh dear oh dear!!! another arrogant statistician/scientist
>
> One asks for help and instead one gets an ear full!!! So much for the much vaunted helpful R community.
>
> But thanks anyway, I guess you were trying

Steve, we're statisticians. we love data. we hate seeing it go to waste. every zero, every one, every NA value is dear to our hearts. Bert was showing the same concern that a mother does for her children. Don't hate him for that.

Knowing you have 100 values with about 5-10% 0/1 values is half the story we need - if the remaining 90-95% are in the thousands then clearly these low ones are failures, and everyone on this list will say treat as missing values, do X = X[X > 1] and carry on. However if the real values are in the tens and units then maybe something is going on.

You did say removing them hasn't affected previous analyses, but some more data would help - we just love data...

Barry
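Mechanically, the exclusion Barry mentions can be done with a logical mask, so the raw values stay available for the report while only the selected ones enter the log10 calculation. The sketch below shows only that mechanism; whether dropping 0 and 1 is statistically appropriate is exactly what the thread debates. The data and the names `counts`, `use` and `report` are hypothetical.

```r
# Hypothetical counts; NA, 0 and 1 are the values to leave out of the analysis
counts <- c(1200, 0, 850, 1, NA, 4300, 950, 1)

# Keep the raw data untouched and build a logical mask instead of editing values
use <- !is.na(counts) & counts > 1

# log10 only where the mask is TRUE; everything else stays NA
log_counts <- rep(NA_real_, length(counts))
log_counts[use] <- log10(counts[use])

# Analysis on the retained values
mean_log <- mean(log_counts, na.rm = TRUE)

# Report table: raw values remain visible, with a flag for what was analysed
report <- data.frame(raw = counts, used = use, log10 = log_counts)
```

This avoids chains of if and ifelse statements, and the `used` column documents which observations were set aside.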
[R] Does a formula object have a left hand side
Hello,

Does anyone know of a function that will determine whether or not a formula object has a left hand side? I.e., one that can differentiate between

    y ~ x + z

and

    ~ x + z

Perhaps I'm overlooking the obvious...

Thanks!
Re: [R] Does a formula object have a left hand side
attr(terms(formula), "response") is 1 if the formula has a left hand side and 0 otherwise.

At a lower level, you can look at length(formula): 2 means there is no LHS, 3 means there is (any other value indicates that someone made a call object that the parser would not make).

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Erik Iverson
> Sent: Monday, December 13, 2010 12:17 PM
> To: R-help
> Subject: [R] Does a formula object have a left hand side
>
> Hello,
>
> Does anyone know of a function that will determine whether or not a formula object has a left hand side? I.e., can differentiate between
>
>     y ~ x + z
>
> and
>
>     ~ x + z
>
> Perhaps I'm overlooking the obvious...
>
> Thanks!
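Both checks described above can be wrapped in a small helper. The function name `has_lhs` is hypothetical, introduced only for this sketch.

```r
# Hypothetical helper: TRUE if the formula has a left-hand side
has_lhs <- function(f) {
  stopifnot(inherits(f, "formula"))
  attr(terms(f), "response") == 1
}

has_lhs(y ~ x + z)    # two-sided formula
has_lhs(~ x + z)      # one-sided formula

# Lower-level view: a formula is a call to `~` with two or three elements
length(y ~ x + z)     # `~`, y, x + z
length(~ x + z)       # `~`, x + z
```

The length() check has the advantage of not calling terms(), which needs a data argument when the formula contains a dot (as in y ~ .).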
Re: [R] Does a formula object have a left hand side
Erik -

Perhaps the response attribute of the terms() function?

> formula1 = formula(y ~ x + z)
> formula2 = formula(~x + z)
> attr(terms(formula1), 'response')
[1] 1
> attr(terms(formula2), 'response')
[1] 0

Although there may be more direct ways.

- Phil Spector
Statistical Computing Facility
Department of Statistics
UC Berkeley
spec...@stat.berkeley.edu

On Mon, 13 Dec 2010, Erik Iverson wrote:
> Hello,
>
> Does anyone know of a function that will determine whether or not a formula object has a left hand side? I.e., can differentiate between
>
>     y ~ x + z
>
> and
>
>     ~ x + z
>
> Perhaps I'm overlooking the obvious...
>
> Thanks!
Re: [R] How to ignore data
On 2010-12-13 11:36, Steve Sidney wrote:
> Oh dear oh dear!!! another arrogant statistician/scientist
>
> One asks for help and instead one gets an ear full!!! So much for the much vaunted helpful R community.
>
> But thanks anyway, I guess you were trying
>
> Steve

Ouch!!

I didn't offer advice earlier because (a) I felt that Ben had adequately shown you how to do what you felt compelled to do and (b) Bert had more than adequately conveyed the appropriate warnings. But let me add two comments now:

1. You said that you wanted to eliminate 1s because the log of 1 is zero. That implies, at least to me, that you might also be inclined to eliminate 0s when you want to calculate the mean of 4, 5, 0, 9, 0, 2. But perhaps you were just incomplete in your problem description. Still, we do see a lot of statistical abuse on this list.

2. As is so frequent with requests for help, the asked-for (in the posting guide) reproducible example was missing. That might easily have modified at least some of the advice (as indicated by Barry).

Peter Ehlers
Re: [R] Does a formula object have a left hand side
Excellent, thank you all!
Re: [R] Integration with LaTex and LyX
I tried hard to write an automagic script to configure LyX so that you don't need to go through the instructions on CRAN (http://cran.r-project.org/contrib/extra/lyx/):

http://yihui.name/en/2010/10/how-to-start-using-pgfsweave-in-lyx-in-one-minute/

This works for LyX 1.6.x and the major OSes with probability 95%. There will be substantial changes in LyX 2.0, and I will need to modify my configurations after LyX 2.0 is out (hopefully early next year).

Regards,
Yihui
--
Yihui Xie xieyi...@gmail.com
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA

On Mon, Dec 13, 2010 at 9:27 AM, Eduardo de Oliveira Horta eduardo.oliveiraho...@gmail.com wrote:
> Hello,
>
> Are there any packages which allow for a good integration between R and LaTeX / LyX? I'm interested mainly in automatic (automagic?) imports of plots/graphics.
>
> Thanks in advance and best regards,
>
> Eduardo de Oliveira Horta
[R] Date variable error
Greetings

In attempting to create a date variable based on month (e.g., February, April, etc.) and year (e.g., 2006) data, wherein I converted Month to a factor with Jan=1...Dec=12, I used the following command:

    data$Date <- mdy.date(month = data$Month, day = 15, year = data$Year)

however, I get the message

    Error: trunc not meaningful for factors

Any advice?

Cheers
Kurt

***
Kurt Lewis Helf, Ph.D.
Ecologist
EEO Counselor
National Park Service
Cumberland Piedmont Network
P.O. Box 8
Mammoth Cave, KY 42259
Ph: 270-758-2163
Lab: 270-758-2151
Fax: 270-758-2609

"Science, in constantly seeking real explanations, reveals the true majesty of our world in all its complexity." -Richard Dawkins

"The scientific tradition is distinguished from the pre-scientific tradition in having two layers. Like the latter it passes on its theories but it also passes on a critical attitude towards them. The theories are passed on not as dogmas but rather with the challenge to discuss them and improve upon them." -Karl Popper

"...consider yourself a guest in the home of other creatures as significant as yourself." -Wayside at Wilderness Threshold in McKittrick Canyon, Guadalupe Mountains National Park, TX

Cumberland Piedmont Network (CUPN) Homepage: http://tiny.cc/e7cdx
CUPN Forest Pest Monitoring Website: http://bit.ly/9rhUZQ
CUPN Cave Cricket Monitoring Website: http://tiny.cc/ntcql
CUPN Cave Aquatic Biota Monitoring Website: http://tiny.cc/n2z1o
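The error message suggests that a factor is reaching a function that expects numbers, so one likely fix is to convert Month to its integer month number first. The sketch below uses base R's ISOdate() and as.Date() rather than mdy.date() to stay self-contained; the data frame `dat` is hypothetical, and the conversion assumes the factor levels are in calendar order, as described in the question.

```r
# Hypothetical data: Month stored as a factor whose levels are in calendar order
dat <- data.frame(Month = factor(c("February", "April", "December"),
                                 levels = month.name),
                  Year = c(2006, 2006, 2007))

# as.integer() on a factor returns the level index, i.e. 1..12 here because
# the levels were set to month.name (January ... December)
month_num <- as.integer(dat$Month)

# Base-R alternative to mdy.date(): a Date on the 15th of each month
dat$Date <- as.Date(ISOdate(dat$Year, month_num, 15))
```

With the date package, passing the same integer vector as the month argument of mdy.date() would presumably avoid the error as well. If the factor levels are not in calendar order, match(as.character(dat$Month), month.name) is a safer way to get the month number.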
[R] OFFTOPIC: BAD SCIENCE by Ben Goldacre
Folks:

This is off topic, but I believe many R-Help participants would be interested in this. My apologies to my British colleagues, who probably already know about this, and to others for whom this is a waste of their time.

Dr. Ben Goldacre, a British physician and science columnist, has written a book for popular consumption entitled "Bad Science: Quacks, Hacks, and Big Pharma Flacks". While much of it is concerned with fraudulent products (homeopathic treatments, nutritional supplements that reduce wrinkles, grow hair, etc.), readers may find large parts of it relevant to the role of statistical thinking in science (mostly medicine). For example, we all know about the placebo effect, but I found his chapter on it fascinating. He also has a lot to say about publication bias and why the public at large -- especially journalists -- need to understand the role of statistics in evaluating scientific claims.

His website and columns may also be of interest: http://www.badscience.net/

Cheers,
Bert

--
Bert Gunter
Genentech Nonclinical Biostatistics