[R] Automated figure production
Hello, everybody,

Two questions:

1) I am new to mailing lists, and to the r-help list in particular. Where exactly do I go online to see my question and the answers to it? If I use the searchable archives, or the archives by Robert King, I see my question but not the answers, although I know that at least one person posted an answer to r-help. Why don't I see the answers?

2) I need to produce 104 figures, 8 per graphics window. How do I automate the process so that R Gui opens 13 separate graphics windows and plots the figures? (I then paste them into Word, one by one.) Also, how do I save all 104 figures to one PDF file?

Here is the code that produces the figures. At the moment I change the j index in the for loop by hand, produce 8 figures, copy them to Word, then change j to 9:16, and so on.

  #HOW TO PRODUCE ALL 104 GRAPHS AT ONCE
  old.par <- par(no.readonly=TRUE)
  par(mfrow=c(4,2))
  for(j in 1:8) {
    plot(Cleaned[zz[,j],4], type="b", main=long.names[j], xlab="Month", ylab="MFR", col="blue")
    abline(h=0, col="red")
  }
  par(old.par)

Thanks for your help!
Sergey

--
Laziness is nothing more than the habit of resting before you get tired. - Jules Renard (writer)
Experience is one thing you can't get for nothing. - Oscar Wilde (writer)
When you are finished changing, you're finished. - Benjamin Franklin (President)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to fill between 2 stair plots
Williams Scott wrote:
> Hi all,
>
> I want to create a simple plot with 2 type='s' lines on it:
>
>   plot(a, b, type='s')
>   lines(x, y, type='s')
>
> I wish to then fill the area between the curves with a colour to accentuate the differences, e.g. col=gray(0.95). I can't seem to come up with a simple method for this. Any pointers in the right direction much appreciated.

I don't think there is a really simple method for this. I'd start by converting the two 's' lines to ordinary lines, along the lines of

  N <- length(a)
  a1 <- c(a[1], rep(a[-1], each=2), a[N])  # possibly a[N]+a_bit for the final step
  b1 <- rep(b, each=2)

x1 and y1 similarly, then

  polygon(c(a1, rev(x1)), c(b1, rev(y1)), col="grey")

(Did I confuse 's' and 'S'? Anyway, you get the idea.)

> Cheers
> Scott
> _
> Dr. Scott Williams MBBS BScMed FRANZCR
> Peter MacCallum Cancer Centre
> Melbourne, Australia

--
Peter Dalgaard
Dept. of Biostatistics, University of Copenhagen
Øster Farimagsgade 5, Entr.B, PO Box 2099, 1014 Cph. K, Denmark
Ph: (+45) 35327918  FAX: (+45) 35327907
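The conversion described in the reply above can be sketched as a small helper. This is only an illustration under assumptions: the function name `stair_vertices` and the example data are made up, and the helper assumes type = "s" (horizontal move first, then vertical) with at least two points per curve.

```r
# Turn points that would be drawn with type = "s" into the vertices
# of an ordinary line (hypothetical helper, not from the thread).
stair_vertices <- function(x, y) {
  n <- length(x)
  list(x = c(x[1], rep(x[-1], each = 2)),   # x1, x2, x2, x3, x3, ...
       y = c(rep(y[-n], each = 2), y[n]))   # y1, y1, y2, y2, ..., yn
}

# Made-up example data
a <- 1:6; b <- c(1.0, 0.9, 0.7, 0.6, 0.4, 0.3)
x <- 1:6; y <- c(1.0, 0.8, 0.5, 0.4, 0.2, 0.1)

s1 <- stair_vertices(a, b)
s2 <- stair_vertices(x, y)

# Set up the axes, fill between the two step curves, then draw the curves on top
plot(a, b, type = "n", ylim = range(b, y))
polygon(c(s1$x, rev(s2$x)), c(s1$y, rev(s2$y)), col = gray(0.95), border = NA)
lines(a, b, type = "s")
lines(x, y, type = "s")
```

Drawing the polygon first and the step lines afterwards keeps the lines visible on top of the fill. If the two curves do not start and end at the same x value, the polygon is closed with a straight edge, which may or may not be what is wanted.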
Re: [R] how to fill between 2 stair plots
Williams Scott wrote:
> I want to create a simple plot with 2 type='s' lines on it:
>
>   plot(a, b, type='s')
>   lines(x, y, type='s')
>
> I wish to then fill the area between the curves with a colour to accentuate the differences, e.g. col=gray(0.95).
> [snip]

?polygon might be useful. See also demo(graphics).

Petr

--
Petr Klasterecky
Dept. of Probability and Statistics
Charles University in Prague
Czech Republic
Re: [R] Automated figure production
On 26 Feb 2007 at 9:14, Sergey Goriatchev wrote:

> 1) I am new to mailing lists, and to the r-help list in particular. Where exactly do I go online to see my question and the answers to it? If I use the searchable archives, or the archives by Robert King, I see my question but not the answers, although I know that at least one person posted an answer to r-help. Why don't I see the answers?

You are close. Try clicking on "latest R-help" in the searchable R-list archive.

> 2) I need to produce 104 figures, 8 per graphics window. How do I automate the process so that R Gui opens 13 separate graphics windows and plots the figures? (I then paste them into Word, one by one.) Also, how do I save all 104 figures to one PDF file?

See ?pdf and ?png.

> Here is the code that produces the figures. At the moment I change the j index in the for loop by hand, produce 8 figures, copy them to Word, then change j to 9:16, and so on.

For example, wrap your code in a cycle that opens one png file per pass, with a file name based on the cycle index:

  for(i in 1:13) {
    # the file name is only an example; build it from the cycle index
    png(paste("figures", i, ".png", sep=""), 800, 800)
    old.par <- par(no.readonly=TRUE)
    par(mfrow=c(4,2))
    for(j in ((i-1)*8+1):(i*8)) {
      plot(Cleaned[zz[,j],4], type="b", main=long.names[j], xlab="Month", ylab="MFR", col="blue")
      abline(h=0, col="red")
    }
    par(old.par)
    dev.off()
  }

or do something similar without the cycle using the pdf device; see its onefile=TRUE option.

HTH
Petr

Petr Pikal
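The single-PDF variant mentioned in the reply above can be sketched as follows. The objects `Cleaned`, `zz` and `long.names` come from the original post, but their contents are not shown there, so the stand-ins below are simulated and purely hypothetical, as is the output file name.

```r
# Hypothetical stand-ins for the poster's objects (simulated, not real data):
# column 4 of 'Cleaned' holds the series, 'zz' holds the row indices for
# each figure, 'long.names' holds one title per figure.
set.seed(1)
n.fig <- 104
Cleaned <- data.frame(a = 1, b = 1, c = 1, MFR = rnorm(240))
zz <- matrix(rep(1:240, n.fig), nrow = 240, ncol = n.fig)
long.names <- paste("Series", seq_len(n.fig))

out.file <- file.path(tempdir(), "all_figures.pdf")  # assumed file name

pdf(out.file, width = 8, height = 11, onefile = TRUE)  # one multi-page file
par(mfrow = c(4, 2))                # 8 figures per page
for (j in seq_len(n.fig)) {         # a new page starts after every 8 plots
  plot(Cleaned[zz[, j], 4], type = "b", main = long.names[j],
       xlab = "Month", ylab = "MFR", col = "blue")
  abline(h = 0, col = "red")
}
dev.off()                           # par() settings go away with the device
```

With 104 figures at 8 per page this should give 13 pages. For 13 on-screen windows instead, one option is an outer loop over pages that calls dev.new() and par(mfrow=c(4,2)) before each group of 8 plots; whether that opens separate windows depends on the platform and the default device.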
[R] Chi Square with two tab-delimited text files
Hi,

I want to do a chi-square test and I have two tab-delimited text files with Expected and Observed values to compare. Each file contains only the values, 48 rows by 116 columns. I have managed to do something with them, but I don't think it is right, as I got a p-value of 1.

In this case I used the read.table() function to read the values from the files, but I don't know if this was right.

  x=read.table(file="C:/Program Files/R/R-2.2.1/Projects/Stats EU/Expected input.txt")
  y=read.table(file="C:/Program Files/R/R-2.2.1/Projects/Stats EU/Observed input.txt")
  chisq.test(x,y)

          Pearson's Chi-squared test

  data:  x
  X-squared = 4.4602, df = 5405, p-value = 1

  Warning message:
  Chi-squared approximation may be incorrect in: chisq.test(x, y)

Maybe the scan() function is more correct? Using this I got:

  x=scan(file="C:/Program Files/R/R-2.2.1/Projects/Stats EU/Observed input.txt")
  Read 5568 items
  y=scan(file="C:/Program Files/R/R-2.2.1/Projects/Stats EU/Expected input.txt")
  Read 5568 items
  chisq.test(x,y)

          Pearson's Chi-squared test

  data:  x and y
  X-squared = 172306.4, df = 13880, p-value < 2.2e-16

  Warning message:
  Chi-squared approximation may be incorrect in: chisq.test(x, y)

Any help would be much appreciated.

Regards,
Carina
Re: [R] Double-banger function names: preferences and suggestions
hadley wickham wrote:
> What do you prefer/recommend for double-banger function names:
>
> 1 scale.colour
> 2 scale_colour
> 3 scaleColour
>
> 1 is more R-like, but conflicts with S3. 2 is a modern version of number 1, but not many packages use it. Number 3 is more java-like. (I like number 2 best)

Or you can be lisp-ish and use hyphens (or many other symbols) by quoting:

  `scale-colour` = 2
  ls()
  [1] "scale-colour"

but that requires further perversions:

  get("scale-colour")
  [1] 2

I like (3), aka camelCase - aka about a dozen other names: http://en.wikipedia.org/wiki/Camel_case - but that's mainly because it's widely used in Python, and Python syntax is just marvellous. The more R syntax tends to Python syntax the better. Let's get rid of curly brackets and make whitespace significant... But I digress. As usual.

ANytHiNg bUt sTUdLY cApS: http://en.wikipedia.org/wiki/StudlyCaps

Barry
Re: [R] some caracter dont work with JGR
On Fri, 2007-02-23, 19:21, Ronaldo Reis Junior wrote:
> Hi,
>
> I am testing JGR and I like it, but my ~ character doesn't work. My keyboard is Brazilian ABNT2. The key is OK; it is only in JGR that it doesn't work.
>
> Does anybody have any idea about this?

Yes, and it's a known problem; see e.g. http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6371251. It has been discussed on more than one occasion over at the RoSuDa devel list (http://www.rosuda.org/lists.shtml), which is a more appropriate place than R-help for questions on JGR and related projects.

HTH,
Henric

> Thanks
> Ronaldo
> --
> Prof. Ronaldo Reis Júnior
> UNIMONTES/Depto. Biologia Geral/Lab. Ecologia Evolutiva
> Campus Universitário Prof. Darcy Ribeiro, Vila Mauricéia
> CP: 126, CEP: 39401-089, Montes Claros - MG - Brasil
> Fone: (38) 3229-8190
Re: [R] nested design in lme, need help with specifying model
Dear Radka,

I'm not sure I quite understand your design, or quite where the nesting comes in. But a quick question: why are you adding species as a random effect as well as a fixed one? I don't think you can do this, or indeed should. I think this is why you get problems with your fixed effects. If you have 3 species, then species ought to be fixed. Replicate is more the sort of effect that ought to be random; this ought to pick up the fact that prey within one run of the experiment won't be independent. But if you only have two replicates per treatment (species of prey?), this will limit your ability to detect differences between species of prey, unless your within-replicate variation is very low. You can look at this very simply, though not quite as powerfully, by averaging the responses for each replicate and doing a non-nested anova.

Re your second analysis, this seems along the right lines. How have you coded replicate? This may explain your results. Without more details on the plot you did, it's difficult to help further.

regards
Mike

Radka Ptacnikova wrote (25/02/2007 23:32):
> Hi,
>
> I wonder if anyone can help me with specifying the right model for my analysis. I am a beginner with lme methods and, though I have already spent many hours studying various books and on-line help, I was unfortunately not able to find a solution to my problem on my own.
>
> Data structure:
> I studied the escape behavior of three species of prey in response to a predator. The prey specimens (many) were in a vessel together with one predator. Escape responses were video-recorded when a prey approached the predator closely enough and consequently jumped away. Each set was run twice, with a fresh predator and a fresh set of prey specimens, giving two replicates per treatment. An unequal number of shots (i.e. prey specimens) was analyzed in each of the two replicates for each of the three prey species (range 11-19). The data are therefore unbalanced, and the variance for treatments/replicates is also far from homogeneous, so a nested anova is not a good choice here. As the number of prey specimens was rather high, I assume that each shot represents a different prey individual.
>
> My questions:
> 1) Do the three prey species differ significantly in their escape response?
> 2) What was the variability between replicates within a species, and how much did it contribute to the overall variability?
>
> Now, to my best understanding, the model should be:
>
>   mod1 <- lme(Escape.parameter ~ Species, random = ~1|Species/Replicate)
>
> as I am interested in Species as a fixed effect and want to know the variability caused by Replicates nested within Species as random effects. However, when running this model, I get
>
>   Random effects:
>    Formula: ~1 | Species
>           (Intercept)
>   StdDev:    2.937479
>
>    Formula: ~1 | Replicate %in% Species
>           (Intercept) Residual
>   StdDev:    4.973931 4.266302
>
>   Fixed effects: Max_speed ~ Species
>                   Value Std.Error DF   t-value p-value
>   (Intercept) 23.792040  4.798143 39  4.958593       0
>   Spec2       -7.121766  6.747930  0 -1.055400     NaN
>   Spec3       -9.779830  6.725391  0 -1.454165     NaN
>
> So I get the variance within species and within replicates, but what the hell are these zero DFs, leading to NaN p-values, and how should I interpret them?
>
> Another model I tried was:
>
>   mod2 <- lme(Escape.parameter ~ Species, random = ~1|Replicate)
>
>   Random effects:
>    Formula: ~1 | Replicate
>            (Intercept) Residual
>   StdDev: 0.0002733313 5.180472
>
>   Fixed effects: Max_speed ~ Species
>                    Value Std.Error DF   t-value p-value
>   (Intercept)   26.00364  1.561971 41 16.647963   0e+00
>   SpeciesSpec2  -7.93297  2.056430 41 -3.857641   4e-04
>   SpeciesSpec3 -11.81048  1.962713 41 -6.017425   0e+00
>
> All right, I get the among-species differences, but I am confused here by the very low StdDev of Replicate as a random effect, since I know, e.g. from a plot, that it is relatively high. This leads me to think that something is wrong here.
>
> I'd appreciate any hints and suggestions.
>
> Radka
[R] Add-up duplicates and merge
Hello,

I have two matrices of data as below. I would like to add up the duplicates, in terms of the pair of names in the rows, and then merge the values in the second matrix onto the pairs as two new variables x3 and x4.

Input

  ,x1,x2
  jane.mike,31,43
  jane.steve,32,2
  jane.steve,5,3
  jim.mike,76,5
  jane.steve,4,4
  mike.steve,54,7
  mike.steve,5,7
  jane.mike,7,8

and

  ,y
  jane,0.3
  jim,0.4
  mike,0.1
  carl,0.5
  john,0.9
  steve,0.4
  dirk,0.2

Output:

  ,x1,x2,x3,x4
  jane.mike,38,51,0.3,0.1
  jane.steve,41,9,0.3,0.4
  jim.mike,76,5,0.4,0.1
  mike.steve,59,14,0.1,0.4

Any help appreciated,
Serguei
[R] PlotAffyRNAdeg on Estrogen Data
Hi everyone,

I'm trying to generate an RNA degradation plot of the Estrogen example data, but I seem to get an error. I've tried defining a ylim value, ylim=c(0,30), but that doesn't seem to work either. My code is as follows:

  RNAdeg <- AffyRNAdeg(Data)
  png(DegLoc, width=720, height=720)
  par(ann=FALSE)
  par(mar=c(3,3,0.1,0.1))
  plotAffyRNAdeg(RNAdeg, col=cols, cex.axis=1.2)
  Error in plot.window(xlim, ylim, log, asp, ...) :
          need finite 'ylim' values
  dev.off()
  null device
            1

Thanks in advance
Tony
Re: [R] Add-up duplicates and merge
Hello,

for the first task, change the names into a factor if they are character and have a look at ?by.

For the merge you will need strsplit() on character vectors (so be careful to change the factor back to character) and merge().

Regards,
Winfried

-----Original Message-----
From: Serguei Kaniovski
Sent: Monday, February 26, 2007 12:14 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Add-up duplicates and merge

> I have two matrices of data as below. I would like to add up the duplicates, in terms of the pair of names in the rows, and then merge the values in the second matrix onto the pairs as two new variables x3 and x4.
> [snip]
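A minimal sketch of the two steps, using the data from the question. Note that it swaps in aggregate() and a named-vector lookup in place of the by() and merge() route suggested above; the object names `res` and `parts` are made up for illustration.

```r
# Data from the question; the pair labels are kept as a character column
x <- data.frame(
  pair = c("jane.mike", "jane.steve", "jane.steve", "jim.mike",
           "jane.steve", "mike.steve", "mike.steve", "jane.mike"),
  x1 = c(31, 32, 5, 76, 4, 54, 5, 7),
  x2 = c(43, 2, 3, 5, 4, 7, 7, 8),
  stringsAsFactors = FALSE
)
y <- c(jane = 0.3, jim = 0.4, mike = 0.1, carl = 0.5,
       john = 0.9, steve = 0.4, dirk = 0.2)

# Step 1: add up the rows that share the same pair of names
res <- aggregate(cbind(x1, x2) ~ pair, data = x, FUN = sum)

# Step 2: split "jane.mike" into its two names and look each one up in y
parts <- do.call(rbind, strsplit(as.character(res$pair), ".", fixed = TRUE))
res$x3 <- unname(y[parts[, 1]])
res$x4 <- unname(y[parts[, 2]])
res
```

The fixed = TRUE argument matters here: without it, strsplit() treats "." as a regular expression that matches every character. The lookup assumes that every name in a pair appears in y; a missing name would produce NA.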
Re: [R] Chi Square with two tab-delimited text files
Carina Brehony wrote:
> Hi,
> I want to do a chi square test and I have two tab delimited text files with Expected and Observed values to compare. Each file contains only the values
> [snip]

There are a lot of chi^2 tests; most of them compare observed (O) and expected (E) quantities, and it is not clear which one you want to use. I'd guess a goodness-of-fit test, but who knows? See ?chisq.test and the examples given there. It also tells you that the y argument is ignored if x is a matrix (that's probably the reason why you get different results using read.table and scan).

Petr

--
Petr Klasterecky
Dept. of Probability and Statistics
Charles University in Prague
Czech Republic
Re: [R] Chi Square with two tab-delimited text files
Yes, I would like to do a goodness-of-fit test.

-----Original Message-----
From: Petr Klasterecky
Sent: 26 February 2007 11:50
To: Carina Brehony
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Chi Square with two tab-delimited text files

> I'd guess a goodness of fit test, but who knows?
> [snip]
Re: [R] Chi Square with two tab-delimited text files
It's a bit difficult to advise without knowing what the rows and columns represent, but why not just calculate the statistic yourself, given that you already have observed and expected values? For example:

  chi2 <- sum((y-x)^2/x)

On 26/02/07, Carina Brehony wrote:
> Yes, I would like to do a goodness-of-fit test.
> [snip]

--
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP
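A small worked sketch of the hand calculation suggested above. The counts are made up, and the degrees of freedom (number of cells minus one) are an assumption that only holds if the expected values were not estimated from the observed data.

```r
# Made-up observed and expected counts with the same shape and the same total
observed <- matrix(c(12, 8, 5, 15, 9, 11), nrow = 2)
expected <- matrix(c(10, 10, 6, 14, 10, 10), nrow = 2)

# Pearson statistic: sum over all cells of (O - E)^2 / E
chi2 <- sum((observed - expected)^2 / expected)

# Assumed degrees of freedom: cells - 1 (nothing estimated from the data)
df <- length(observed) - 1
p.value <- pchisq(chi2, df = df, lower.tail = FALSE)

# Cross-check against chisq.test() used as a goodness-of-fit test
check <- chisq.test(as.vector(observed),
                    p = as.vector(expected) / sum(expected))
```

Any cell with an expected value of zero makes its term in the sum undefined, so such cells would have to be pooled or dropped before this calculation gives a usable number.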
Re: [R] If you had just one book on R to buy...
On 2/25/07, Julien Barnier wrote:
> Hi,
>
> I am starting a new job as a study analyst for a social science research unit. I would really like to use R as my main tool for data manipulation and analysis. So I'd like to ask you: if you had just one book on R to buy (or to keep), which one would it be?
>
> I have already bought the Handbook of Statistical Analysis Using R, but I'd like to have something more complete, from the point of view of both statistics and R usage.
>
> I thought that Modern Applied Statistics with S-Plus would be a good choice, but maybe some of you have interesting suggestions?

Dear Julien,

I'd definitely go for MASS if you already have the Handbook. MASS is an awesome book, but you did not tell us anything about your background (stats beginners, for instance, sometimes get lost in MASS, because they are not its target audience). Among books at this level, MASS is unique. (There are more specific books for certain topics, such as mixed models, but for wide coverage I'd go with MASS.)

HTH,
R.

> Thanks in advance,
> --
> Julien

--
Ramon Diaz-Uriarte
Statistical Computing Team
Structural Biology and Biocomputing Programme
Spanish National Cancer Centre (CNIO)
http://ligarto.org/rdiaz
Re: [R] Chi Square with two tab-delimited text files
Hi,

The files look like the extract below. The rows and columns are numbers of genetic types, e.g. row 1 is type 4 and column 1 is type A, so the row 1, column 1 cell says there are 78 type 4/type A combinations. I hope this makes sense!

78 500 18 6 0 4 0 1 6 1 1 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 2 1 0 0 0 1 0 0 0 0 23 0 0 0 7 0 0 7 0 0 0 6 0 8 0 0 0 0 0 0 14 0 0 0 0 0 0 0 0 0 5 0 0 0 0 0 0 45 0 0 0 0 0 0 0 0 0 0 0 0 3 0 40 0 0 0 0 0 0 0 0 0 0 0 0 0 12 0 0 0 0 8 4 0 0 0 0 0 0 etc...

-----Original Message-----
From: David Barron
Sent: 26 February 2007 12:12
To: Carina Brehony; r-help
Subject: Re: [R] Chi Square with two tab-delimited text files

> It's a bit difficult to advise without knowing what the rows and columns represent
> [snip]
[R] Rdonlp2 0.2-1 released
Hello R-lists,

I have released a new version of Rdonlp2 (an extension package for solving nonlinear constrained optimization problems). Version 0.2-1 improves stability and usability, and runs a little faster.

Windows and OSX binaries and the source files are available from:
http://arumat.net/Rdonlp2/

Any feedback is very welcome.

Regards,
Ryuichi Tamura
Re: [R] Chi Square with two tab-delimited text files
In that case, you can just ignore the expected values and use the observed values in chisq.test. The reason you got a p-value of 1 before is that the second argument was ignored, so you did a chi-square test on the expected values alone. If you have loaded the observed values into a matrix y using read.table, as in your first example, then just use chisq.test(y). But you should notice that you have a lot of zero cells and so probably lots of small expected values, which is a problem for the chi-square test.

On 26/02/07, Carina Brehony wrote:
> The files look like the extract below. The rows and columns are numbers of genetic types, e.g. row 1 is type 4 and column 1 is type A.
> [snip]

--
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP
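The behaviour described in the reply above can be checked on a small made-up table. The matrices `obs`, `other` and `obs0` are hypothetical and only serve to show that chisq.test() drops its second argument when the first is a matrix, and what an empty row does to the statistic.

```r
# Made-up contingency table and a deliberately unrelated second matrix
obs   <- matrix(c(20, 15, 30, 25, 10, 40), nrow = 2)
other <- matrix(1:6, nrow = 2)

t1 <- chisq.test(obs)          # test of independence on obs
t2 <- chisq.test(obs, other)   # 'other' is ignored because obs is a matrix

same <- isTRUE(all.equal(t1$statistic, t2$statistic))

# A row (or column) of zeros gives expected counts of zero, so the
# corresponding terms (0 - 0)^2 / 0 are undefined
obs0  <- rbind(obs, c(0, 0, 0))
stat0 <- suppressWarnings(chisq.test(obs0)$statistic)
```

Here `same` should be TRUE and `stat0` is not a number. A data frame returned by read.table() is converted to a matrix inside chisq.test(), which is why the read.table and scan versions in the original post behave so differently.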
[R] Plotting R graphics/symbols without user x-y scaling
Is it possible to add lines or other user-defined graphics to a plot in R in a way that does not depend on the user scale of the plot?

For example, I have a plot

  plot(x,y)

and I want to add some graphic that is scaled in inches or cm, but I do not want the graphic to change when the x-y scales are changed, like a thermometer, scale bar or other symbol. How does one do this?

I want to build my own library of glyphs to add to plots, but I do not know how to plot them so that their size is independent of the device/user coordinates.

Is it possible to add to the list of symbols in the function symbols(), beyond 'circles', 'squares', 'rectangles', 'stars', 'thermometers' and 'boxplots'? Can I make my own symbols and have symbols() call these?

Thanks-

--
Jonathan M. Lees
Professor
THE UNIVERSITY OF NORTH CAROLINA AT CHAPEL HILL
Department of Geological Sciences
Campus Box #3315
Chapel Hill, NC 27599-3315
TEL: (919) 962-0695
FAX: (919) 966-4519
http://www.unc.edu/~leesj
Re: [R] Chi Square with two tab-delimited text files
Hi,

Thanks for the input. I have tried the test again using just the Observed values and the read.table() function, and I get this:

  data:  y
  X-squared = NaN, df = 5405, p-value = NA

  Warning message:
  Chi-squared approximation may be incorrect in: chisq.test(y)

So it doesn't seem to like it! I guess the zeroes are a problem for it. Is there another way around this? Do I also need to have the totals of each column and row in the file?

Thanks,
Carina
Re: [R] PlotAffyRNAdeg on Estrogen Data
Hi Tony, This question concerns a Bioconductor package, so it is best asked on the BioC mailing list instead of R-help. Best, Jim

Brooks, Anthony B wrote:
> Hi everyone, I'm trying to generate an RNA degradation plot of the Estrogen example data, but seem to get an error. I've tried defining a ylim value, ylim=c(0,30), but it doesn't seem to work either. My code is as follows:
>
>     RNAdeg <- AffyRNAdeg(Data)
>     png(DegLoc, width=720, height=720)
>     par(ann=FALSE)
>     par(mar=c(3,3,0.1,0.1))
>     plotAffyRNAdeg(RNAdeg, col=cols, cex.axis=1.2)
>     Error in plot.window(xlim, ylim, log, asp, ...) : need finite 'ylim' values
>     dev.off()
>     null device
>               1
>
> Thanks in advance, Tony

-- James W. MacDonald, M.S. Biostatistician Affymetrix and cDNA Microarray Core University of Michigan Cancer Center 1500 E. Medical Center Drive 7410 CCGC Ann Arbor MI 48109 734-647-5623 ** Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues.
[R] Test of Presence Matrix HOWTO?
Hello, Imagine 3 lists like so:

    a <- list("A","B","C","D")
    b <- list("A","B","E","F")
    c <- list("A","C","E","G")

What I need (vennDiagram) is a matrix characterizing with 1 or 0 whether any given member is present or not, like so:

         x1 x2 x3
    [1,]  1  1  1
    [2,]  1  1  0
    [3,]  1  0  1
    [4,]  1  0  0
    [5,]  0  1  1
    [6,]  0  1  0
    [7,]  0  0  1

(where the rows represent A-G and the columns a-c, respectively). table(c(a,b,c)) will give me a quick answer for the 1 1 1 case, but how to deal with the other cases efficiently without looping over each string and looking for membership %in% each list? Thanks for enlightening the learning, Joh
[R] someattributes
I'd like to use someattributes(), as described in the documentation for R version 2.4.1 (Windows build), help(attributes). However, someattributes() does not seem to exist:

    > someattributes()
    Error: could not find function "someattributes"

Is this true or am I doing something wrong? -John
Re: [R] Test of Presence Matrix HOWTO?
    a <- c("A","B","C","D")
    b <- c("A","B","E","F")
    c <- c("A","C","E","G")
    Df <- cbind(a, b, c)

    > apply(Df, 2, function(x)(LETTERS[1:7] %in% x))
             a     b     c
    [1,]  TRUE  TRUE  TRUE
    [2,]  TRUE  TRUE FALSE
    [3,]  TRUE FALSE  TRUE
    [4,]  TRUE FALSE FALSE
    [5,] FALSE  TRUE  TRUE
    [6,] FALSE  TRUE FALSE
    [7,] FALSE FALSE  TRUE

    > apply(Df, 2, function(x)(as.numeric(LETTERS[1:7] %in% x)))
         a b c
    [1,] 1 1 1
    [2,] 1 1 0
    [3,] 1 0 1
    [4,] 1 0 0
    [5,] 0 1 1
    [6,] 0 1 0
    [7,] 0 0 1

Cheers, Thierry

ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 [EMAIL PROTECTED] www.inbo.be Do not put your faith in what statistics say until you have carefully considered what they do not say. ~William W. Watt A statistical analysis, properly conducted, is a delicate dissection of uncertainties, a surgery of suppositions. ~M.J.Moroney

-----Original Message----- From: [EMAIL PROTECTED] [mailto:r-help- [EMAIL PROTECTED] On Behalf Of Johannes Graumann Sent: Monday, 26 February 2007 16:25 To: r-help@stat.math.ethz.ch Subject: [R] Test of Presence Matrix HOWTO?
> Hello, Imagine 3 lists like so:
>
>     a <- list("A","B","C","D")
>     b <- list("A","B","E","F")
>     c <- list("A","C","E","G")
>
> What I need (vennDiagram) is a matrix characterizing with 1 or 0 whether any given member is present or not, like so:
>
>          x1 x2 x3
>     [1,]  1  1  1
>     [2,]  1  1  0
>     [3,]  1  0  1
>     [4,]  1  0  0
>     [5,]  0  1  1
>     [6,]  0  1  0
>     [7,]  0  0  1
>
> (where the rows represent A-G and the columns a-c, respectively). table(c(a,b,c)) will give me a quick answer for the 1 1 1 case, but how to deal with the other cases efficiently without looping over each string and looking for membership %in% each list? Thanks for enlightening the learning, Joh
Re: [R] Test of Presence Matrix HOWTO?
you can use something like the following:

    a <- list("A","B","C","D")
    b <- list("A","B","E","F")
    c <- list("A","C","E","G")
    #
    abc <- list(a, b, c)
    unq.abc <- unique(unlist(abc))
    out.lis <- lapply(abc, "%in%", x = unq.abc)
    out.lis
    lapply(out.lis, as.numeric)

I hope it helps. Best, Dimitris

Dimitris Rizopoulos Ph.D. Student Biostatistical Centre School of Public Health Catholic University of Leuven Address: Kapucijnenvoer 35, Leuven, Belgium Tel: +32/(0)16/336899 Fax: +32/(0)16/337015 Web: http://med.kuleuven.be/biostat/ http://www.student.kuleuven.be/~m0390867/dimitris.htm

----- Original Message ----- From: Johannes Graumann [EMAIL PROTECTED] To: r-help@stat.math.ethz.ch Sent: Monday, February 26, 2007 4:25 PM Subject: [R] Test of Presence Matrix HOWTO?
> Hello, Imagine 3 lists like so:
>
>     a <- list("A","B","C","D")
>     b <- list("A","B","E","F")
>     c <- list("A","C","E","G")
>
> What I need (vennDiagram) is a matrix characterizing with 1 or 0 whether any given member is present or not, like so:
>
>          x1 x2 x3
>     [1,]  1  1  1
>     [2,]  1  1  0
>     [3,]  1  0  1
>     [4,]  1  0  0
>     [5,]  0  1  1
>     [6,]  0  1  0
>     [7,]  0  0  1
>
> (where the rows represent A-G and the columns a-c, respectively). table(c(a,b,c)) will give me a quick answer for the 1 1 1 case, but how to deal with the other cases efficiently without looping over each string and looking for membership %in% each list? Thanks for enlightening the learning, Joh

Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
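The replies in this thread return either a logical matrix or a list of vectors. As a sketch of how the same idea could produce the 0/1 matrix from the original question directly, the following uses sapply() over a named list of sets. The names sets, members and presence are introduced here for illustration only and do not appear in the thread.

```r
a <- list("A", "B", "C", "D")
b <- list("A", "B", "E", "F")
c <- list("A", "C", "E", "G")

# Flatten each list to a character vector and keep the set names.
sets <- list(a = unlist(a), b = unlist(b), c = unlist(c))

# All distinct members across the sets, in sorted order (A to G here).
members <- sort(unique(unlist(sets)))

# One column per set; 1 if the member is in the set, 0 otherwise.
presence <- sapply(sets, function(s) as.integer(members %in% s))
rownames(presence) <- members
presence
```

Because the sets are equal-length outputs, sapply() simplifies to a 7 x 3 matrix, which is the shape the question asks for.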
[R] LD50 contrasts with lmer/lme4
Dear R-list, I have a data set from 20 pigs, each of which is tested at crossed 9 doses (logdose -4:4) and 3 skin treatment substances when exposed to a standard polluted environment. So there are 27 patches on each pig. The response is irritation=yes/no. I want to determine equally effective 50% doses (similar to the old LD50), and to test the treatments against each other. I am looking for something like dose.p in MASS generalized to lmer (or glmmPQL or whatever). The direct contrasts as output by lmer are not useful, because saying 30% irritation with A and 40% with B at dose xx has less meaning than giving equivalent effective doses. Dieter

--- Simulated data ---

    library(lme4)
    animal = data.frame(ID = as.factor(1:20), da = rnorm(1:20))
    treat = data.frame(treat=c('A','B','C'), treatoff=c(1,2,1.5),
                       treatslope = c(0.5,0.6,0.7))
    gr = expand.grid(animal=animal$ID,treat=treat$treat,logdose=c(-4:4))
    gr$resp = as.integer(treat$treatoff[gr$treat]+
                         treat$treatslope[gr$treat]*gr$logdose+
                         animal$da[gr$animal] + rnorm(nrow(gr),0,2) > 0)
    gr.lmer = lmer(resp ~ treat*logdose+(1|animal),data=gr,family=binomial)
    summary(gr.lmer)

--- Output ---

    Fixed effects:
                   Estimate Std. Error z value Pr(>|z|)
    (Intercept)      0.9553     0.3074    3.11   0.0019 **
    treatB           0.8793     0.3313    2.65   0.0079 **
    treatC           0.5516     0.3077    1.79   0.0730 .
    logdose          0.3733     0.0774    4.82  1.4e-06 ***
    treatB:logdose   0.3081     0.1323    2.33   0.0198 *
    treatC:logdose   0.2666     0.1249    2.13   0.0328 *

--- Goal ---

                      Value  SD  p
    50% logdose (A-B)    xx  xx  xx
    50% logdose (A-C)    yy  yy  yy
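As a rough illustration of the quantity being asked for, and not a full answer, the 50% log dose for each treatment is the point where the linear predictor on the logit scale is zero, that is, minus the intercept divided by the slope. The sketch below applies that arithmetic to the fixed-effect estimates posted above. It gives point estimates for an animal with a random effect of zero only; the standard errors and p-values in the goal table would need the covariance matrix of the estimates, which is not shown in the message. The names beta, intercepts, slopes and ed50 are illustrative.

```r
# Fixed-effect estimates copied from the summary output posted above.
beta <- c("(Intercept)"    = 0.9553,
          "treatB"         = 0.8793,
          "treatC"         = 0.5516,
          "logdose"        = 0.3733,
          "treatB:logdose" = 0.3081,
          "treatC:logdose" = 0.2666)

# Treatment-specific intercepts and slopes under treatment contrasts
# (A is the reference level).
intercepts <- c(A = beta[["(Intercept)"]],
                B = beta[["(Intercept)"]] + beta[["treatB"]],
                C = beta[["(Intercept)"]] + beta[["treatC"]])
slopes <- c(A = beta[["logdose"]],
            B = beta[["logdose"]] + beta[["treatB:logdose"]],
            C = beta[["logdose"]] + beta[["treatC:logdose"]])

# Log dose at which the predicted probability of irritation is 50%.
ed50 <- -intercepts / slopes
ed50

# Differences in 50% log dose between A and the other treatments.
ed50[["A"]] - ed50[c("B", "C")]
```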
Re: [R] someattributes
On Feb 26, 2007, at 10:35 AM, Thaden, John J wrote:
> I'd like to use someattributes(), as described in the documentation for R version 2.4.1 (Windows build), help(attributes). However, someattributes() does not seem to exist:
>
>     > someattributes()
>     Error: could not find function "someattributes"
>
> Is this true or am I doing something wrong?

My help shows it as moreattributes, not someattributes. (MacOSX, though doesn't sound like it should be platform-specific).

> -John

Haris Skiadas Department of Mathematics and Computer Science Hanover College
Re: [R] Double-banger function names: preferences and suggestions
Thanks to everyone who contributed. There definitely isn't a consensus, but perhaps a slight preference towards number 3 (camelCase). I'm not sure yet if this beats out my personal preference for number 2. Hadley

On 2/25/07, hadley wickham [EMAIL PROTECTED] wrote:
> What do you prefer/recommend for double-banger function names:
>
>     1 scale.colour
>     2 scale_colour
>     3 scaleColour
>
> 1 is more R-like, but conflicts with S3. 2 is a modern version of number 1, but not many packages use it. Number 3 is more java-like. (I like number 2 best) Any suggestions? Thanks, Hadley
[R] Barplot Graph
Hello everyone: I want to draw a bar plot, but I don't know how to choose different patterns or colors for different bars so that they can be distinguished in a black and white print. I want some kind of pattern on the bars, such as '///' or '\\\'. Suppose I have the following data in the R.dat file:

    1.29 22.43 1.92
    5.08 18.70 0.00
    2.19 33.69 1.92
    2.95 20.39 0.00
    3.29 36.16 4.48

and I am using the following code:

    MP <- read.table(file='R.dat')
    names(MP) <- c('BL','LR','Q')
    cols <-   # I want white when the 'Q' column is zero, one kind of pattern
              # when 'Q' is 1.92, and another pattern when 'Q' is 4.48
    Graph <- barplot(MP$LR, col=cols, width=(MP$BL))

Thanks
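One possible approach, sketched here under the assumption that hatching is acceptable as the "pattern": barplot() accepts density and angle arguments that control shading lines per bar. The data frame below retypes the values from the message instead of reading R.dat, and the names q_levels, idx, dens and ang are illustrative. A density of 0 leaves the bar unshaded and unfilled, which prints as white on a white background.

```r
# The five rows from the message, typed in directly instead of read from R.dat.
MP <- data.frame(BL = c(1.29, 5.08, 2.19, 2.95, 3.29),
                 LR = c(22.43, 18.70, 33.69, 20.39, 36.16),
                 Q  = c(1.92, 0.00, 1.92, 0.00, 4.48))

# Map each distinct Q value (0, 1.92, 4.48) to a shading style.
q_levels <- sort(unique(MP$Q))
idx <- match(MP$Q, q_levels)

dens <- c(0, 20, 20)[idx]    # shading lines per inch; 0 means no shading
ang  <- c(0, 45, 135)[idx]   # 45 gives '///', 135 gives '\\\'

Graph <- barplot(MP$LR, width = MP$BL, density = dens, angle = ang,
                 col = "black")
```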
[R] Adding duplicates by rows
Hi, I am trying to add up duplicates of matrix mat by row. The commands subset(mat,duplicated(rownames(mat))) or mat[which(duplicated(rownames(mat))),] return only half of the required indices. How can I find the remaining ones, i.e. the matches, so that I can add them up? Thanks, Serguei ___ Austrian Institute of Economic Research (WIFO) Name: Serguei Kaniovski P.O.Box 91 Tel.: +43-1-7982601-231 Arsenal Objekt 20 Fax: +43-1-7989386 1103 Vienna, Austria Mail: [EMAIL PROTECTED] A-1030 Wien http://www.wifo.ac.at/Serguei.Kaniovski
[R] survival analysis using rpart
Hello, I use rpart to predict survival time and have a problem in interpreting the output of “estimated rate”. Here is an example of what I do:

    stagec <- read.table("http://www.stanford.edu/class/stats202/DATA/stagec.data",
                         col.names=c("pgtime", "pgstat", "age", "eet", "g2",
                                     "grade", "gleason", "ploidy"))
    fit <- rpart(Surv(pgtime, pgstat) ~ age + eet + g2 + grade + gleason + ploidy,
                 data=stagec)

Result:

     1) root 146 195.411600 1.000
       2) grade< 2.5 61 45.021520 0.3624701
         4) g2< 11.36 33 9.120116 0.1225562 *
         5) g2>=11.36 28 27.804100 0.7335298
          10) gleason< 5.5 20 14.376900 0.5292190 *
          11) gleason>=5.5 8 11.201470 1.3083680 *
       3) grade>=2.5 85 125.327400 1.6190620
         6) age>=56.5 75 104.154700 1.4287310
          12) gleason< 7.5 50 66.701410 1.1431320 *
          13) gleason>=7.5 25 33.993130 2.0355220
            26) g2>=15.29 13 16.555970 1.3494740 *
            27) g2< 15.29 12 14.220260 2.9210480 *
         7) age< 56.5 10 15.522810 3.1977430 *

Let’s look at the terminal node 4:

     #    PGTIME PGSTAGE AGE EET    G2 GRADE GLEASON PLOIDY
     1  8.657084       0  70   1  4.43     1       3      1
     2 16.70088        0  56   2  5.29     1       3      1
     3  3.162217       1  62   2  3.57     2       4      1
     4 10.20123        0  63   2  5.14     2       5      1
     5  4.479124       0  63   2  5.75     2       5      1
     6  6.516084       0  66   2  5.92     2       5      1
     7  4.936345       0  67   2  6.41     2       5      1
     8 10.79808        0  72   1  6.68     2      NA      1
     9  9.174537       0  62   1  6.74     2       5      1
    10 10.87474        0  72   2  6.8      2       5      1
    11  7.028062       0  52   2  7.15     2       7      1
    12 11.36481        0  59   2  7.61     2       5      1
    13 10.17659        0  64   1  7.61     2      NA      1
    14  6.96783        0  67   2  7.78     2       6      1
    15 10.61738        0  55   2  7.81     2       5      1
    16  6.510609       0  70   1  7.88     2       6      1
    17 10.36276        0  55   2  8.1      2       5      1
    18  6.694045       0  54   2  8.11     2       4      1
    19 11.718          0  61   2  8.4      2       5      1
    20  7.301847       0  69   2  8.46     2       5      1
    21  6.067077       0  69   2  8.58     2       6      1
    22  8.353182       0  59   2  8.76     2       6      1
    23  5.541409       0  59   1  9.01     2       5      1
    24  5.492128       0  61   2  9.42     2       5      1
    25  7.208761       0  63   1  9.76     2       5      1
    26  6.004106       0  52   2  9.9      2       4      1
    27  5.664613       0  71   1 10.16     2       6      1
    28  6.130047       0  64   2 10.26     2       4      1
    29  9.812457       0  64   1 10.51     2       5      1
    30  6.275154       0  62   2 10.82     2       6      1
    31  9.253935       0  61   2 11.23     2       5      1
    32  5.201916       0  54   2 11.35     2       6      1
    33  6.22861        0  65   2 11.35     2       5      1

Here we have 33 observations and 1 event. The “estimated rate” is 0.1225562.

My questions are:
(1) Is the “estimated rate” the estimated hazard rate ratio?
(2) How does rpart calculate this rate?
(3) Suppose I use xpred.rpart(fit, xval=10) to perform 10-fold cross-validation using (a) the complete stagec data set and (b) only a subset of it, say, using the columns Age, EET, and G2 only. For the i-th patient, I am likely to obtain a different estimated rate. How can I meaningfully compare both rates? How can I say which one is “better”?

Thanks a lot for all comments! Walter
[R] Partial whitening of time series?
I have a time series with a one year lag, ar=0.5. The series has some interesting events that disappear when the series is whitened (i.e., fitting an AR process and looking at the residuals). I'd like to remove the autocorrelation in stages to see the effect on the time series. Is there a way to specify the autocorrelation term while fitting an AR process? For instance, given the following:

    x <- arima.sim(model = list(order = c(1,0,0), ar = 0.5), n = 500, sd=0.25)

Can I filter x in a way that the autocorrelation at lag one is 0.4, then 0.3, 0.2, 0.1, until I get to a clean series equivalent to:

    y <- arima(x, order = c(1,0,0))$resid

Thanks in advance, Andy
Re: [R] Adding duplicates by rows
Serguei Kaniovski wrote:
> Hi, I am trying to add duplicates of matrix mat by row. Commands subset(mat,duplicated(rownames(mat))) or mat[which(duplicated(rownames(mat))),] return only half of the required indices. How can I find the remaining ones, ie the matches, so that I can add them up?

    mat <- matrix(runif(70), ncol=5)
    rownames(mat) <- c("Z", rep(LETTERS[1:6], each=2), "G")

There is probably a more elegant way, but this seems to do what you want:

    mat[rownames(mat) %in% names(which(table(rownames(mat)) > 1)),]

Also, have you considered aggregate()?

    aggregate(mat, list(ROW = rownames(mat)), sum)

> Thanks, Serguei ___ Austrian Institute of Economic Research (WIFO) Name: Serguei Kaniovski P.O.Box 91 Tel.: +43-1-7982601-231 Arsenal Objekt 20 Fax: +43-1-7989386 1103 Vienna, Austria Mail: [EMAIL PROTECTED] A-1030 Wien http://www.wifo.ac.at/Serguei.Kaniovski

-- Chuck Cleland, Ph.D. NDRI, Inc. 71 West 23rd Street, 8th floor New York, NY 10010 tel: (212) 845-4495 (Tu, Th) tel: (732) 512-0171 (M, W, F) fax: (917) 438-0894
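As an alternative to the two suggestions in this reply (not one made in the thread), base R's rowsum() adds up the rows of a matrix that share a group label, which may be the shortest route when the goal is only the sums. The sketch below reuses the example matrix from the reply; set.seed() and the name summed are additions for illustration.

```r
set.seed(1)  # only so the example is reproducible
mat <- matrix(runif(70), ncol = 5)
rownames(mat) <- c("Z", rep(LETTERS[1:6], each = 2), "G")

# Sum rows that share a row name; unique names (Z, G) pass through unchanged.
# The result has one row per distinct name, sorted by name.
summed <- rowsum(mat, group = rownames(mat))
summed
```

Unlike aggregate(), which returns a data frame, rowsum() keeps the result as a matrix with the group labels as row names.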
Re: [R] Partial whitening of time series?
andy, if your model is

    X[t] = 0.5 * X[t-1] + e

then it should have

    X[t] = 0.1 * X[t-1] + 0.4 * X[t-1] + e
    (X[t] - 0.1 * X[t-1]) = 0.4 * X[t-1] + e

so what you need to do is to subtract part of the lag from your series. it is just my $0.02.

On 2/26/07, Andy Bunn [EMAIL PROTECTED] wrote:
> I have a time series with a one year lag, ar=0.5. The series has some interesting events that disappear when the series is whitened (i.e., fitting an AR process and looking at the residuals). I'd like to remove the autocorrelation in stages to see the effect on the time series. Is there a way to specify the autocorrelation term while fitting an AR process? For instance, given the following:
>
>     x <- arima.sim(model = list(order = c(1,0,0), ar = 0.5), n = 500, sd=0.25)
>
> Can I filter x in a way that the autocorrelation at lag one is 0.4, then 0.3, 0.2, 0.1, until I get to a clean series equivalent to:
>
>     y <- arima(x, order = c(1,0,0))$resid
>
> Thanks in advance, Andy

-- WenSui Liu A lousy statistician who happens to know a little programming (http://spaces.msn.com/statcompute/blog)
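The suggestion above can be sketched as follows. partial_whiten() is a hypothetical helper written for this illustration; it subtracts a chosen fraction of the lagged series. One caveat worth stating: the filtered series is not exactly an AR(1) process with the reduced coefficient (it has a moving-average component as well), so the lag-one autocorrelations it produces are close to, but not exactly, 0.4, 0.3 and so on.

```r
set.seed(1)  # only so the example is reproducible
x <- arima.sim(model = list(order = c(1, 0, 0), ar = 0.5), n = 500, sd = 0.25)

# Hypothetical helper: subtract `frac` times the previous value from each value.
# The first observation has no predecessor, so the result is one element shorter.
partial_whiten <- function(x, frac) {
  x <- as.numeric(x)
  x[-1] - frac * x[-length(x)]
}

# Lag-one sample autocorrelation after removing 0, 0.1, ..., 0.5 of the lag.
fracs <- seq(0, 0.5, by = 0.1)
r1 <- sapply(fracs, function(f) {
  acf(partial_whiten(x, f), lag.max = 1, plot = FALSE)$acf[2]
})
round(data.frame(frac = fracs, lag1_acf = r1), 3)
```

With frac equal to the fitted AR coefficient, the output corresponds (apart from the mean and the first observation) to the residuals of the AR(1) fit.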
[R] eigenvalue ordering
Hi all, Is it possible to get unordered eigenvalues and eigenvectors of a symmetric matrix in R? Any help appreciated. Regards, Kaustubh
Re: [R] eigenvalue ordering
Kaustubh Patil wrote:
> Is it possible to get unordered eigenvalues and eigenvectors of a symmetric matrix in R?

Yes, see help(eigen). If you are strict about the unordered part, do a sample(set, size) to randomize the eigenvalues. Alberto Monteiro
Re: [R] eigenvalue ordering
Alberto Monteiro wrote:
> Kaustubh Patil wrote:
>> Is it possible to get unordered eigenvalues and eigenvectors of a symmetric matrix in R?
> Yes, see help(eigen).

Er, where do you see anything about (un)order? As far as I know, there's no natural ordering of eigenvalues, and eigenvalue algorithms generally find them in either increasing or decreasing order (or closest to a specified value).

> If you are strict about the unordered part, do a sample(set, size) to randomize the eigenvalues.
Re: [R] Macros in R
If I understand the question correctly, I would do this:

    for (i in 1:54) assign( paste('input',i,sep='') , matrix( dataset$variable, nrow=1) )

You now have 54 matrices, named input1, input2, ... input54, each having 1 row and as many columns as dataset$variable is long. (Also, they're identical, since all are created from the same object, dataset$variable.) See, of course, the help page for assign() to see why this works. However, I do wonder, in the bigger picture of what you're trying to do, whether there isn't a better way. For example, why matrices, since they all have only one row? -Don

At 5:02 PM +0100 2/25/07, Monika Kerekes wrote:
> Dear members, I have started to work with R recently and there is one thing which I could not solve so far. I don't know how to define macros in R. The problem at hand is the following: I want R to go through a list of 1:54 and create the matrices input1, input2, input3 up to input54. I have tried the following:
>
>     for ( i in 1:54) {
>       input[i] = matrix(nrow = 1, ncol = 107)
>       input[i][1,]=datset$variable
>     }
>
> However, R never creates the required matrices. I have also tried to type input'i' and input$i, none of which worked. I would be very grateful for help as this is a basic question the answer of which is paramount to any further usage of the software. Thank you very much Monika

-- Don MacQueen Environmental Protection Department Lawrence Livermore National Laboratory Livermore, CA, USA
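On the "better way" raised in this reply: one common alternative, sketched here as a suggestion and not something proposed in the thread, is to keep the 54 matrices in a single named list instead of 54 separate variables. The data frame dataset below is a stand-in with 107 made-up values, since the real data is not shown, and the name inputs is illustrative.

```r
# Stand-in for the poster's data: a variable of length 107.
dataset <- data.frame(variable = seq_len(107))

# Build all 54 one-row matrices in one list, then name the elements.
inputs <- lapply(1:54, function(i) matrix(dataset$variable, nrow = 1))
names(inputs) <- paste0("input", 1:54)

# Elements can be reached by name or by position, including inside a loop.
dim(inputs$input3)
dim(inputs[[54]])
```

A list keeps the objects together, so later steps can loop over inputs with lapply() or a for loop without building variable names with paste().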
Re: [R] eigenvalue ordering
Peter Dalgaard wrote:
>>> Is it possible to get unordered eigenvalues and eigenvectors of a symmetric matrix in R?
>> Yes, see help(eigen).
> Er, where do you see anything about (un)order? As far as I know, there's no natural ordering of eigenvalues and eigenvalue algorithms generally find them in either increasing or decreasing order (or closest to specified value).

eigen orders the values. From help(eigen):

    values: a vector containing the p eigenvalues of 'x', sorted in
            _decreasing_ order, according to 'Mod(values)' in the asymmetric
            case when they might be complex (even for real matrices). For
            real asymmetric matrices the vector will be complex only if
            complex conjugate pairs of eigenvalues are detected.

So, if you are strict about getting unordered eigenvalues, you must shuffle them :-) Alberto Monteiro
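A small sketch of the shuffling idea discussed in this thread, using a made-up symmetric matrix S. The point it adds is that if the eigenvalues are permuted, the columns of the eigenvector matrix need the same permutation, otherwise values and vectors no longer correspond.

```r
# A small symmetric matrix, made up for illustration.
S <- matrix(c(2, 1, 0,
              1, 3, 1,
              0, 1, 4), nrow = 3)

e <- eigen(S)
e$values  # returned in decreasing order

# Shuffle the eigenvalues and apply the same permutation to the eigenvectors.
set.seed(1)
perm <- sample(length(e$values))
vals <- e$values[perm]
vecs <- e$vectors[, perm]

# Each column of vecs is still an eigenvector for the matching entry of vals.
all.equal(S %*% vecs, vecs %*% diag(vals))
```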
[R] 2 data frames - list in one output, matrix in another??
I have two more or less parallel data frames that are giving me different results on one subset of variables. I know that I assembled the 2 data frames slightly differently, but I don't see why I am getting this result just because one set of variables is labelled and the other is not. Variable names are the same, etc., as far as I can ascertain. The only difference seems to be that the bdata variables are labelled. About now I really don't care which I get, but I would like them to be the same. Can anyone suggest what I am doing wrong or should be looking at? Windows XP, R 2.4.1. Using Hmisc and gtools as well as the basic R installation.

Problem:

    load("adata")
    fn1 <- function(x) {table(x)}
    jj <- apply(adata[,110:127], 2, fn1)

OUTPUT: jj is a list of 18 tables. Examine a variable:

    > typeof(adata$act.toy)
    [1] "integer"
    > class(adata$act.toy)
    [1] "integer"

    load("bdata")
    fn1 <- function(x) {table(x)}
    kk <- apply(bdata[,94:111], 2, fn1)

OUTPUT: kk is a 2 x 18 matrix.

    > class(bdata$act.toy)
    [1] "labelled"
    > typeof(bdata$act.toy)
    [1] "integer"
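One possible explanation, not confirmed anywhere in the thread, is unrelated to the labels: apply() simplifies its result to a matrix only when every call returns a vector of the same length, and otherwise returns a list. If every bdata column has the same number of distinct values while some adata columns have more or fewer, that alone would give a matrix in one case and a list in the other. The two small data frames below are made up to show the behaviour.

```r
# Every column has exactly two distinct values: results are equal length.
same <- data.frame(u = c(0, 1, 0, 1), v = c(1, 1, 0, 0))

# Column v has three distinct values, column u has two: lengths differ.
mixed <- data.frame(u = c(0, 1, 0, 1), v = c(0, 1, 2, 1))

kk <- apply(same, 2, table)   # simplified to a 2 x 2 matrix
jj <- apply(mixed, 2, table)  # cannot be simplified, so a list

class(kk)
class(jj)

# lapply() never simplifies, so it returns a list in both cases.
class(lapply(same, table))
```

If a consistent shape is wanted, lapply() over the columns returns a list of tables regardless of how many distinct values each column has.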
Re: [R] someattributes
I had written:
> ...someattributes() does not seem to exist.

And Haris Skiadas replied:
> My help shows it as moreattributes, not someattributes. (MacOSX, though doesn't sound like it should be platform-specific).

Thanks for correcting me. Actually, my Windows R documentation says mostattributes(), but it makes no difference: none of the three show up as function names or R objects. -John
[R] returns from dnorm and dmvnorm
Hi All, Why would calls to dnorm and dmvnorm return values that are above 1? For example,

    > dnorm(0.3,mean=0, sd=0.1)
    [1] 3.989423

This is happening on two different installations of R that I have. Thank you. Hailu
[R] hierarchical clustering using cutree
Hi Everyone, I am doing hierarchical clustering analysis and have a question regarding cutree. I am doing things like this:

    hc <- hclust(dist(X))
    a <- cutree(hc, k=2)

Basically, a is a vector containing the assignments of 1 or 2 for each sample. May I know how cutree decides to assign 1 and 2 to each sample (in other words, how clusters 1 and 2 are decided)? I have the feeling that the first sample will always be assigned to Cluster 1, but I am not sure about this. Thank you! Best, Jun
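The thread contains no reply to this question. As an empirical check only, and not a statement of a documented guarantee, the sketch below clusters some made-up data and then the same data with the rows reversed. In both runs the first observation lands in cluster 1, which is consistent with cluster numbers being handed out in order of first appearance in the data. The names X_rev and a_rev are illustrative.

```r
set.seed(42)
# Two well-separated groups of 10 points each (made-up data).
X <- rbind(matrix(rnorm(20, mean = 5), ncol = 2),
           matrix(rnorm(20, mean = 0), ncol = 2))

hc <- hclust(dist(X))
a <- cutree(hc, k = 2)
a[1]        # cluster of the first observation

# Reverse the row order, so a point from the other group comes first.
X_rev <- X[nrow(X):1, ]
a_rev <- cutree(hclust(dist(X_rev)), k = 2)
a_rev[1]    # cluster of the (new) first observation

# Labels appear in the order 1, 2 when scanning the observations.
unique(a)
unique(a_rev)
```

The practical consequence is that cluster numbers are labels tied to row order, so the same group of points can be called 1 in one run and 2 in another if the rows are rearranged.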
Re: [R] returns from dnorm and dmvnorm
well, nobody said that the density must be smaller than 1, right? :-) it's just the value of the normal density function at the point you asked. you may try doing that by hand and, with the correct math, you'll get the same thing. b

On Feb 26, 2007, at 3:03 PM, A Hailu wrote:
> Hi All, Why would calls to dnorm and dmvnorm return values that are above 1? For example,
>
>     > dnorm(0.3,mean=0, sd=0.1)
>     [1] 3.989423
>
> This is happening on two different installations of R that I have. Thank you. Hailu
Re: [R] someattributes
On Feb 26, 2007, at 3:00 PM, Thaden, John J wrote:
> Thanks for correcting me. Actually, my windows R documentation says mostattributes(), but it makes no difference -- none of the three show up as function names or R objects.

That's because there is no mostattributes function; it only works as an assignment:

    ?"mostattributes<-"

Example:

    > x <- c(2,3,4)
    > mostattributes(x) <- list(foo="bar")
    > x
    [1] 2 3 4
    attr(,"foo")
    [1] "bar"

> -John

Haris Skiadas Department of Mathematics and Computer Science Hanover College
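To expand slightly on the reply above: the assignment form works because R defines a function whose name is literally "mostattributes<-", and an assignment of the form f(x) <- value is evaluated as a call to that replacement function. The sketch below shows the two equivalent spellings and checks which names exist; the variable y is added for illustration.

```r
x <- c(2, 3, 4)
mostattributes(x) <- list(foo = "bar")   # assignment form

# The same operation written as an explicit call to the replacement function.
y <- c(2, 3, 4)
y <- `mostattributes<-`(y, list(foo = "bar"))

identical(x, y)

# The replacement function exists; a plain function called mostattributes does not.
exists("mostattributes<-")
exists("mostattributes")
```

This is why typing mostattributes() at the prompt fails while the help page is still reachable with the quoted name.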
Re: [R] someattributes
On 2/26/07, Thaden, John J [EMAIL PROTECTED] wrote:
> I had written: ...someattributes() does not seem to exist. And Haris Skiadas replied: My help shows it as moreattributes, not someattributes. (MacOSX, though doesn't sound like it should be platform-specific). Thanks for correcting me. Actually, my windows R documentation says mostattributes(), but it makes no difference -- none of the three show up as function names or R objects.

Try:

    ?"mostattributes<-"
Re: [R] returns from dnorm and dmvnorm
On Feb 26, 2007, at 3:03 PM, A Hailu wrote:
> Hi All, Why would calls to dnorm and dmvnorm return values that are above 1? For example,
>
>     > dnorm(0.3,mean=0, sd=0.1)
>     [1] 3.989423

Because dnorm gives you the density function, whose integral is the distribution function, which is likely what you want. Try:

    pnorm(0.3,mean=0, sd=0.1)

> This is happening on two different installations of R that I have. Thank you. Hailu

Haris Skiadas Department of Mathematics and Computer Science Hanover College
Re: [R] returns from dnorm and dmvnorm
I guarantee that it would also happen on all future versions of R. Why would you expect the density to be smaller than 1? The only constraints on a density are that (a) it is non-negative and (b) it integrates to one. The smaller the variance, the greater the density is around its center. The density can be made arbitrarily large by letting the variance get close to zero, and in the limit you will obtain Dirac's delta function. Ravi.

--- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of A Hailu Sent: Monday, February 26, 2007 3:04 PM To: r-help@stat.math.ethz.ch Subject: [R] returns from dnorm and dmvnorm
> Hi All, Why would calls to dnorm and dmvnorm return values that are above 1? For example,
>
>     > dnorm(0.3,mean=0, sd=0.1)
>     [1] 3.989423
>
> This is happening on two different installations of R that I have. Thank you. Hailu
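The points made in this thread can be checked by hand with a short sketch. The normal density at its mean is 1/(sd * sqrt(2 * pi)), which for sd = 0.1 is about 3.989423, the value quoted in the question. Note that this is the density at the mean; the call as it appears in the archived message, dnorm(0.3, mean=0, sd=0.1), evaluates to roughly 0.044, so the arguments in the archived copy may not be exactly what was run. The names s, peak and area are illustrative.

```r
s <- 0.1

# Height of the normal density at its mean, computed by hand.
peak <- 1 / (s * sqrt(2 * pi))
peak
dnorm(0, mean = 0, sd = s)       # same value from dnorm()

# The density exceeds 1 near the mean, but the area under the curve is still 1.
# Integrating over +/- 10 standard deviations captures essentially all of it.
area <- integrate(dnorm, lower = -1, upper = 1, mean = 0, sd = s)$value
area

# Probabilities come from the distribution function and never exceed 1.
pnorm(0.3, mean = 0, sd = s)

# The call exactly as written in the archived message.
dnorm(0.3, mean = 0, sd = s)
```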
Re: [R] someattributes (actually, mostattributes)
Haris Skiadas replied:
>> Thanks for correcting me. Actually, my windows R documentation says mostattributes(), but it makes no difference -- none of the three show up as function names or R objects.
> That's because there is no mostattributes function, it only works as an assignment: ?"mostattributes<-"

Thanks. Obviously I need to learn about assignments that are not R objects. -John
Re: [R] returns from dnorm and dmvnorm
Yes, you are right. Thanks. On 2/27/07, Benilton Carvalho [EMAIL PROTECTED] wrote: well, nobody said that the density must be smaller than 1, right? :-) it's just the value of the normal density function at the point you asked. you may try doing that by hand and, with the correct math, you'll get the same thing. b On Feb 26, 2007, at 3:03 PM, A Hailu wrote: Hi All, Why would calls to dnorm and dmvnorm return values that are above 1? For example, dnorm(0.3,mean=0, sd=0.1) [1] 3.989423 This is happening on two different installations of R that I have. Thank you. Hailu [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Partial whitening of time series?
Thanks, I wasn't thinking real clearly when I pressed 'send'. All figured out now. -A

-----Original Message-----
From: Wensui Liu [mailto:[EMAIL PROTECTED]
Sent: Monday, February 26, 2007 10:15 AM
To: Andy Bunn
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Partial whitening of time series?

andy, if your model is Xt = 0.5 * Xt-1 + e, then it should have
  Xt = 0.1 * Xt-1 + 0.4 * Xt-1 + e
  (Xt - 0.1 * Xt-1) = 0.4 * Xt-1 + e
so what you need to do is to subtract part of the lag from your series. it is just my $0.02.

On 2/26/07, Andy Bunn [EMAIL PROTECTED] wrote:
I have a time series with a one year lag, ar=0.5. The series has some interesting events that disappear when the series is whitened (i.e., fitting an AR process and looking at the residuals). I'd like to remove the autocorrelation in stages to see the effect on the time series. Is there a way to specify the autocorrelation term while fitting an AR process? For instance, given the following:
  x <- arima.sim(model = list(order = c(1,0,0), ar = 0.5), n = 500, sd=0.25)
Can I filter x in a way that the autocorrelation at lag one is 0.4, then 0.3, 0.2, 0.1, until I get to a clean series equivalent to:
  y <- arima(x, order = c(1,0,0))$resid
Thanks in advance, Andy

-- WenSui Liu A lousy statistician who happens to know a little programming (http://spaces.msn.com/statcompute/blog)
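The suggestion in this thread (subtract part of the lag-one term) can be sketched with stats::filter. This is only an illustration under stated assumptions: the helper name partial_whiten and the grid of fractions are made up here, and the intermediate series are not exactly AR(1), so the printed values are sample lag-one autocorrelations rather than fitted AR coefficients.

```r
set.seed(1)
x <- arima.sim(model = list(order = c(1, 0, 0), ar = 0.5), n = 500, sd = 0.25)

# Hypothetical helper: y_t = x_t - a * x_{t-1}
partial_whiten <- function(x, a) {
  stats::filter(x, filter = c(1, -a), method = "convolution", sides = 1)
}

# Remove a growing share of the lag-one dependence
steps <- c(0.1, 0.2, 0.3, 0.4, 0.5)
lag1 <- sapply(steps, function(a) {
  y <- na.omit(partial_whiten(x, a))  # the first value is NA (no predecessor)
  acf(y, plot = FALSE)$acf[2]         # sample autocorrelation at lag one
})
names(lag1) <- paste("a =", steps)
print(round(lag1, 2))
```

With a = 0.5 the result is close to the residual series from arima(x, order = c(1,0,0)), up to the difference between 0.5 and the estimated coefficient.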
Re: [R] returns from dnorm and dmvnorm
Thanks everyone. I should have thought of dnorm as a straight return from the normal density formula. Hailu

On 2/27/07, Charilaos Skiadas [EMAIL PROTECTED] wrote:
On Feb 26, 2007, at 3:03 PM, A Hailu wrote:
Hi All, Why would calls to dnorm and dmvnorm return values that are above 1? For example,
  dnorm(0.3,mean=0, sd=0.1)
  [1] 3.989423

Because dnorm gives you the density function, whose integral is the distribution function, which is likely what you want. Try:
  pnorm(0.3,mean=0, sd=0.1)

This is happening on two different installations of R that I have. Thank you. Hailu

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
Re: [R] Double-banger function names: preferences and suggestions
The underscore versus left-arrow conundrum has its roots in the evolution of ASCII during the middle of the last century. Some Teletype machines in the 1970s (when the S language was being developed) still had a left arrow, and its ASCII code was used in S as a one keystroke convenience for the assignment operator. The left arrow symbol was then removed from most keyboards/printers/fontsets and replaced by the underscore. Thus the underscore remained as a one keystroke assignment operator.

See e.g. http://www.wps.com/projects/codes/index.html#GRPH

LEFT-ARROW, ←
UNDERSCORE, _
One of the graphical codes, left-arrow mutated to the underscore of ASCII-1967. It may have had earlier, or other, meanings, but for some early programming languages it was assignment, eg.
  c ← b + a
C is assigned the sum of B and A.

Steven McKinney
Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre
email: [EMAIL PROTECTED] tel: 604-675-8000 x7561
BCCRC Molecular Oncology, 675 West 10th Ave, Floor 4, Vancouver B.C. V5Z 1L3 Canada

-----Original Message-----
From: [EMAIL PROTECTED] on behalf of Marc Schwartz
Sent: Sun 2/25/2007 8:28 AM
To: Alberto Vieira Ferreira Monteiro
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Double-banger function names: preferences and suggestions

On Sun, 2007-02-25 at 15:56 +, Alberto Vieira Ferreira Monteiro wrote:
hadley wickham wrote:
What do you prefer/recommend for double-banger function names:
  1 scale.colour
  2 scale_colour
  3 scaleColour
1 is more R-like, but conflicts with S3. 2 is a modern version of number 1, but not many packages use it. Number 3 is more java-like. (I like number 2 best) Any suggestions?

I always prefer 2, but this would make it non-portable to S-Plus. S-Plus has a bug, where _ is the equivalent to <- (why would they do this? I prefer to think it's stupidity and not villainy)

That's not a bug. If you search the archives of both the S-PLUS list and the R lists, you will see highly energized discussion on the use of the underscore operator. In R, the use of '_' was allowed for assignment up until version 1.8.0 when:

  DEPRECATED & DEFUNCT
  o The assignment operator `_' has been removed.

and subsequently allowed in names in version 1.9.0 when:

  o Underscore '_' is now allowed in syntactically valid names, and make.names() no longer changes underscores. Very old code that makes use of underscore for assignment may now give confusing error messages.

Not to further contribute to the dialog on 'style', but to further contribute ;-), for those who have coded in the Windows environment (ie. C, VBA, etc.) the extension of sorts to number 3 is of course Hungarian Notation, named after Charles Simonyi, originally at Xerox PARC and later senior developer/architect at MS. The extension was the inclusion of the data type prefix, such as fnScaleColour to indicate that this was a function, with the name using caps to make words more distinct. And no, I'm not advocating that use...I have been guilty myself of using variants of 1 and 3, perhaps driven by my circulating caffeine levels as much as anything else.

HTH, Marc Schwartz
Off to go remove 12 inches of snow from the driveway and sidewalk...oy
[R] training svm
Hello. I'm new to R and I'm trying to solve a classification problem. I have a training dataset of about 40,000 rows and 50 columns. When I try to train a support vector machine, it gives me this error after a few seconds:
  Error in predict.svm(ret, xhold) : Model is empty!
This is the code I use:
  ne_span_data <- as.matrix(read.table('ne_span.data.R.txt', header=TRUE, row.names='id'))
  library('e1071')
  svm_ne_span_model <- svm(NE_type ~ . , ne_span_data)
A line from the ne_span.data.R.txt file:
  svt OTHER N N I S 2 NA NA NA NA NA A NA NA 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 train-s1m2
Any idea what's wrong here?
[R] match() function with a little enhancement
Dear R users, I was wondering if R has a built-in function doing the following:

  my_match(values_vector, lookup_vector) {
    for each value of values_vector:
      if value %in% lookup_vector, then value is unchanged
      else, value is changed to the closest element of lookup_vector,
        closest meaning the one that would come just after if we sorted them using order()
  }

For example:
  values <- c("Kiwis", "Bananas", "Ananas", "Cherries", "Peer")
  vector <- c("Oranges", "Bananas", "Apples", "Cherries", "Lemons")
  my_match(values, vector)
should return:
  c("Lemons", "Bananas", "Apples", "Cherries", NA)

I currently use a home-made function for this, but it is quite slow on large sets, mostly because I did not manage to avoid using a loop. Many thanks for your ideas, Nicolas
Re: [R] match() function with a little enhancement
try this:

  values <- c("Kiwis", "Bananas", "Ananas", "Cherries", "Peer")
  vector <- c("Oranges", "Bananas", "Apples", "Cherries", "Lemons")
  vector <- sort(vector)
  vector
  [1] "Apples"   "Bananas"  "Cherries" "Lemons"   "Oranges"
  x <- sapply(values, function(x) ifelse(x <= vector, -1, 1))
  x
       Kiwis Bananas Ananas Cherries Peer
  [1,]     1       1     -1        1    1
  [2,]     1      -1     -1        1    1
  [3,]     1      -1     -1       -1    1
  [4,]    -1      -1     -1       -1    1
  [5,]    -1      -1     -1       -1    1
  vector[apply(x, 2, function(z) which(z < 0)[1])]
  [1] "Lemons"   "Bananas"  "Apples"   "Cherries" NA

-- Jim Holtman Cincinnati, OH +1 513 646 9390
What is the problem you are trying to solve?
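For reference, the same lookup can be written without sapply/apply by counting, for each value, how many lookup entries sort strictly before it. This is only a sketch: the function name my_match comes from the question, the implementation is an assumption of mine, and outer() builds a length(lookup) by length(values) matrix, so it may be too memory-hungry for very large inputs. String comparison follows the current locale.

```r
values <- c("Kiwis", "Bananas", "Ananas", "Cherries", "Peer")
vector <- c("Oranges", "Bananas", "Apples", "Cherries", "Lemons")

my_match <- function(values, lookup) {
  sorted <- sort(unique(lookup))
  # For each value, count the lookup entries that sort strictly before it
  n_before <- colSums(outer(sorted, values, FUN = "<"))
  # The next entry is the exact match if present, otherwise the one just after;
  # indexing one past the end of 'sorted' yields NA
  sorted[n_before + 1]
}

result <- my_match(values, vector)
print(result)
# "Lemons" "Bananas" "Apples" "Cherries" NA
```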
Re: [R] PLotting R graphics/symbols without user x-y scaling
Hi

Jonathan Lees wrote:
Is it possible to add lines or other user defined graphics to a plot in R that does not depend on the user scale for the plot? For example I have a plot
  plot(x,y)
and I want to add some graphic that is scaled in inches or cm but I do not want the graphic to change when the x-y scales are changed - like a thermometer, scale bar or other symbol - How does one do this? I want to build my own library of glyphs to add to plots but I do not know how to plot them when their size is independent of the device/user coordinates. Is it possible to add to the list of symbols in the function symbols() other than: _circles_, _squares_, _rectangles_, _stars_, _thermometers_, and _boxplots_ can I make my own symbols and have symbols call these?

There is currently no mechanism for defining your own additions to symbols(), but this sort of thing is easily doable using the grid graphics system, and the resulting symbols would be easy to add to lattice plots. See ...
  http://www.stat.auckland.ac.nz/~paul/RGraphics/chapter4.pdf
  http://www.stat.auckland.ac.nz/~paul/RGraphics/chapter5.pdf
There is also an example of how to do this sort of thing using the grImport package (and grid and lattice) in
  http://www.stat.auckland.ac.nz/~paul/Talks/import.pdf
The complete code for the example is ...

  library(grImport)
  hourglass <- new("Picture",
                   paths=list(new("PictureFill",
                                  x=c(0, 1, 0, 1), y=c(0, 0, 1, 1), rgb="black"),
                              new("PictureStroke",
                                  x=c(0, 1, 0, 1, 0), y=c(0, 0, 1, 1, 0), rgb="grey")),
                   summary=new("PictureSummary",
                               numPaths=1, xscale=c(0, 1), yscale=c(0, 1)))
  dotplot(variety ~ yield | year, data=barley,
          panel=function(x, y, type, ...) {
            panel.dotplot(x, y, type="n", ...)
            grid.symbols(hourglass,
                         x=unit(as.numeric(x), "native"),
                         y=unit(as.numeric(y), "native"),
                         size=unit(5, "mm"))
          })

Paul

Thanks-

-- Dr Paul Murrell
Department of Statistics, The University of Auckland
Private Bag 92019, Auckland, New Zealand
64 9 3737599 x85392 [EMAIL PROTECTED]
http://www.stat.auckland.ac.nz/~paul/
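The core idea in this reply (position in data coordinates, size in physical units) can also be tried with the grid package alone. The following is only a sketch: the viewport margins, the data ranges and the three points are made up, and pdf(NULL) is used so that no file is written.

```r
library(grid)

pdf(NULL)                      # throw-away device; replace with a real one to see the plot
grid.newpage()
pushViewport(plotViewport(c(4, 4, 2, 2)))
pushViewport(dataViewport(xData = c(0, 10), yData = c(0, 100)))
grid.rect()
grid.xaxis()
grid.yaxis()

# Made-up data points
x <- c(2, 5, 8)
y <- c(20, 80, 50)

# Location in "native" (data) units, size in millimetres:
# the squares stay 5 mm wide whatever the axis scales are
grid.rect(x = unit(x, "native"), y = unit(y, "native"),
          width = unit(5, "mm"), height = unit(5, "mm"),
          gp = gpar(fill = "black"))

# The physical size is independent of the data scale
size_in <- convertWidth(unit(5, "mm"), "inches", valueOnly = TRUE)
popViewport(2)
dev.off()
print(size_in)
```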
Re: [R] Crosstabbing multiple response data
Thanks to Charles, Gabor, and a private message from Frank E Harrell with some good ideas and help. This crossprod approach was very clever, I would never have thought of it. Best, Michael

----- Original Message ----
From: Charles C. Berry [EMAIL PROTECTED]
To: Michael Wexler [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Sent: Thursday, February 22, 2007 1:17:44 PM
Subject: Re: [R] Crosstabbing multiple response data

  res <- crossprod( as.matrix( ratings[ , -1] ) )
  diag(res) <- ""
  print(res, quote=F)
       att1 att2 att3
  att1      2    1
  att2 2         2
  att3 1    2
  res2 <- crossprod( as.matrix( ratings[ , -1] ) ) * 100 / nrow( ratings )
  res2[] <- paste( res2, "%", sep="" )
  diag(res2) <- ""
  print(res2, quote=F)
       att1 att2 att3
  att1      50%  25%
  att2 50%       50%
  att3 25%  50%

Be sure to bone up on format and sprintf before taking this into production.

On Thu, 22 Feb 2007, Michael Wexler wrote:
Using R version 2.4.1 (2006-12-18) on Windows, I have a dataset which resembles this:

  id att1 att2 att3
  1  1    1    0
  2  1    0    0
  3  0    1    1
  4  1    1    1

  ratings <- data.frame(id = c(1,2,3,4), att1 = c(1,1,0,1), att2 = c(1,0,1,1), att3 = c(0,0,1,1))

I would like to get a cross tab of counts of co-occurrence, which might resemble this:

       att1 att2 att3
  att1      2    1
  att2 2         2
  att3 1    2

with the hope of understanding, at least pairwise, what things hang together. (Yes, there are much, much better ways to do this statistically including clustering and binary corrected correlation, but the audience I am working with asked for this version for a specific reason.) (Later on, I would also like to convert to percentages of the total unique pop, so the final version of the table would be

       att1 att2 att3
  att1      50%  25%
  att2 50%       50%
  att3 25%  50%

But I can do this in excel if I can get the first table out.) I have tried the reshape library, but could not get anything resembling this (both on its own, as well as feeding in to table()). (I have also played with transposing and using some comments from this list from 2002 and 2004, but the questioners appear to assume more knowledge than I have in use of R; the example in the posting guide was also more complex than I was ready for, I'm afraid.) Sample of some of my efforts:

  library(reshape)
  melt(ratings, id=c("id"))
  ds1 <- melt(ratings, id=c("id"))
  table(ds1$variable, ds1$variable)
  # returns only rowcounts, 3 along diagonal
  xtabs(formula = value ~ ds1$variable + ds1$variable, data=ds1)
  # returns only a single row of collapsed counts, appears to not allow 1 variable in multiple uses

I suspect I am close, so any nudges in the right direction would be helpful. Thanks much, Michael
PS: www.rseek.org is very impressive, I heartily encourage its use.

Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED] UC San Diego
http://biostat.ucsd.edu/~cberry/ La Jolla, San Diego 92093-0901
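A self-contained version of the crossprod approach from this thread, using the 0/1 values from the table shown in the question, with sprintf for the percentage labels. This is only a sketch; the object names counts, pct and labels are mine.

```r
# Values taken from the id/att1/att2/att3 table in the question
ratings <- data.frame(id   = 1:4,
                      att1 = c(1, 1, 0, 1),
                      att2 = c(1, 0, 1, 1),
                      att3 = c(0, 0, 1, 1))

m <- as.matrix(ratings[, -1])

# t(m) %*% m: entry [i, j] counts the rows where attribute i and j are both 1
counts <- crossprod(m)

# Share of all respondents, in percent
pct <- 100 * counts / nrow(ratings)

# Character version for display, with the diagonal blanked out
labels <- matrix(sprintf("%.0f%%", pct), nrow = nrow(pct), dimnames = dimnames(pct))
diag(labels) <- ""
print(labels, quote = FALSE)
```

The diagonal of counts holds the number of respondents with each attribute, which is why the thread blanks it before printing.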
[R] looping
Greetings: I am looking for some help (probably really basic) with looping. What I want to do is repeatedly sample observations (about 100 per sample) from a large dataset (100,000 observations). I would like the samples labelled sample.1, sample.2, and so on (or some other suitably simple naming scheme). To do this manually I would

  smp.1 <- sample(100000, 100)
  sample.1 <- dataset[smp.1,]
  smp.2 <- sample(100000, 100)
  sample.2 <- dataset[smp.2,]
  . . .
  smp.50 <- sample(100000, 100)
  sample.50 <- dataset[smp.50,]

and so on. I tried the following loop code to generate 50 samples:

  for (i in 1:50){
    smp.[i] <- sample(100000, 100)
    sample.[i] <- dataset[smp.[i],]}

Unfortunately, that does not work -- specifying the looping variable i in the way that I have does not work since R uses that to reference places in a vector (x[i] would be the ith element in the vector x). Is it possible to assign the value of the looping variable in a name within the loop structure? Cheers, Neil Hepburn

===
Neil Hepburn, Economics Instructor
Social Sciences Department, The University of Alberta Augustana Campus
4901 - 46 Avenue, Camrose, Alberta T4V 2R3
Phone (780) 697-1588 email [EMAIL PROTECTED]
[R] RDA and trend surface regression
Dear all, I'm performing RDA on plant presence/absence data, constrained by geographical locations. I'd like to constrain the RDA by the extended matrix of geographical coordinates -ie the matrix of geographical coordinates completed by adding all terms of a cubic trend surface regression-. This is the command I use (package vegan):
  rda(Helling ~ x+y+x*y+x^2+y^2+x*y^2+y*x^2+x^3+y^3)
where Helling is the matrix of Hellinger-transformed presence/absence data. The result returned by R is exactly the same as the one given by:
  anova(rda(Helling ~ x+y))
Ie the quadratic and cubic terms are not taken into account. I hope you can help me with that: how can I perform a RDA on an extended matrix of geographical coordinates in R? Thank you very much in advance,
Helene Morlon
University of California, Merced
[EMAIL PROTECTED]
Re: [R] RDA and trend surface regression
Helene, You will have to give us more information, such as your system/versions and a small reproducible example. We try to stress that questions are more easily answered when there are a lot of specific details given and a reproducible case can be tested. Here are two comments though:

1. The quadratic terms probably are not showing up because you are not using a proper model formula for the task. See:
  http://cran.r-project.org/doc/manuals/R-intro.html#Formulae-for-statistical-models
Specifically, the part that says "I(M): Insulate M. Inside M all operators have their normal arithmetic meaning, and that term appears in the model matrix." is important. So, as an example from ?rda:
  x <- rda(Species ~ (Sepal.Length+Sepal.Width)^2 + Sepal.Width^2, data = iris)
would not work for the squared term, but
  x <- rda(Species ~ (Sepal.Length+Sepal.Width)^2 + I(Sepal.Width^2), data = iris)
would.

2. RDA is fitting models at or between LDA and QDA. So a QDA model with quadratic terms would be quartic discriminant analysis. Of course, there are no rules against this, but high order polynomials can do weird things in the tail (which would be the edges of the space defined by your training data). If your data are that nonlinear, there are much better ways of classifying data. I'd suggest getting a copy of Hastie et al. (2001) or MASS.

Max
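The effect of I() described in this thread can be seen without any ordination package by looking at how a formula is expanded into a model matrix. A minimal sketch in base R follows; the coordinates are simulated, and I am assuming that the formula interface in question expands terms by the usual model-formula rules.

```r
set.seed(1)
d <- data.frame(x = runif(20), y = runif(20))  # made-up coordinates

# Without I(), ^ is a formula operator: x^2 means "x crossed with x", i.e. just x
mm_plain <- model.matrix(~ x + y + x^2 + y^2, data = d)
print(colnames(mm_plain))

# With I(), ^ and * keep their arithmetic meaning
mm_I <- model.matrix(~ x + y + I(x^2) + I(y^2), data = d)
print(colnames(mm_I))

# All nine terms of a cubic trend surface, written out explicitly
mm_cubic <- model.matrix(~ x + y + I(x * y) + I(x^2) + I(y^2) +
                           I(x^2 * y) + I(x * y^2) + I(x^3) + I(y^3), data = d)
print(ncol(mm_cubic))   # intercept plus nine trend-surface terms
```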
[R] prop.test or chisq.test ..?
Hi everyone, Suppose I have a count of the occurrences of positive results, and the total number of occurrences:

  pos <- 14
  total <- 15

testing that the proportion of positive occurrences is greater than 0.5 gives a p-value and confidence interval:

  prop.test(pos, total, p=0.5, alternative='greater')

  1-sample proportions test with continuity correction
  data: 14 out of 15, null probability 0.5
  X-squared = 9.6, df = 1, p-value = 0.0009729
  alternative hypothesis: true p is greater than 0.5
  95 percent confidence interval:
   0.706632 1.00
  sample estimates:
      p
  0.933

My question is how does the use of chisq.test() differ from the above operation. For example:

  chisq.test(table( c(rep('pos', 14), rep('neg', 1)) ))

  Chi-squared test for given probabilities
  data: table(c(rep("pos", 14), rep("neg", 1)))
  X-squared = 11.2667, df = 1, p-value = 0.0007891

... gives slightly different results. Am I correct in interpreting that the chisq.test() function in this case is giving me a p-value associated with the test that the probabilities of pos are *different* than the probabilities of neg -- and thus a larger p-value than the prop.test(... , p=0.5, alternative='greater')? I realize that this is a rather elementary question, and references to a text would be just as helpful. Ideally, I would like a measure of how much I can 'trust' that a larger proportion is also statistically meaningful. Thus far the results from prop.test() match my intuition, but affirmation would be great. Cheers,

-- Dylan Beaudette
Soils and Biogeochemistry Graduate Group
University of California at Davis
530.754.7341
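A short sketch of where the two sets of numbers come from. The two calls in the question differ in two ways at once: prop.test() was run one-sided and with the continuity correction, while chisq.test() on a one-way table is two-sided and uncorrected. Switching both options off in prop.test() reproduces the chisq.test() result. The object names are mine; binom.test() is added only as an exact alternative for small counts.

```r
pos <- 14
total <- 15

# As in the question: one-sided, with continuity correction
one_sided <- prop.test(pos, total, p = 0.5, alternative = "greater")

# Two-sided and uncorrected: the same test that chisq.test() performs
two_sided <- prop.test(pos, total, p = 0.5, correct = FALSE)
gof <- chisq.test(c(pos, total - pos))   # equal probabilities by default

print(c(prop_one_sided = unname(one_sided$statistic),
        prop_two_sided = unname(two_sided$statistic),
        chisq          = unname(gof$statistic)))
print(c(prop_one_sided = one_sided$p.value,
        prop_two_sided = two_sided$p.value,
        chisq          = gof$p.value))

# Exact one-sided binomial test, which needs no large-sample approximation
exact <- binom.test(pos, total, p = 0.5, alternative = "greater")
print(exact$p.value)
```

So the smaller p-value from chisq.test() here reflects the missing continuity correction rather than the two-sided alternative.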
Re: [R] PLotting R graphics/symbols without user x-y scaling
You can use par("usr") to get the min/max coords of the plot and then use that. For example, this will plot a red dot in the middle of the plot area regardless of the coordinates:

  plot(1:10) # sample plot
  usr <- par("usr")
  points(mean(usr[1:2]), mean(usr[3:4]), pch = 20, col = "red") # red dot

See ?par

On 2/26/07, Jonathan Lees [EMAIL PROTECTED] wrote:
Is it possible to add lines or other user defined graphics to a plot in R that does not depend on the user scale for the plot? For example I have a plot
  plot(x,y)
and I want to add some graphic that is scaled in inches or cm but I do not want the graphic to change when the x-y scales are changed - like a thermometer, scale bar or other symbol - How does one do this? I want to build my own library of glyphs to add to plots but I do not know how to plot them when their size is independent of the device/user coordinates. Is it possible to add to the list of symbols in the function symbols() other than: _circles_, _squares_, _rectangles_, _stars_, _thermometers_, and _boxplots_ can I make my own symbols and have symbols call these? Thanks-

-- Jonathan M. Lees
Professor, THE UNIVERSITY OF NORTH CAROLINA AT CHAPEL HILL
Department of Geological Sciences, Campus Box #3315, Chapel Hill, NC 27599-3315
TEL: (919) 962-0695 FAX: (919) 966-4519
[EMAIL PROTECTED] http://www.unc.edu/~leesj
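In base graphics, a size given in inches can be converted to user coordinates with xinch() and yinch() from the graphics package, which use par("usr") and par("pin") internally. The sketch below draws a quarter-inch square at a data point; the point (5, 5) and the size are made up, and pdf(NULL) is used so that no file is written. As far as I know the conversion is done once at drawing time, so it should be recomputed if the plot is redrawn at another size.

```r
pdf(NULL)          # throw-away device; use a real device to look at the result
plot(1:10)

# Width and height of 0.25 inch, expressed in user (data) units
w <- xinch(0.25)
h <- yinch(0.25)

# A 0.25 x 0.25 inch square centred on the data point (5, 5)
rect(5 - w / 2, 5 - h / 2, 5 + w / 2, 5 + h / 2, border = "red")

# Check: converting w back to inches gives 0.25
usr <- par("usr")
back_in_inches <- w * par("pin")[1] / diff(usr[1:2])
dev.off()
print(back_in_inches)
```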
Re: [R] looping
You do not say -- and I am unable to divine -- whether you wish to sample with or without replacement: each time or as a whole. In general, when you want to do this sort of thing, the fastest way to do it is just to sample everything you need at once and then form it into a list or matrix or whatever. For example, for sampling 100 each time without replacement 200 times:

  mySamples <- matrix(sample(yourDatavector, 100*200, replace=FALSE), ncol=200)

will give you a 100 row by 200 column matrix of samples without replacement from yourDatavector. I hope that you can adapt this to suit your needs.

Bert Gunter
Nonclinical Statistics
7-7374
Re: [R] Optimizing the loop for large data
Rusers: I am trying to apply a quadratic discriminant function to find the best classification outcomes. 1 is assigned to the values greater than a threshold value; and 0 otherwise. I would like to see how the apparent error rates and the optimal error rate change with increasing threshold values. I have a 1000*10 data matrix: n=1000 and p=10. Here is what I wrote so far, but it seems to be inefficient. I would appreciate it if someone could help me out.

  library(foreign)
  library(MASS)
  D <- read.dbf('data/Indianapolis015.dbf') # import a data

  # data looks like this
     LONGLAB        X         Y        Perimeter Area       X_UTM    Y_UTM   F0   F1 F2
  1  TAZ 18011:1000 -86.25985 39.95286  2.061630  0.1862549 50600.38 4435792  235  0  35
  2  TAZ 18011:1001 -86.31030 39.97591  3.657006  0.7305006 46440.80 4438608    0  0   0
  3  TAZ 18011:1002 -86.29542 39.97054  3.516089  0.6408084 47677.31 4437936  155  0  15
  4  TAZ 18011:1003 -86.27574 39.97294  5.000185  1.2592142 49374.91 4438102  835  0  55
  5  TAZ 18011:1004 -86.25967 39.97197  4.788531  1.1930984 50741.38 4437913  425  0  80
  6  TAZ 18011:1005 -86.29245 39.98580  6.189141  1.6734483 48031.44 4439616  185  0  35
  7  TAZ 18011:1006 -86.24899 39.98259  7.525633  2.0564466 51723.80 4439040  505  0  45
  8  TAZ 18011:1007 -86.30974 39.99014  3.773037  0.7790234 46583.20 4440186   30  0  10
  9  TAZ 18011:1008 -86.27151 39.99040  4.589226  1.2212674 49850.92 4440021   40  0   0
  10 TAZ 18011:9215 -86.58085 40.13588 37.278521 69.6681954 24438.13 4457794 2095 85 200

  thrs <- seq(1000,1,length=50)
  ED <- D[,383]/D[,5]             # employment density
  CBDx <- D[,6]-58277.194363      # convert a coordinate for x
  CBDy <- D[,7]-4414486.03135     # convert a coordinate for y
  AER <- vector("numeric",length(thrs))
  OER <- vector("numeric",length(thrs))
  MER <- vector("numeric",length(thrs))

  # compute the apparent error rates for each threshold value
  for (j in 1:length(thrs)){
    ctgy <- ifelse(ED>thrs[j],2,1)   # 2 categories are created by the threshold
    test1 <- qda(cbind(ED,CBDx,CBDy),ctgy)
    est1 <- cbind(ctgy,predict(test1)$class)
    AER[j] <- sum((est1[,1]-est1[,2])==0)/dim(D)[1]
  }

  # OER computation for ith location taken out for the thresholds
  for (k in 1:dim(D)[1]){
    for (j in 1:length(thrs)){
      ctgy <- ifelse(ED>thrs[j],2,1)
      test2 <- qda(cbind(ED[-k],CBDx[-k],CBDy[-k]),ctgy[-k])
      est2 <- cbind(ctgy[-k],predict(test2)$class)
      OER[j] <- mean(sum((est2[,1]-est2[,2])==0)/(dim(D)[1]-1))
    }}
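One way to avoid the inner leave-one-out loop is the CV = TRUE argument of MASS::qda, which returns leave-one-out predicted classes from a single call. The sketch below runs on simulated data, because the .dbf file from the question is not available; the column names mirror the question, everything else (sample size, thresholds, distributions) is made up. I am also assuming that the "optimal error rate" in the question is meant to be a leave-one-out estimate. Note that it reports the share of misclassified cases, not the share of agreements.

```r
library(MASS)

set.seed(1)
n <- 200
# Simulated stand-ins for employment density and the two CBD coordinates
X <- cbind(ED = rexp(n, rate = 1 / 2000), CBDx = rnorm(n), CBDy = rnorm(n))
thrs <- c(1000, 2000, 3000)   # made-up thresholds

err <- sapply(thrs, function(t) {
  grp <- factor(ifelse(X[, "ED"] > t, 2, 1))

  # Apparent (resubstitution) error: fit and predict on the same data
  fit <- qda(X, grp)
  aer <- mean(predict(fit, X)$class != grp)

  # Leave-one-out error in one call, without refitting in an explicit loop
  loo <- mean(qda(X, grp, CV = TRUE)$class != grp)

  c(AER = aer, LOO = loo)
})
colnames(err) <- paste("thr", thrs)
print(round(err, 3))
```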
[R] exact matching of names in attr
In R 2.5.0 (r40806), one of the changes is to allow partial matching of the name in the attr function. However, how can I tell if I have an exact match or not? For example, checking to see if an object has a "name" attribute, then giving it one if it doesn't:

  dat <- data.frame(x=1:10, y=rnorm(10))
  if(is.null(attr(dat,"name"))) attr(dat,"name") <- "Site 1"
  str(dat)

(This example works in R 2.5) Although there is no "name" attribute to the data.frame, it partially matches to "names", resulting in not setting the attribute. (Personally, I think this change in the attr function is not desirable, and much prefer exact matches to avoid unintentional errors). How can I tell if this is an exact match? Is there a way to force an exact match? Thanks. +mt
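In current versions of R, attr() has an exact argument that switches partial matching off, and the attribute names can also be inspected directly. I am not certain that the argument was already present in the 2.5.0 development snapshot the post refers to, so treat this as a sketch for recent versions; the helper name has_attr is made up.

```r
dat <- data.frame(x = 1:10, y = rnorm(10))

partial <- attr(dat, "name")                # partially matches "names"
exact   <- attr(dat, "name", exact = TRUE)  # NULL: there is no attribute called "name"

# Hypothetical helper: exact test for the presence of an attribute
has_attr <- function(x, which) which %in% names(attributes(x))

if (!has_attr(dat, "name")) attr(dat, "name") <- "Site 1"
after <- attr(dat, "name", exact = TRUE)
print(after)
```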
[R] looping
Another way is to use an indexed list, which is far tidier than your method. If you mean "about 100" as in an irregular number, then a list is your friend (i.e., a ragged array that can have sometimes 97 samples, sometimes 105 samples, etc.). Similar to your example:

    dat <- runif(1000, 0, 100)   # fake dataset
    smp <- list()                # need an empty list first
    for (i in 1:1000) smp[[i]] <- sample(dat, 100)

However, if you are new to R/S, the best advice is to learn to _not_ use the for loop (because it is slow, and there are vectorized ways). For example, if we want to find the mean of each sample and return a tidy result:

    sapply(smp, mean)

or a crazy new analysis you might be working on:

    crazy <- function(x, y) (sum(x > y)^2)/sum(x)
    sapply(smp, crazy, 10)

etc.
+mt
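As a small illustration of the list-based approach, the sketch below builds both a fixed-size and a ragged collection of samples with lapply() and then summarises each with sapply(). The seed, the range 95:105 for the ragged sizes, and the names sizes, ragged, smp_means and ragged_means are assumptions made for the example only.

```r
set.seed(42)
dat <- runif(1000, 0, 100)   # fake dataset, large enough to draw 100 values from

# 1000 samples of fixed size 100, collected in a list without an explicit for loop
smp <- lapply(1:1000, function(i) sample(dat, 100))

# a ragged version: sample sizes vary between 95 and 105
sizes  <- sample(95:105, 1000, replace = TRUE)
ragged <- lapply(sizes, function(n) sample(dat, n))

# one summary value per sample, returned as a plain numeric vector
smp_means    <- sapply(smp, mean)
ragged_means <- sapply(ragged, mean)
```

The list does not care whether the elements have equal lengths, which is why the same sapply() call works for both cases.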
[R] How to use bash command in R script?
Dear All:

Maybe this is too basic a question, but I don't know how to find the answer. Sorry for that.

What I want to do is call a shell command, which will provide two numbers, and assign those numbers to a vector. For example, take the following command:

    $ mxresult.sh ABC.mx

mxresult.sh is a script written by myself and ABC.mx is a Mx script. I get two numbers, 126.128 and 29, from this command. Is there any way to do it like this:

    c <- somefunction("mxresult.sh ABC.mx")

Or is there any other way to achieve this?

Thanks in advance!

Best wishes,
Wei-Wei
Re: [R] How to use bash command in R script?
Guo Wei-Wei wrote:
> Dear All:
>
> Maybe this is too basic a question, but I don't know how to find the answer. Sorry for that.
>
> What I want to do is call a shell command, which will provide two numbers, and assign those numbers to a vector. For example, take the following command:
>
>     $ mxresult.sh ABC.mx
>
> mxresult.sh is a script written by myself and ABC.mx is a Mx script. I get two numbers, 126.128 and 29, from this command. Is there any way to do it like this:
>
>     c <- somefunction("mxresult.sh ABC.mx")
>
> Or is there any other way to achieve this?

    txt <- system("mxresult.sh ABC.mx", intern=TRUE)

is the first step. Then you need to get the numbers using either a scan() on a textConnection (see its help page) or something like

    mynum <- as.numeric(strsplit(txt, " +")[[1]])

> Thanks in advance!
>
> Best wishes,
> Wei-Wei
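The two steps described in this reply can be sketched as follows. Since mxresult.sh is not available here, the example assumes a Unix-like shell and uses echo as a stand-in that prints the same two numbers; cmd and mynum2 are names introduced only for the illustration. scan(text = ...) is used as a shorthand for scanning a textConnection.

```r
# Stand-in for "mxresult.sh ABC.mx": any command that prints two numbers.
# Assumes a Unix-like shell where echo is available.
cmd <- "echo 126.128 29"

txt <- system(cmd, intern = TRUE)   # character vector, one element per output line

# Option 1: let scan() parse the captured text
mynum <- scan(text = txt, quiet = TRUE)

# Option 2: split on runs of spaces and convert to numeric
mynum2 <- as.numeric(strsplit(trimws(txt), " +")[[1]])
```

Both options give a numeric vector of length two; scan() is the more forgiving choice if the script's output spans several lines or mixes tabs and spaces.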
Re: [R] exact matching of names in attr
    if (!"name" %in% names(attributes(dat))) { ... }

/Henrik

On 2/26/07, Michael Toews [EMAIL PROTECTED] wrote:
> In R 2.5.0 (r40806), one of the changes is to allow partial matching of the name in the attr function. However, how can I tell whether I have an exact match or not? For example, checking to see if an object has a "name" attribute, then giving it one if it doesn't:
>
>     dat <- data.frame(x=1:10, y=rnorm(10))
>     if (is.null(attr(dat, "name"))) attr(dat, "name") <- "Site 1"
>     str(dat)
>
> (This example works in R < 2.5.) Although the data.frame has no "name" attribute, the lookup partially matches "names", so the attribute is never set. (Personally, I think this change in the attr function is not desirable, and I much prefer exact matches to avoid unintentional errors.)
>
> How can I tell if this is an exact match? Is there a way to force an exact match?
>
> Thanks.
> +mt
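To make the difference concrete, the short sketch below compares the partially matching lookup with the test on names(attributes(dat)) suggested above. It also uses the exact = TRUE argument that attr() has in current versions of R; that argument is an addition to what was said in this thread, and whether it was already available in the R version under discussion is not something the thread confirms.

```r
dat <- data.frame(x = 1:10, y = rnorm(10))

# Partial matching: "name" uniquely matches the existing "names" attribute
partial <- attr(dat, "name")                   # returns c("x", "y"), not NULL

# Exact test, as suggested above
has_name <- "name" %in% names(attributes(dat)) # FALSE

# Exact lookup in current R
exact <- attr(dat, "name", exact = TRUE)       # NULL

# Set the attribute only when it is genuinely absent
if (!has_name) attr(dat, "name") <- "Site 1"
```

Once the "name" attribute exists, attr(dat, "name") returns it, because an exact match is tried before a partial one.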
Re: [R] 2 data frames - list in one out put , matrix in another ??
Hi

As I do not know what class "labelled" is, and do not know about your data, I can only try to give you some hints.

What do str(adata) and str(bdata) say about the structure of your data?

It is possible that in the second case the resulting tables can be arranged into a matrix, and apply probably does this coercion in accordance with the Details section of its help page. But this is only a guess.

HTH
Petr

On 26 Feb 2007 at 14:56, John Kane wrote:

Date sent: Mon, 26 Feb 2007 14:56:07 -0500 (EST)
From: John Kane [EMAIL PROTECTED]
To: R R-help r-help@stat.math.ethz.ch
Subject: [R] 2 data frames - list in one out put , matrix in another ??

> I have two more or less parallel dataframes that are giving me different results on one subset of variables. I know that I assembled the 2 dataframes slightly differently, but I don't see why I am getting this result because one set of variables is labelled and the other is not.
>
> Variable names are the same, etc. as far as I can ascertain. The only difference seems to be that the bdata variables are labelled.
>
> About now I really don't care which I get, but I would like them to be the same. Can anyone suggest what I am doing wrong or should be looking at?
>
> Windows XP, R 2.4.1
> Using Hmisc and gtools as well as the basic R installation.
>
> Problem
>
>     load(adata)
>     fn1 <- function(x) {table(x)}
>     jj <- apply(adata[,110:127], 2, fn1)
>
> OUTPUT: jj is a list of 18 tables
>
> Examine a variable:
>
>     typeof(adata$act.toy)
>     [1] "integer"
>     class(adata$act.toy)
>     [1] "integer"
>
>     load(bdata)
>     fn1 <- function(x) {table(x)}
>     kk <- apply(bdata[,94:111], 2, fn1)
>
> OUTPUT: kk is a matrix 2 X 18
>
>     class(bdata$act.toy)
>     [1] "labelled"
>     typeof(bdata$act.toy)
>     [1] "integer"

Petr Pikal
[EMAIL PROTECTED]
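The guess about apply can be illustrated with a toy example. apply() returns a matrix when every call to the function gives a result of the same length, and a list otherwise, so two data frames can differ in output shape simply because of how many distinct values each column holds. The data below are made up for the illustration and say nothing about the actual adata and bdata; whether this is what happened there remains an assumption.

```r
# Columns with different numbers of distinct values: tables of unequal length
adata <- data.frame(a = c(1, 2, 3, 1), b = c(1, 1, 2, 2))

# Columns that all have two distinct values: tables of equal length
bdata <- data.frame(a = c(1, 2, 1, 2), b = c(1, 1, 2, 2))

jj <- apply(adata, 2, table)   # results differ in length, so a list is returned
kk <- apply(bdata, 2, table)   # results all have length 2, so a 2 x 2 matrix

# lapply() never simplifies, so the output has the same shape in both cases
jj2 <- lapply(adata, table)
kk2 <- lapply(bdata, table)
```

If the same kind of output is wanted regardless of the data, lapply() is one way to get it, since it returns a list of tables in both situations.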
Re: [R] How to use bash command in R script?
?system.

Cheers,
Andrew

On Tue, Feb 27, 2007 at 02:05:09PM +0800, Guo Wei-Wei wrote:
> Dear All:
>
> Maybe this is too basic a question, but I don't know how to find the answer. Sorry for that.
>
> What I want to do is call a shell command, which will provide two numbers, and assign those numbers to a vector. For example, take the following command:
>
>     $ mxresult.sh ABC.mx
>
> mxresult.sh is a script written by myself and ABC.mx is a Mx script. I get two numbers, 126.128 and 29, from this command. Is there any way to do it like this:
>
>     c <- somefunction("mxresult.sh ABC.mx")
>
> Or is there any other way to achieve this?
>
> Thanks in advance!
>
> Best wishes,
> Wei-Wei

--
Andrew Robinson
Department of Mathematics and Statistics    Tel: +61-3-8344-9763
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/
[R] RDA and trend surface regression
> I'm performing RDA on plant presence/absence data, constrained by geographical locations. I'd like to constrain the RDA by the "extended matrix of geographical coordinates", i.e. the matrix of geographical coordinates completed by adding all terms of a cubic trend surface regression.
>
> This is the command I use (package vegan):
>
>     rda(Helling ~ x+y+x*y+x^2+y^2+x*y^2+y*x^2+x^3+y^3)
>
> where Helling is the matrix of Hellinger-transformed presence/absence data.
>
> The result returned by R is exactly the same as the one given by:
>
>     anova(rda(Helling ~ x+y))
>
> I.e. the quadratic and cubic terms are not taken into account.

You must *I*solate the polynomial terms with the function I ("AsIs") so that they are not interpreted as formula operators:

    rda(Helling ~ x + y + I(x*y) + I(x^2) + I(y^2) + I(x*y^2) + I(y*x^2) + I(x^3) + I(y^3))

If you don't have the interaction terms, then it is easier and better (numerically) to use poly():

    rda(Helling ~ poly(x, 3) + poly(y, 3))

Another issue is that, in my opinion, using polynomial constraints is an Extremely Bad Idea(TM).

cheers,
Jari Oksanen
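The effect of I() on a formula can be checked without vegan by looking at the design matrix that base R builds from each formula. The sketch below uses made-up coordinates and only the terms up to second order; the object names plain, isolated and orth are assumptions for the illustration.

```r
set.seed(1)
x <- runif(20)   # made-up coordinates
y <- runif(20)

# Without I(): ^ and * are formula operators, so x^2 collapses to x
plain <- model.matrix(~ x + y + x*y + x^2 + y^2)
colnames(plain)   # "(Intercept)" "x" "y" "x:y"

# With I(): each polynomial term becomes its own column
isolated <- model.matrix(~ x + y + I(x*y) + I(x^2) + I(y^2))
ncol(isolated)    # 6

# poly() gives orthogonal polynomial columns, without interaction terms
orth <- model.matrix(~ poly(x, 3) + poly(y, 3))
ncol(orth)        # 7
```

The first design matrix shows why the squared and cubed terms had no effect: they never reach the model as separate columns unless they are wrapped in I().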
[R] fitting of all possible models
Hi,

Fitting all possible models (GLM) with 10 predictors will result in loads of (2^10 - 1) models. I want to do that in order to get the importance of variables (having an unbalanced variable design) by summing up the AIC weights of the models that include the same variable, for every variable separately. It is time-consuming and annoying to define all possible models by hand.

Is there a command, or an easy solution, to let R define the set of all possible models itself? I defined the models in the following way to process them with a batch job:

    # e.g. model 1
    preference <- formula(Y ~ Lwd + N + Sex + YY)
    # e.g. model 2
    preference_heterogeneity <- formula(Y ~ Ri + Lwd + N + Sex + YY)
    etc. etc.

I appreciate any hint.

Cheers
Lukas

Lukas Indermaur, PhD student
eawag / Swiss Federal Institute of Aquatic Science and Technology
ECO - Department of Aquatic Ecology
Überlandstrasse 133
CH-8600 Dübendorf
Switzerland
Phone: +41 (0) 71 220 38 25
Fax: +41 (0) 44 823 53 15
Email: [EMAIL PROTECTED]
www.lukasindermaur.ch
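One way to let R build the model set is to enumerate the predictor subsets with combn() and turn each into a formula with reformulate(). The sketch below is a rough illustration on simulated data with four of the predictor names from the post; the Gaussian glm() default, the data, and the object names (subsets, forms, fits, w, importance) are all assumptions made for the example, not the poster's setup.

```r
set.seed(1)
preds <- c("Lwd", "N", "Sex", "YY")   # predictor names borrowed from the post

# simulated data: four numeric predictors and a response driven by Lwd
d <- as.data.frame(matrix(rnorm(100 * length(preds)), ncol = length(preds),
                          dimnames = list(NULL, preds)))
d$Y <- 1 + 0.8 * d$Lwd + rnorm(100)

# all non-empty subsets of the predictors: 2^4 - 1 = 15 of them
subsets <- unlist(lapply(seq_along(preds),
                         function(k) combn(preds, k, simplify = FALSE)),
                  recursive = FALSE)

# one formula and one fitted model per subset (Gaussian family by default)
forms <- lapply(subsets, reformulate, response = "Y")
fits  <- lapply(forms, function(f) glm(f, data = d))

# AIC weights across the whole model set
aic   <- sapply(fits, AIC)
delta <- aic - min(aic)
w     <- exp(-delta / 2) / sum(exp(-delta / 2))

# importance of a variable: summed weight of the models that contain it
importance <- sapply(preds, function(p) {
  sum(w[sapply(subsets, function(s) p %in% s)])
})
importance
```

With 10 predictors the same code produces 1023 models, so run time grows quickly; the family argument of glm() would need to be set to match the real response.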
Re: [R] How to use bash command in R script?
Thank you all! I solved my problem with your help.

Best wishes,
Wei-Wei