[R] Adding data to existing plot with new=TRUE does not appear to work
Dear all,

I am trying to shove a number of cmdscale() results into a single plot (k=1, so I'm trying to get multiple columns in the plot). From ?par I learned that I can/should set new=TRUE in either par() or the plot function itself. However, with the following reduced code I get only a plot with a column of data points with x==2.

    plot(1,10, xlim=range(0,3), ylim=range(0,10), type='n')
    aa <- rep(1,10)
    bb <- 1:10
    plot(aa,bb, xlim=range(0,3), ylim=range(0,10), new=TRUE)
    aa <- rep(2,10)
    plot(aa,bb, xlim=range(0,3), ylim=range(0,10), new=TRUE)

Also, when I insert

    op <- par(new=TRUE)

either before or immediately after the first plot statement (the type='n' one) in the above code fragment, the resulting graph still only shows one column of data. Have I misinterpreted the instructions or the functionality of new=TRUE?

Thank you,
Paul Lemmens

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Adding data to existing plot with new=TRUE does not appear to work
Hi Petr,

On 7/4/07, Petr PIKAL wrote:
> par(new=T)
> plot(aa,bb, xlim=range(0,3), ylim=range(0,10), new=TRUE)

So I need to activate par(new=T) just ahead of each plot where I need it, not as a sort of general clause at the beginning of my script?

> However you can get similar result with using points

Yes, I knew that, but I wanted to try to go without an if() for deciding between the first and consecutive columns.

Thanks for helping out!
Paul Lemmens
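The exchange above can be sketched as follows. This is a minimal illustration in base graphics, assuming (as the thread suggests) that the `new` setting is consumed by the next high-level plot call and therefore has to be set again each time. The loop variable `x_pos` is made up for the example; `aa` and `bb` follow the names in the thread.

```r
bb <- 1:10

# Empty frame first, then overlay one column per iteration.
plot(1, 10, xlim = c(0, 3), ylim = c(0, 10), type = "n")
for (x_pos in 1:2) {
  aa <- rep(x_pos, 10)
  par(new = TRUE)  # set again before every high-level plot call
  plot(aa, bb, xlim = c(0, 3), ylim = c(0, 10),
       axes = FALSE, xlab = "", ylab = "")
}
new_after <- par("new")  # the flag has been reset by the last plot()

# The points() route: with an empty frame drawn first,
# no if() is needed to tell the first column from the rest.
plot(1, 10, xlim = c(0, 3), ylim = c(0, 10), type = "n")
for (x_pos in 1:2) points(rep(x_pos, 10), bb)
```

Either loop should draw both columns; the second avoids redrawing axes and labels on top of each other.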
[R] cmdscale(eurodist)
Dear all,

I have a question regarding the 'y <- -loc[,2]' in ?cmdscale. Although I see that the plot is more sensible when using '-loc' instead of just 'y <- loc[,2]', I don't understand whether there is a statistical reason to do '-loc[,2]'. Is this just to make the graph look better, or should I always use -loc for the y-axis of a similar plot for a completely different data set?

Kind regards,
Paul Lemmens
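One way to look at the question: a classical MDS configuration is only determined up to rotation and reflection, so flipping an axis should leave all inter-point distances untouched. The sketch below checks that on the eurodist data; the names `flipped` and `same` are made up for the illustration.

```r
loc <- cmdscale(eurodist)        # k = 2 by default

flipped <- loc
flipped[, 2] <- -flipped[, 2]    # the reflection used in the ?cmdscale example

# Distances between the fitted points are identical for both configurations.
same <- all.equal(as.vector(dist(loc)), as.vector(dist(flipped)))
same
```

If that holds, the sign is a presentation choice (for eurodist it puts north at the top), not something that needs to be applied to other data sets.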
Re: [R] Reading multiple txt files into one data frame
On 7/30/06, Kartik Pappu wrote:
> Hello All,
>
> I have a device that spews out experimental data as a series of text files, each of which contains one column with several rows of numeric data. My problem is that for each trial it gives me one text file (and I run between 30 and 50 trials at a time), and I would ideally like to merge all these text files into one large data frame with each column representing a single trial. It is not a problem if NA values are added to make all the columns of equal length.
>
> Right now I am doing this by opening each file individually and cutting and pasting the data into an Excel file. How can I do this in R, assuming all my text files are in one directory?
>
> Is it also possible to customize the column headers? For example, if I have 32 trials and 16 are experimental and 16 are control, I want to name the columns Expt1, Expt2, ..., Expt16 and the control columns Cntl1, ..., Cntl16.
>
> Kartik

    setwd("E:/Cooperation @ Delft-Nijmegen (Feb. 2006 - Sep. 2006)/Research/Study 20 - Roughness/Experiment 20a - Roughness Index for CUReT textures/Statistics")

    # Concatenate the raw data files.
    data.path = "../data files/"
    (datafiles <- list.files(path=data.path, pattern="subject_[0-9]+\\.txt$"))
    exp20a <- do.call('rbind', lapply(datafiles, function(x) read.table(paste(data.path, x, sep=""))))
    rm(datafiles, data.path)
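The reply above stacks the files row-wise. For the column-per-trial layout that was asked for, something along these lines might work. It is only a sketch: the file names, the values, and the temporary directory are all invented for the demo, and each file is assumed to hold a single numeric column.

```r
# Hypothetical demo files in a temporary directory.
data.path <- tempfile("trials")
dir.create(data.path)
writeLines(c("1.5", "2.5", "3.5"), file.path(data.path, "trial_1.txt"))
writeLines(c("4", "5"),            file.path(data.path, "trial_2.txt"))

datafiles <- list.files(data.path, pattern = "^trial_[0-9]+\\.txt$",
                        full.names = TRUE)

# One numeric vector per file, then pad the short ones with NA.
trials  <- lapply(datafiles, scan, quiet = TRUE)
longest <- max(lengths(trials))
padded  <- lapply(trials, function(v) c(v, rep(NA, longest - length(v))))

wide <- as.data.frame(do.call(cbind, padded))
names(wide) <- c("Expt1", "Cntl1")   # custom column headers
wide
```

Note that list.files() sorts names alphabetically, so with many files "trial_10" comes before "trial_2"; the column names would have to be assigned with that order in mind.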
[R] Finding ties in data (be)for(e) BradleyTerry
Dear all,

I have carried out a pairwise comparison study that I want to analyze using the BradleyTerry package to establish a rank order of my stimuli. However, BT does not handle ties between stimuli, so I need to find those in my data before I can use that model.

The code below goes from the format of my result file(s) to a data frame suitable for BT, but as you can see, there are some ties. I need to find the rows with identical (but swapped) winner and loser and with the same frequency. How can I accomplish that the R way (not looping through the entire thing; in reality, I have approx. 40 stimuli with around 180 observations from 40-odd subjects)?

    # $lp was left picture; $rp, right one; $wr was the winner/chosen one by subject.
    dat <- data.frame(subjno=gl(4,3),
                      lp=factor(c(1,3,2,2,3,1,3,1,2,1,2,3), labels=c('a','b','c')),
                      rp=factor(c(2,1,3,3,1,2,1,2,3,3,1,2), labels=c('a', 'b', 'c')),
                      wr=factor(c(1,1,2,1,2,2,1,1,2,2,1,2), labels=c('lp', 'rp')))
    dat.lp <- subset(dat, wr=='lp')
    dat.rp <- subset(dat, wr=='rp')
    names(dat.rp)[c(2,3)] <- c('loser', 'winner')
    names(dat.lp)[c(2,3)] <- c('winner', 'loser')
    (dat <- with(merge(dat.lp, dat.rp, all=TRUE), data.frame(table(winner,loser))))

Thank you for your help!
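One loop-free way to spot such ties is to compare the win table with its own transpose. The sketch below reuses the example data from the post; the names `wins`, `tied`, and `ties` are made up here, and this is only one possible approach, not something taken from the BradleyTerry package.

```r
dat <- data.frame(
  subjno = gl(4, 3),
  lp = factor(c(1,3,2,2,3,1,3,1,2,1,2,3), labels = c("a", "b", "c")),
  rp = factor(c(2,1,3,3,1,2,1,2,3,3,1,2), labels = c("a", "b", "c")),
  wr = factor(c(1,1,2,1,2,2,1,1,2,2,1,2), labels = c("lp", "rp"))
)

lev    <- levels(dat$lp)
winner <- ifelse(dat$wr == "lp", as.character(dat$lp), as.character(dat$rp))
loser  <- ifelse(dat$wr == "lp", as.character(dat$rp), as.character(dat$lp))

# wins[i, j] = number of times stimulus i beat stimulus j
wins <- table(winner = factor(winner, lev), loser = factor(loser, lev))

# A tie: i beat j exactly as often as j beat i (and they met at all).
# upper.tri() keeps each pair once.
tied <- wins == t(wins) & wins > 0 & upper.tri(wins)
ties <- which(tied, arr.ind = TRUE)
ties
```

For the example data this flags the pairs a/b and b/c, each with two wins in both directions.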
Re: [R] Reading multiple files into R
Hi Vikas,

On Friday 1 October 2004 10:50 +0530, Vikas Rawal wrote:
> I want to read data from a number of files into R. Reading individual files one by one requires writing enormous amount of code that will look something like the following.
>
> Is there a better way of doing this?

These days I'm using the code below to read in each data file I have and come out with a single data frame.

    # Concatenate the raw data files.
    (datafiles <- list.files(path="../raw data/", pattern="pp.+\\.dat$"))
    tst <- do.call('rbind', lapply(datafiles, function(x) read.table(paste('../raw data/', x, sep=""), skip=1)))
    rm(datafiles)

--
Paul Lemmens
NICI, University of Nijmegen             ASCII Ribbon Campaign /\
Montessorilaan 3 (B.01.05)                   Against HTML Mail \ /
NL-6525 HR Nijmegen                                             X
The Netherlands                                                / \
Phonenumber +31-24-3612648
Fax +31-24-3616066
Re: [R] Conditionally swap two columns of a data.frame?
Hi Dan,

On Thursday 16 September 2004 13:55 +0100, Dan Bolser wrote:
> Is there an R cookbook?

Yes there is (sort of): StatsRus http://www.ku.edu/~pauljohn/R/Rtips.html

kind regards,
Paul
[R] library(car) Anova() and Error-term in aov()
Dear all,

Type III SS time again. In this case I am trying to reproduce some SPSS (type III) results in R for a repeated measures anova with a between-subjects factor included. As I understand from this list etc., if I want type III then I can do

    library(car)
    Anova(lm.obj, type="III")

But for the repeated measures anova, I need to include an Error term in the aov() call (Psychology guide from Jonathan Baron), which results in multiple lm() calls. Anova() does not seem capable of handling this situation. Or am I tackling the Type III calculation, in this case with Error(), the wrong way (besides ignoring advice concerning Type I vs III)?

For instance,

    dat <- rnorm(12)
    pp <- factor(c(rep(1:3,2), rep(4:6,2)))
    betw <- gl(2,6)
    A <- factor(rep(c(rep('a',3),rep('b',3)), 2))
    taov <- aov(dat~betw*A+Error(pp/A))
    Anova(taov, type="III") # Goes wrong with following error.
    #Error in Anova(taov, type = "III") : no applicable method for "Anova"

Phrased differently, ?Anova says "Calculates type-II or type-III analysis-of-variance tables for model objects produced by 'lm' and 'glm'", so it's not suitable for the aovlist that aov() with an Error() term returns. How can I compute Type III SS for such objects?

kind regards,
Paul
Re: [R] basic questions: any place for them
Hi Tiago,

On Tuesday 3 August 2004 18:50 +0100, Tiago R Magalhaes wrote:
> Once again I'm sorry for these basic questions and since predictably I'll have more of those if there's a basic-questions-list I would love to know more about it

Recently we discussed, on this list, that several online communities have dedicated discussion groups for R. One of them is Orkut.com. The name of the other one slips my mind, but if you search the archives for my name and orkut, you'll probably find those emails quickly.

regards,
Paul
Re: [R] How to select a whole column? Thanks!
Hi Jinsong,

On Tuesday 3 August 2004 1:42 -0700, Jinsong Zhao wrote:
> For instance, I hope to remove the V3~6 columns, for all the values in those columns are zero.
>
>       V3 V4 V5 V6     V7     V8     V9    V10
>     1  0  0  0  0  0.000  0.000  0.000  0.000
>     2  0  0  0  0  0.000  0.000  0.000  0.000
>     3  0  0  0  0  0.000  0.000  0.000  0.000
>     4  0  0  0  0  0.000  0.000  0.000  0.000
>     5  0  0  0  0  0.000  0.000  0.000  0.000
>     6  0  0  0  0 -0.001 -0.001 -0.001 -0.001
>     7  0  0  0  0  0.000  0.000  0.000 -0.001
>     8  0  0  0  0  0.000  0.000  0.000 -0.001
>     9  0  0  0  0 -0.009 -0.012 -0.015 -0.018
>
> I mean how to select the first four columns.

    subset(df, select=-c(V3,V4,V5,V6))
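If the all-zero columns are not known in advance, they can also be found programmatically. A small sketch with a cut-down, made-up version of the table above (the names `all_zero` and `kept` are just for the example):

```r
df <- data.frame(V3 = 0, V4 = 0,
                 V7 = c(0, -0.001, -0.009),
                 V8 = c(0, -0.001, -0.012))

# TRUE for every column whose values are all exactly zero.
all_zero <- vapply(df, function(col) all(col == 0), logical(1))

kept <- df[, !all_zero, drop = FALSE]
kept
```

The drop = FALSE keeps the result a data frame even when only one column survives.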
Re: [R] How to sort TWO columns ?
Hi Jacques,

On Wednesday 21 July 2004 10:35 +0400, Jacques VESLOT wrote:
> Could somebody please let me know how to sort two columns of a data frame, with priority to one of them, just like in Access?

Would this http://www.ku.edu/~pauljohn/R/Rtips.html#2.12 help?

kind regards,
Paul
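For reference, the usual base R idiom for this is order() with several keys, where earlier arguments take priority. A minimal sketch with made-up data:

```r
d <- data.frame(g = c("b", "a", "b", "a"), v = c(2, 9, 1, 3))

# Sort by g first; ties in g are broken by v.
sorted <- d[order(d$g, d$v), ]
sorted
```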
Re: [R] subset(..., drop=TRUE) doesn't seem to work.
Hi Peter,

On Wednesday 16 June 2004 17:35 +0200, Peter Dalgaard wrote:
> Anyways, the way out is
>
>     d2 <- subset(dd,c==1)
>     ifac <- sapply(dd,is.factor)
>     d2[ifac] <- lapply(d2[ifac],factor)
>
> or
>
>     d2 <- subset(dd,c==1)
>     d2[] <- lapply(d2, function(x) if (is.factor(x)) factor(x) else x)

My toolbox.r now contains:

    my.subset <- function(x, drop.unused.levels=FALSE, ...) {
        subsetted <- subset(x, ...)
        if (drop.unused.levels) {
            subsetted[] <- lapply(subsetted, function(x) if (is.factor(x)) factor(x) else x)
        }
        subsetted
    }

Thanks for all the help!
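In later versions of R the same effect is available through droplevels(), which did not exist at the time of this thread. A short sketch, assuming a reasonably recent R, using the example data from the thread:

```r
dd <- data.frame(rt = rnorm(10), c = gl(2, 5))

d2 <- subset(dd, c == 1)
levels_before <- levels(d2$c)   # the unused level "2" is still there

d2 <- droplevels(d2)            # re-factors every factor column
levels_after <- levels(d2$c)

table(d2$c)
```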
Re: [R] subset and lme
Hi Todd,

On Wednesday 16 June 2004 10:18 -0400, Todd Ogden wrote:
> I'm puzzled by the following problem, which appears when attempting to run an analysis on part of a dataset: If I try:
>
>     csubset <- dat$Diagnosis==0

This just creates a logical vector that indicates the positions (row numbers) in dat for which Diagnosis==0.

>     cdat <- dat[dat$Diagnosis==0,]

This, OTOH, uses the above vector to index the rows of dat, selecting those rows from dat that have Diagnosis==0. The result is assigned to cdat. You could have done

    cdat <- dat[csubset,]

as well.

HTH
[R] subset(..., drop=TRUE) doesn't seem to work.
Hello!

If I read ?subset, the workings of the argument drop (to me) seem to imply equivalence of A and B (R 1.9.0):

    # A
    dd <- data.frame(rt=rnorm(10), c=factor(gl(2,5)))
    dd <- subset(dd, c==1)
    dd$c <- dd$c[, drop=TRUE]
    table(dd$c)

    1
    5

    # B
    dd <- data.frame(rt=rnorm(10), c=factor(gl(2,5)))
    dd <- subset(dd, c==1, drop=TRUE)
    table(dd$c)

    1 2
    5 0

So to lose the second level of dd$c, in method B I still need to do 'dd$c <- dd$c[, drop=TRUE]', while the manual seems to imply that with the drop argument to subset() this would not be necessary. Could you comment?

kind regards,
Paul
Re: [R] subset(..., drop=TRUE) doesn't seem to work.
Dear Peter,

On Wednesday 16 June 2004 17:06 +0200, Peter Dalgaard wrote:
> Paul Lemmens writes:
>> Hello! If I read ?subset, the workings of the argument drop (to me) seem to imply equivalence of A and B (R 1.9.0):
>>
>>     # A
>>     dd <- data.frame(rt=rnorm(10), c=factor(gl(2,5)))
>>     dd <- subset(dd, c==1)
>>     dd$c <- dd$c[, drop=TRUE]
>>     table(dd$c)
>>
>>     1
>>     5
>>
>>     # B
>>     dd <- data.frame(rt=rnorm(10), c=factor(gl(2,5)))
>>     dd <- subset(dd, c==1, drop=TRUE)
>>     table(dd$c)
>>
>>     1 2
>>     5 0
>>
>> So to lose the second level of dd$c, in method B I still need to do 'dd$c <- dd$c[, drop=TRUE]', while the manual seems to imply that with the drop argument to subset() this would not be necessary. Could you comment?
>
> Looks like a documentation bug. The actual code ends up doing
>
>     x[r, vars, drop = drop]
>
> and [.data.frame will not drop factor levels. I wonder if it ever did...

Bottom line: unless I find the time to submit a patch for '[.data.frame', I'll need to use the more elaborate way of dropping the unused levels? Does "will not drop" imply that it cannot be programmed, should not be programmed, or has not been programmed yet?

kind regards,
Paul
Re: [R] Informal discussion group about R
Hi Harold,

On Thursday 10 June 2004 7:55 -0700, Baize, Harold wrote:
> I've started a tribe for discussing R and sharing scripts. Tribe.net is one of the popular [...] I hope it will be of help to newbies, although I'm new to R myself. Here is the url: http://r-statisticalenvironment.tribe.net

Recently I discovered that there's also a community on Orkut called R-project http://www.orkut.com/Community.aspx?cmm=11845, with similar entry-level discussions on problem solving with/in R. The same caveat holds: registration with Orkut is obligatory. Perhaps these communities can/will function as the low-entry-level help that was discussed around a year ago.

kind regards,
Paul
Re: [R] Accessing data
Hi TEMPL,

On Monday 17 May 2004 10:46 +0200, TEMPL Matthias wrote:
> Hello,
>
> I would like to access my data frame without one variable. E.g.:
>
>     colnames(x)
>     [1] Besch Ang.m Arb.m i10Umsatz arbstd
>
> I can try x[,-1], but this variable must be called by its name.
>
>     x[,-Besch]
>     x[,!Besch]
>     attach(x)
>     x[-Besch]
>     ...

Try

    x2 <- subset(x, select=-Besch)

kind regards,
Paul
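When the column name is held in a character variable rather than typed literally, subset() is less convenient and plain name matching tends to be easier. A sketch with a made-up data frame that borrows three of the column names above:

```r
x <- data.frame(Besch = 1:3, Ang.m = 4:6, Umsatz = 7:9)

# Name typed literally, as in the reply above.
x2 <- subset(x, select = -Besch)

# Name as a string: keep everything except the listed names.
x3 <- x[, setdiff(names(x), "Besch"), drop = FALSE]

# Same idea with the name stored in a (hypothetical) variable.
drop_me <- "Besch"
x4 <- x[, !(names(x) %in% drop_me), drop = FALSE]
```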
[R] Sum Sq of SPSS and R different for repeated measures Anova
Dear all,

I'm still learning and transitioning from SPSS to R (1.9.0, WinXP), and today I have data from two repeated measures experiments. For each of the subjects I've averaged over two within-subjects factors (2 x 2, 4 means per subject). One experiment had 16 subjects, the other one 25 (between-subjects factor exp). So I have something like:

    avg.cond <- read.table('data.txt') # data set attached as text.
    avg.cond[1:5,]
    #  pp pictcat cond       rt    exp
    #1  1  animal  con 517.8125 exp11b
    #2  2  animal  con 425.9375 exp11b
    #3  3  animal  con 379.6563 exp11b
    #4  4  animal  con 410.6563 exp11b
    #5  5  animal  con 420.3125 exp11b

Then I do the anova:

    summary(taov <- aov(rt~exp*pictcat*cond+Error(pp/(pictcat*cond)),
                        data=subset(avg.cond, cond=='con'|cond=='incon')))
    #
    #Error: pp
    #          Df Sum Sq Mean Sq F value Pr(>F)
    #exp        1    178     178  0.0102 0.9202
    #Residuals 39 683329   17521
    #
    #Error: pp:pictcat
    #            Df Sum Sq Mean Sq F value   Pr(>F)
    #pictcat      1  62382   62382  87.334 1.68e-11 ***
    #exp:pictcat  1    653     653   0.914   0.3449
    #Residuals   39  27858     714
    #---
    #Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
    #
    #Error: pp:cond
    #          Df  Sum Sq Mean Sq F value   Pr(>F)
    #cond       1  6550.2  6550.2  9.3057 0.004095 **
    #exp:cond   1   206.3   206.3  0.2931 0.591313
    #Residuals 39 27451.9   703.9
    #---
    #Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
    #
    #Error: pp:pictcat:cond
    #                 Df  Sum Sq Mean Sq F value Pr(>F)
    #pictcat:cond      1   517.4   517.4  1.8747 0.1788
    #exp:pictcat:cond  1    67.3    67.3  0.2439 0.6241
    #Residuals        39 10763.2   276.0

Now I want to verify these results against how I used to do it in SPSS (from its menu): Analyze > General Linear Model > Repeated Measures (with the data set suitably rearranged to conform to SPSS's format). Comparing the Tests of Within-Subject Effects (a table I cannot reproduce in this medium) with the above results, I find identical results except for the main effects of pictcat and cond and their interaction.

The sums of squares from R are slightly larger (in itself not a bad thing, because it results in larger F-ratios) than those of SPSS (62141, 5747, and 416 for pictcat, cond, and pictcat:cond, respectively). The funny thing is that the sums of squares are spot on for the interactions with the between-subjects factor exp. The data for both analyses are exactly the same.

I've read about R using type I SS, but if I change the type III default of SPSS into type I, the results get more different instead of more comparable. I've read about drop1, but

    drop1(taov, test="F")
    #Error in extractAIC(object, scale, k = k, ...) :
    #no applicable method for "extractAIC"

Also

    library(car)
    Anova(taov, type="III")
    #Error in Anova(taov, type = "III") : no applicable method for "Anova"

which I found in the archives, doesn't work.

As the data for R and SPSS are identical, and because the interactions with exp are identical between R and SPSS, there is something else going on (if it were the type I/III difference, then the exp interactions would have been different as well, right?). I cannot deduce the reason for the difference. Can anybody help with this?

thank you for your time,
regards,
Paul Lemmens

       pp pictcat cond       rt    exp
    1   1  animal  con 517.8125 exp11b
    2   2  animal  con 425.9375 exp11b
    3   3  animal  con 379.6563 exp11b
    4   4  animal  con 410.6563 exp11b
    5   5  animal  con 420.3125 exp11b
    6   6  animal  con 381.1613 exp11b
    7   7  animal  con 423.1875 exp11b
    8   8  animal  con 397.8387 exp11b
    9   9  animal  con 404.5714 exp11b
    10 10  animal  con 481.5000 exp11b
    11 11  animal  con 364.6875 exp11b
    12 12  animal  con 397.0938 exp11b
    13 13  animal  con 375.4375 exp11b
    14 14  animal  con 345.5313 exp11b
    15 15  animal  con 378.3548 exp11b
    16 16  animal  con 476.5161 exp11b
    17 17  animal  con 455.3750 exp11b
    18 18  animal  con 374.3125 exp11b
    19 19  animal  con 387.6250 exp11b
    20 20  animal  con 355.8750 exp11b
    21 21  animal  con 321.7097 exp11b
    22 22  animal  con 452.7813 exp11b
    23 23  animal  con 373.1290 exp11b
    24 24  animal  con 438.4063 exp11b
    25 25  animal  con 456.8125 exp11b
    26  1   other  con 470.4375 exp11b
    27  2   other  con 463.0938 exp11b
    28  3   other  con 463.1563 exp11b
    29  4   other  con 490.9063 exp11b
    30  5   other  con 467.4375 exp11b
    31  6   other  con 428.8125 exp11b
    32  7   other  con 459.2813 exp11b
    33  8   other  con 441.9375 exp11b
    34  9   other  con 397.2903 exp11b
    35 10   other  con 467.2500 exp11b
    36 11   other  con 431.3548 exp11b
    37 12   other  con 460.5625 exp11b
    38 13   other  con 401.8438 exp11b
    39 14   other  con 417.8438 exp11b
    40 15   other  con 414.6875 exp11b
    41 16   other  con 489.9375 exp11b
    42 17   other  con 485.1563 exp11b
    43 18   other  con 367.6250 exp11b
    44 19   other  con
[R] Accessing columns in data.frame using formula
Hello!

I'm trying the hard way to use a formula, in a function, to specify the names of several important columns in a data.frame. Maybe I'm just battling to figure out the right search terms :-( This is on XP, R 1.8.1.

So, for instance,

    wery[1:5,]
      V1 V2   V3  V4   V5 congr V7 V8 V9 ok  RT
    1  1  1  960 520 1483     c  1  r  r  1 760
    2  1  2 1060 450 3753     c  1  r  r  1 555
    3  1  3  980 470 5758     c  2  l  l  1 432
    4  1  4 1060 440 7693     c  1  r  r  1 424
    5  1  5 1020 440 9578     i  1  l  l  1 369

I already figured out how to get to the parts of the formula,

    tst <- function(f=RT~congr+ok, data=wery) {
        thingy <- all.vars(f)
        resp <- thingy[1]
        facts <- thingy[-1]

        # and how to get data from the data.frame.
        eval(parse(text=resp), env=data)

        # But now, I would like to do here what I'd do on the console as
        # wery$ok <- factor(wery$ok), so here
        data$facts[2] <- factor(data$facts[2]) # This won't work here. How do I continue?

        # Or perhaps also
        # data.tmp <- data$resp[data$facts[1] == 'i']
    }

thank you,
Paul Lemmens

P.S:

    str(wery)
    `data.frame': 150 obs. of 11 variables:
     $ V1   : int 1 1 1 1 1 1 1 1 1 1 ...
     $ V2   : int 1 2 3 4 5 6 7 8 9 10 ...
     $ V3   : int 960 1060 980 1060 1020 1010 1060 1010 1090 1090 ...
     $ V4   : int 520 450 470 440 440 530 580 530 560 540 ...
     $ V5   : int 1483 3753 5758 7693 9578 11488 13423 15368 17548 19678 ...
     $ congr: Factor w/ 2 levels "c","i": 1 1 1 1 2 2 2 1 1 2 ...
     $ V7   : int 1 1 2 1 1 1 1 2 2 2 ...
     $ V8   : Factor w/ 2 levels "l","r": 2 2 1 2 1 1 1 1 1 2 ...
     $ V9   : Factor w/ 2 levels "l","r": 2 2 1 2 1 2 1 1 1 2 ...
     $ ok   : int 1 1 1 1 1 0 1 1 1 1 ...
     $ RT   : int 760 555 432 424 369 291 403 526 500 458 ...
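The `$` operator takes its argument literally, so `data$facts[2]` looks for a column called "facts". Indexing with `[[` and a character string is the usual way around that. A minimal sketch with a made-up three-row stand-in for wery (the return value as a list is also just for illustration):

```r
wery <- data.frame(congr = c("c", "c", "i"),
                   ok    = c(1L, 1L, 0L),
                   RT    = c(760L, 555L, 432L))

tst <- function(f = RT ~ congr + ok, data) {
  vars  <- all.vars(f)   # "RT" "congr" "ok"
  resp  <- vars[1]
  facts <- vars[-1]

  # Equivalent of wery$ok <- factor(wery$ok), with the name in a variable.
  data[[facts[2]]] <- factor(data[[facts[2]]])

  # Equivalent of wery$RT[wery$congr == 'i'].
  rt_i <- data[[resp]][data[[facts[1]]] == "i"]

  list(data = data, rt_i = rt_i)
}

res <- tst(data = wery)
str(res$data)
```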
Re: [R] Doubt about pattern
Hi Marcelo,

On Thursday 29 January 2004 11:33 -0300, Marcelo Luiz de Laia wrote:
>     files <- dir(pattern="*.sens")
>
> but it includes all of the files that have sens, independent of whether they are at the end or in the middle of the name of the file.

That's because your pattern is a regular expression and not a Windows/DOS wildcard. You'll need something like

    files <- dir(pattern="\\.sens$")

\\. matches the dot itself (without the backslashes it's a wildcard for any character) and the dollar sign $ matches the end of the filename. So this way you'll get every file that has 'sens' as its extension.

regards,
Paul
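The difference is easy to try out on a plain character vector instead of a directory. In the sketch below the file names are invented; glob2rx() from the utils package translates a wildcard into the corresponding regular expression.

```r
files <- c("run1.sens", "sens_notes.txt", "a.sens.bak", "xsens")

# Anchored regular expression: only names ending in ".sens".
hits_regex <- grep("\\.sens$", files, value = TRUE)

# Same thing, starting from the DOS-style wildcard.
glob2rx("*.sens")
hits_glob <- grep(glob2rx("*.sens"), files, value = TRUE)

# Unanchored pattern: matches "sens" anywhere in the name.
loose <- grep("sens", files, value = TRUE)
```

The same patterns can be passed to dir(pattern = ...).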
Re: [R] Problem with parser and if/else
Dear Peter,

On Thursday 13 November 2003 15:05 +0100, Peter Dalgaard wrote:
> Brown, Simon writes:
>> Dear r-help people,
>>
>> could you confirm that this is correct behaviour for R? I am using RH9.
>>
>> the code:
>>
>>     x1 <- 1:6
>>     t1 <- 5
>>     if (length(x1) >= t1) {
>>         cat("in the if \n")
>>     } else {
>>         cat("in the else\n")
>>     }
>>
>> runs fine:
>>
>>     source("test_if_else.R")
>>     in the if
>>
>> but the code:
>>
>>     x1 <- 1:6
>>     t1 <- 5
>>     if (length(x1) >= t1) {
>>         cat("in the if 2\n")
>>     }
>>     else {
>>         cat("in the else\n")
>>     }
>>
>> fails with the error:
>>
>>     source("test_if_else2.R")
>>     Error in parse(file, n, text, prompt) : syntax error on line 6
>>
>> Could someone explain this to me please?
>
> Again? This has been hashed over several times before. The basic issue is whether a statement can be assumed to be syntactically complete at the end of a line. It is fairly obvious what happens when you type the same expressions at an interactive prompt:
>
>     > x1 <- 1:6
>     > t1 <- 5
>     > if (length(x1) >= t1) {
>     +     cat("in the if 2\n")
>     + }
>     in the if 2
>     > else {
>     Error: syntax error
>     > cat("in the else\n")
>     in the else
>     > }
>     Error: syntax error
>
> Notice that the first right curly brace is seen as terminating the if construct. Otherwise, R would need to wait and check whether the *next* line starts with an else, which would certainly be confusing in interactive use. So R assumes the expression is finished and evaluates it. Then it gets an else keyword that it doesn't know what to do with and barfs.

I'm trying to grasp this: are you saying that the only way to have if() know that an else will be present is to put it on the same line as the closing curly brace of the if() (compound) statement? But if I look at some code from, e.g., aov and lm, I see plenty of violations of that rule.

regards,
Paul
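A likely explanation for the code in aov and lm: inside a function body, or any other pair of braces, the parser is still in the middle of an expression when it reaches the newline, so it keeps reading and a following else is accepted. The sketch below checks both cases with parse(); the helper `parses` and the variable names are made up for the illustration.

```r
# An if statement with 'else' starting its own line.
top_level <- "if (TRUE) {\n  1\n}\nelse {\n  2\n}"

# The very same text, wrapped in an enclosing pair of braces.
inside_braces <- paste0("{\n", top_level, "\n}")

# TRUE if the text parses without a syntax error.
parses <- function(txt) {
  !inherits(try(parse(text = txt), silent = TRUE), "try-error")
}

ok_top    <- parses(top_level)      # top level: the if is complete after '}'
ok_inside <- parses(inside_braces)  # inside braces: parser keeps reading
c(top_level = ok_top, inside_braces = ok_inside)
```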
[R] Cbind warning message
Hello!

I'm not grasping why cbind (in the code below) warns that

    Warning message:
    number of rows of result is not a multiple of vector length (arg 2) in: cbind(z, p)

when I do

    sections <- function(length, parts) {
        p <- 1:parts
        q <- length %/% parts
        z <- array(p, dim=c(parts,q))
        r <- length %% parts
        if ( r > 0 ) {
            p[r+1:length(p)] <- NA
            z <- cbind(z,p)
        }
        z <- na.omit(as.vector(t(z)))
    }

and then

    sections(32,5) -> a

As I see it, the number of rows in the result is 5, and the vector length of p (which is 5) is a multiple of 5.

kind regards,
Paul
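A probable cause is operator precedence: `:` binds tighter than `+`, so `r+1:length(p)` means `r + (1:length(p))` and the assignment extends p beyond its original length. The sketch below replays that step with the values from sections(32, 5); the names `idx_actual`, `idx_intended`, `p_bad`, and `p_ok` are invented for the illustration.

```r
p <- 1:5
r <- 2L                            # 32 %% 5

idx_actual   <- r + 1:length(p)    # 3 4 5 6 7
idx_intended <- (r + 1):length(p)  # 3 4 5

p_bad <- p
p_bad[idx_actual] <- NA            # assigning past the end grows the vector
length(p_bad)                      # 7, no longer a multiple of 5 rows

p_ok <- p
p_ok[idx_intended] <- NA
length(p_ok)                       # still 5
```

With the parentheses in place, cbind(z, p) receives a vector whose length matches the number of rows again.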
RE: is.na(v)<-b (was: Re: [R] Beginner's query - segmentation fault)
Hello Simon,

On Wednesday 15 October 2003 10:08 +0100, Simon Fear wrote:
> By the way, `is.na(x) <- FALSE` will leave x unchanged (including leaving it as NA! how bad is that?!)

Twilight Zone (Golden Earring). But with that remark I'm getting off topic, so thank you for your summary. I've already memorized the is.na() construct, so I should be safe for the time being.

kind regards,
Paul
Re: [R] Subseting in a 3D array
Hi Agustin,

On Wednesday 15 October 2003 18:47 +0200, Agustin Lobo wrote:
> Could anyone suggest the way of subsetting the 3D array to get a vector of z values for each position recorded in ib5km.lincol.random? (avoiding the use of for loops).

Is section 5.3 from the Introduction to R (p21) helpful?

regards,
Paul
RE: is.na(v)<-b (was: Re: [R] Beginner's query - segmentation fault)
By accident I'm also toying around with NAs, so I started reading up on this thread but failed to find a 'concluding' remark or advice. As a naive R user I would have loved to see a comment saying "do it like this". The prevailing opinion seemed to be that is.na() might be better (safer) but that x <- NA is much clearer to understand. Can I relatively safely use the easy form, or is it better to remember (the hard way) the safer version? Has the discussion continued privately or just stopped here?

Personally I still find the fragments below (taken from the thread) very counterintuitive, not to say scary.

    x <- 1:10
    is.na(x) <- 1:5

and

    is.na(x) <- FALSE

It's very hard to understand what happens (as a layman) because the assignment seems to reverse in meaning in the first example (actually taking indices 1:5 of x and assigning those the value NA), whereas in the second case it's not obvious what happens to x: will it get the value FALSE or will the original value remain (*)?

IMHO the <- NA construct is much easier to understand and should be made safe in all possible situations (whatever the underlying safety problem or other difficulties might be).

kind regards,
Paul

(*) Such a remark will probably lead to some kind of reprimand because it's probably somewhere within the 10e6 manual pages, but I'm trying my luck here.
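Both fragments can be tried directly. As far as I can tell, the value on the right-hand side of `is.na(x) <-` is used as an index into x, and the indexed elements are set to NA; nothing is ever set back to non-missing. A small sketch (the vector `y` is made up):

```r
x <- 1:10
is.na(x) <- 1:5      # the right-hand side says WHICH elements become NA
x

y <- c(1, NA, 3)
is.na(y) <- FALSE    # selects no elements: y is left exactly as it was,
y                    # including the NA it already had
```

So the first form behaves like x[1:5] <- NA, and the second is a no-op rather than a way of removing NAs.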
Unit of legend() coordinates (was: Re: [R] lines and legend)
Dear all,

I would like to make a suggestion for an improvement (IMHO) of ?legend. Please add a remark that the x,y positioning coordinates are in the units of the plot() itself, *not* in pixels or anything of the like. It would have saved me a lot of time figuring out why my legend just wouldn't appear if ?legend had told me to set the coordinates in the units of the x and y axes. I haven't found it anywhere on the list or in the manuals (but I may have missed something).

Now that I know this, it seems a logical choice. However, referring to a 'graphics device' seems to imply a coordinate system in pixels starting from the upper/lower left of the device (the usual choice for graphical manipulations in programming languages). This assumption is false, and IMHO it is nowhere contradicted in the FAQ/manuals.

regards,
Paul

On Tuesday 1 July 2003 11:23 -0700, Anna H. Pryor wrote:
> Yes, I am using plot and then lines. The legend is just not appearing. I am using the coordinates of the legend (150,4) which work on boxplot and plot. I have not looked at the output of par (I don't know how to) to see if they are in the region. I assumed if they worked for plot and boxplot they would also for lines.
>
> On Tuesday 01 July 2003 11:16, you wrote:
>> I assume that you are calling 'plot' and then 'lines'. Is the legend just not appearing? What are you using for the coordinates of the legend? Have you looked at the output from par to see if these values are within the plot region?
>>
>>> When I am trying to put a legend on a plot where I am using lines, R just ignores it. I can do it with boxplot or plot, but just not with lines. Am I doing something wrong? Maybe I am just making a mistake?
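The extent of the plot region in axis units can be read from par("usr") after the plot has been drawn, which makes it easy to check whether a legend position such as (150, 4) falls inside it. A small sketch with made-up data; the 5% offsets are an arbitrary choice.

```r
plot(1:10, (1:10)^2, type = "l")

# c(x1, x2, y1, y2): limits of the plot region in user (axis) units.
usr <- par("usr")
usr

# Is x = 150 inside the plotted x-range? (Here it is not.)
inside <- 150 >= usr[1] && 150 <= usr[2]

# Position the legend relative to the actual axis limits instead.
legend(x = usr[1] + 0.05 * diff(usr[1:2]),
       y = usr[4] - 0.05 * diff(usr[3:4]),
       legend = "squares", lty = 1)
```

Recent versions of legend() also accept keywords such as "topleft" for x, which sidesteps the coordinate question altogether.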
Re: [R] simple matrix question
Hi Christoph,

--On Wednesday 4 June 2003 15:21 +0200 Christoph Lehmann [EMAIL PROTECTED] wrote:

> what is the easiest way to get from
>
> x
>      x1 x2
> [1,]  2  3
> [2,]  3  2
> [3,]  1  3
> [4,]  1  4
>
> xbar1
>        x1 x2
> [1,] 1.75  3
> [2,] 1.75  3
> [3,] 1.75  3
> [4,] 1.75  3
>
> with the mean of the columns of x as values?

xbar1 <- tapply(as.vector(x), gl(2,4), mean);

--
Paul Lemmens
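A short sketch of one way to build the full matrix asked for, assuming x is the 4 x 2 matrix shown in the question. The tapply() call above returns the two column means; colMeans() gives the same values, and matrix() with byrow = TRUE repeats them down the rows. The name col_means is introduced here for illustration only.

```r
# The matrix from the question.
x <- cbind(x1 = c(2, 3, 1, 1), x2 = c(3, 2, 3, 4))

# Column means as a named vector (one value per column of x).
col_means <- colMeans(x)

# Repeat the means row by row so the result has the same shape as x.
xbar1 <- matrix(col_means, nrow = nrow(x), ncol = ncol(x), byrow = TRUE,
                dimnames = list(NULL, colnames(x)))
xbar1
```

byrow = TRUE matters here: filling column by column would alternate the two means within each column instead of keeping one mean per column.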
Re: [R] (no subject)
Hi Peter,

--On Wednesday 4 June 2003 0:16 +0200 Peter Dalgaard BSA [EMAIL PROTECTED] wrote:

> Gilda Garibotti [EMAIL PROTECTED] writes:
>
>> Hi, I would like to know if it is possible to get printed output while a loop is taking place. Example:
>>
>> for(i in 1:10){
>>   print(i)
>>   some long process
>> }
>>
>> This will print the values of i only after the loop is finished, what I would like is to see them when the process enters the i-th iteration to keep track of how the program is running.
>
> Windows, right? (This is system dependent) There's a menu item entitled Buffer output or something to that effect. Turn it off and print() calls display immediately. Lengthy output becomes slower, though.

If you don't want to depend on you (or other people) turning off the buffering, use something like cat("this or that"); flush.console().

regards,
Paul

--
Paul Lemmens
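A minimal sketch of the cat() plus flush.console() suggestion, assuming a placeholder for the long process (the function run_with_progress() and its work argument are hypothetical names used only for this illustration).

```r
# Hypothetical wrapper: report each iteration before doing the slow work.
run_with_progress <- function(n, work = function(i) Sys.sleep(0.01)) {
  for (i in seq_len(n)) {
    cat("Starting iteration", i, "of", n, "\n")
    flush.console()  # push buffered output to the console now (a no-op where output is unbuffered)
    work(i)          # stand-in for the long process
  }
  invisible(n)
}

run_with_progress(3)
```

Note the parentheses: flush.console without them only refers to the function and does not call it.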
Re: [R] Numbers that look equal, should be equal, but if() doesn't see as equal (repost with code included)
Hi Thomas,

--On Wednesday 28 May 2003 7:16 -0700 Thomas Lumley [EMAIL PROTECTED] wrote:

> On Wed, 28 May 2003, Paul Lemmens wrote:
>
>> Hi! Apologies for sending the mail without any code. Apparently somewhere along the way the .R attachments got filtered out. I have included the code below as clean as possible. My original mail is below the code.
>
> I still think you should not be using ==. You want something like
>
> if ( abs(mean.b-mean.orig)/(epsilon+abs(mean.orig)) > epsilon ){
>
> You are effectively using epsilon=0, but epsilon=10e-10 should be adequate.

Based on all the hints and explanations I've changed the test to 'identical(all.equal(mean.b, mean.orig, tolerance=.Machine$double.eps), FALSE)'. I still need to look into the concept of finite precision, because I still don't grasp how sometimes (as an extreme example, probably) 0.25 != 1/4. That this will happen for a number with a lot of different decimals I can understand (by an accumulation of rounding errors).

Thanks all for your help!

--
Paul Lemmens
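A small sketch of tolerance-based comparison, offered as an illustration and not as the poster's final code. Two points are worth checking by hand. First, all.equal() returns either TRUE or a character description of the difference, never FALSE, so wrapping it in isTRUE() is the usual idiom. Second, 0.25 and 1/4 are both exactly representable in binary floating point; the classic failing case is a sum such as 0.1 + 0.2. The helper nearly_equal() and its default tolerance are assumptions made for this example.

```r
# Exactly representable values compare equal; most decimal fractions do not.
0.25 == 1/4
0.1 + 0.2 == 0.3

# all.equal() reports a difference as text instead of returning FALSE.
diff_report <- all.equal(1, 1.1)
is.character(diff_report)

# Hypothetical helper: TRUE when a and b agree within a relative tolerance.
nearly_equal <- function(a, b, tol = sqrt(.Machine$double.eps)) {
  isTRUE(all.equal(a, b, tolerance = tol))
}

nearly_equal(0.1 + 0.2, 0.3)
nearly_equal(1, 1.1)
```

A tolerance of .Machine$double.eps is very strict: it allows roughly one unit of rounding error, whereas a mean computed over many replicated values can accumulate more than that.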
[R] Numbers that look equal, should be equal, but if() doesn't see as equal (repost with code included)
Hi!

Apologies for sending the mail without any code. Apparently somewhere along the way the .R attachments got filtered out. I have included the code below as clean as possible. My original mail is below the code.

Thank you again for your time.

regards,
Paul

    vincentize <- function(data, bins) {
      if ( length(data) < 2 ) {
        stop("The data is really short. Is that ok?");
      }
      if ( bins < 2 ) {
        stop("A number of bins smaller than 2 just really isn't useful");
      }
      if ( bins > length(data) ) {
        stop("This is really unusual, although perhaps possible. If you really know what you're doing, maybe you should disable this check!?.");
      }
      ret <- c();
      for ( i in 1:length(data)) {
        rt <- data[i];
        b <- 0;
        while ( b < bins ) {
          ret <- c(ret, rt);
          b <- b+1;
        }
      }
      ret;
    }

    binify <- function(data, bins, n) {
      if ( bins < 2 ) {
        stop("Number of bins is smaller than 2. Nothing to split, exiting.");
      }
      if ( length(data) < 2 ) {
        stop("The length of the data is really short. Is that ok?");
      }
      if ( bins * n != length(data) ) {
        stop("Cannot construct bins of equal length.");
      }
      t(array(data, c(n,bins)));
    }

    mean.bins <- function(data) {
      # For the vincentizing procedures in vincentize() and binify(),
      # it made sense to check the data array/vector/matrix. Here,
      # we now just need to check that data is a matrix.
      if ( !is.matrix(data) ) {
        stop("The data is not in matrix form.");
      }
      means <- c();
      bins <- dim(data)[1];
      for (i in 1:bins) {
        means <- c(means, mean(data[i,]));
      }
      # return a vector of means.
      means;
    }

    bins.factor <- function(data, bins) {
      if ( !is.data.frame(data) ) {
        stop("data is not a data frame.");
      }
      source('Ratcliff.r', local=TRUE);
      subject.bin.means <- c();
      attach(data);
      l <- levels(Cond);
      for ( i in 1:length(l) ) {
        cat("Calculating bins for factor level ", l[i], ".\n", sep="");
        flush.console();
        data <- RT[Cond == l[i]];
        data <- sort(data);
        n <- length(data);
        data.vincent <- vincentize(data,bins);
        data.vincent.bins <- binify(data.vincent, bins, n);
        bin.means <- mean.bins(data.vincent.bins);
        # FAILING TEST.
        mean.orig <- mean(data);
        mean.b <- mean(bin.means);
        if ( mean.b != mean.orig ) {
          #cat("mean.b\n", str(mean.b), "mean.orig\n", str(mean.orig)); flush.console();
          detach(data);
          stop("Something went wrong calculating the bins: means do not equal.");
        }
        subject.bin.means <- c(subject.bin.means, bin.means);
      }
      detach(data);
      if ( !length(subject.bin.means) == bins*length(l) ) {
        stop("Inappropriate number of means calculated.");
      } else {
        subject.bin.means
      }
    }

-- Forwarded Message --
Date: Tuesday 27 May 2003 14:53 +0200
From: Paul Lemmens [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: [R] Numbers that look equal, should be equal, but if() doesn't see as equal

Hi!

After a lot of testing and debugging I am at a loss to figure out what goes wrong in the following. I'm implementing the Vincentizing procedure that Ratcliff (1979) described. It's about calculating RT bins for any distribution of RT data. It boils down to rank ordering your data, replicating each data point as many times as you need bins and then splitting up the resulting distribution in equal bins.

The code that I've written is attached (and not included because it is considerable in length due to many comments). Ratcliff.r contains some basic functions and distribution.bins.r contains the problematic function bins.factor() (problem area marked with 'FAILING TEST'). The final attached file is the mock-up distribution I made.

The failing test is the check whether the mean of the mean RTs for each bin equals the mean of the original distribution. These are mathematically equivalent. Sometimes, however, the test fails. With the attached distribution most notably for 4, 7, 8, 9, and 13 bins. Since the means are mathematically equivalent IMHO it should not be an issue of this particular distribution. As a matter of fact, I also have tested some rnorm() distributions and my function also fails on those (albeit a little less often than with foobar.txt).
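The procedure described above (rank ordering, replicating each data point once per bin, splitting into equal bins) can be sketched compactly as follows. This is an illustration under assumptions, not the poster's implementation: vincent_bin_means() is a hypothetical helper, the reaction times are simulated with rnorm(), and rep() and rowMeans() stand in for the explicit loops.

```r
# Hypothetical helper: Vincentized bin means for a vector of reaction times.
vincent_bin_means <- function(rt, bins) {
  rt <- sort(rt)
  # Repeat every data point `bins` times, keeping the rank order.
  replicated <- rep(rt, each = bins)
  # One row per bin: row i holds the i-th block of length(rt) consecutive values.
  m <- matrix(replicated, nrow = bins, byrow = TRUE)
  rowMeans(m)
}

set.seed(1)
rt <- rnorm(25, mean = 500, sd = 50)  # simulated data, for illustration only
bin_means <- vincent_bin_means(rt, bins = 7)

# Mathematically zero; in floating point it may be a tiny nonzero number.
mean(bin_means) - mean(rt)

# Compare within a tolerance instead of testing exact equality.
isTRUE(all.equal(mean(bin_means), mean(rt)))
```

Because the two means are computed by summing different sequences of numbers, their rounding errors need not match, which is why an exact != test can fail even when the printed values look identical.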
Problem description: if one calculates the bins or bin means by hand, the mean of the bin means is visually the same as the overall mean, even with options(digits=20), but *still* the test fails. IMHO the cause is neither my code nor the distribution I use to test, but still, can you point out an obvious failure of my programming or is it indeed something of R that I don't yet grasp