You are reading the wrong part of the code for your argument list:

> foo["FileName"]
Error in `[.data.frame`(foo, "FileName") : undefined columns selected

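As a rough illustration of how the form of the call decides what happens to a mis-spelled name, here is a minimal sketch in R. It uses the toy data frame from the original post. The helper get_cols() is hypothetical (it is not part of base R or of any package mentioned in this thread) and is only one possible way to fail early. The matrix-style call foo[, "FileName"] is left out on purpose, since the poster reports NULL for it under R 2.5.1 and its result may depend on the R version.

```r
# Toy data frame from the original post: the real column is "Filename".
foo <- data.frame(Filename = c("a", "b"))

# Single-index (list-style) extraction stops on an unknown name.
msg <- tryCatch(foo["FileName"], error = function(e) conditionMessage(e))
print(msg)

# `$` and `[[` return NULL for an unknown name instead of stopping.
print(is.null(foo$FileName))
print(is.null(foo[["FileName"]]))

# Hypothetical helper (an assumption, not base R): check the requested
# names against names(df) and stop, naming the offenders, before extracting.
get_cols <- function(df, cols) {
  missing_cols <- setdiff(cols, names(df))
  if (length(missing_cols) > 0) {
    stop("undefined columns: ", paste(missing_cols, collapse = ", "))
  }
  df[, cols, drop = FALSE]
}

ok  <- get_cols(foo, "Filename")
bad <- tryCatch(get_cols(foo, "FileName"),
                error = function(e) conditionMessage(e))
print(bad)
```

Checking names up front keeps the failure next to the typo, rather than letting a NULL travel into a later call such as ReadAffy(filenames = ...).
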
[.data.frame is one of the most complex functions in R, and does many
different things depending on which arguments are supplied.

On Fri, 3 Aug 2007, Steven McKinney wrote:

> Hi all,
>
> What are current methods people use in R to identify
> mis-spelled column names when selecting columns
> from a data frame?
>
> Alice Johnson recently tackled this issue
> (see [BioC] posting below).
>
> Due to a mis-spelled column name ("FileName"
> instead of "Filename") which produced no warning,
> Alice spent a fair amount of time tracking down
> this bug.  With my fumbling fingers I'll be tracking
> down such a bug soon too.
>
> Is there any options() setting, or debug technique
> that will flag data frame column extractions that
> reference a non-existent column?  It seems to me
> that the "[.data.frame" extractor used to throw an
> error if given a mis-spelled variable name, and I
> still see lines of code in "[.data.frame" such as
>
> if (any(is.na(cols)))
>     stop("undefined columns selected")
>
> In R 2.5.1 a NULL is silently returned.
>
>> foo <- data.frame(Filename = c("a", "b"))
>> foo[, "FileName"]
> NULL
>
> Has something changed so that the code lines
> if (any(is.na(cols)))
>     stop("undefined columns selected")
> in "[.data.frame" no longer work properly (if
> I am understanding the intention properly)?
>
> If not, could "[.data.frame" check an
> options() variable setting (say
> warn.undefined.colnames) and throw a warning
> if a non-existent column name is referenced?
>
>> sessionInfo()
> R version 2.5.1 (2007-06-27)
> powerpc-apple-darwin8.9.1
>
> locale:
> en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
>
> attached base packages:
> [1] "stats"     "graphics"  "grDevices" "utils"     "datasets"  "methods"
> "base"
>
> other attached packages:
>      plotrix         lme4       Matrix      lattice
>      "2.2-3"  "0.99875-4" "0.999375-0"     "0.16-2"
>>
>
> Steven McKinney
>
> Statistician
> Molecular Oncology and Breast Cancer Program
> British Columbia Cancer Research Centre
>
> email: smckinney +at+ bccrc +dot+ ca
>
> tel: 604-675-8000 x7561
>
> BCCRC
> Molecular Oncology
> 675 West 10th Ave, Floor 4
> Vancouver B.C.
> V5Z 1L3
> Canada
>
>
> -----Original Message-----
> From: [EMAIL PROTECTED] on behalf of Johnstone, Alice
> Sent: Wed 8/1/2007 7:20 PM
> To: [EMAIL PROTECTED]
> Subject: Re: [BioC] read.phenoData vs read.AnnotatedDataFrame
>
> For interest sake, I have found out why I wasn't getting my expected
> results when using read.AnnotatedDataFrame
> Turns out the error was made in the ReadAffy command, where I specified
> the filenames to be read from my AnnotatedDataFrame object.  There was a
> typo error with a capital N ($FileName) rather than lowercase n
> ($Filename) as in my target file..whoops.  However this meant the
> filename argument was ignored without the error message(!) and instead
> of using the information in the AnnotatedDataFrame object (which
> included filenames, but not alphabetically) it read the .cel files in
> alphabetical order from the working directory - hence the wrong file was
> given the wrong label (given by the order of Annotated object) and my
> comparisons were confused without being obvious as to why or where.
> Our solution: specify that filename is as.character so assignment of
> file to target is correct(after correcting $Filename) now that using
> read.AnnotatedDataFrame rather than readphenoData.
>
> Data<-ReadAffy(filenames=as.character(pData(pd)$Filename),phenoData=pd)
>
> Hurrah!
>
> It may be beneficial to others, that if the filename argument isn't
> specified, that filenames are read from the phenoData object if included
> here.
>
> Thanks!
>
> -----Original Message-----
> From: Martin Morgan [mailto:[EMAIL PROTECTED]
> Sent: Thursday, 26 July 2007 11:49 a.m.
> To: Johnstone, Alice
> Cc: [EMAIL PROTECTED]
> Subject: Re: [BioC] read.phenoData vs read.AnnotatedDataFrame
>
> Hi Alice --
>
> "Johnstone, Alice" <[EMAIL PROTECTED]> writes:
>
>> Using R2.5.0 and Bioconductor I have been following code to analysis
>> Affymetrix expression data: 2 treatments vs control.  The original
>> code was run last year and used the read.phenoData command, however
>> with the newer version I get the error message Warning messages:
>> read.phenoData is deprecated, use read.AnnotatedDataFrame instead The
>> phenoData class is deprecated, use AnnotatedDataFrame (with
>> ExpressionSet) instead
>>
>> I use the read.AnnotatedDataFrame command, but when it comes to the
>> end of the analysis the comparison of the treatment to the controls
>> gets mixed up compared to what you get using the original
>> read.phenoData ie it looks like the 3 groups get labelled wrong and so
>> the comparisons are different (but they can still be matched up).
>> My questions are,
>> 1) do you need to set up your target file differently when using
>> read.AnnotatedDataFrame - what is the standard format?
>
> I can't quite tell where things are going wrong for you, so it would
> help if you can narrow down where the problem occurs.  I think
> read.AnnotatedDataFrame should be comparable to read.phenoData.  Does
>
>> pData(pd)
>
> look right?  What about
>
>> pData(Data)
>
> and
>
>> pData(eset.rma)
>
> ?  It's not important but pData(pd)$Target is the same as pd$Target.
> Since the analysis is on eset.rma, it probably makes sense to use the
> pData from there to construct your design matrix
>
>> targs<-factor(eset.rma$Target)
>> design<-model.matrix(~0+targs)
>> colnames(design)<-levels(targs)
>
> Does design look right?
>
>> I have three columns sample, filename and target.
>> 2) do you need to use a different model matrix to what I have?
>> 3) do you use a different command for making the contrasts?
>
> Depends on the question!  If you're performing the same analysis as last
> year, then the model matrix and contrasts have to be the same!
>
>> I have included my code below if that is of any assistance.
>> Many Thanks!
>> Alice
>>
>> ##Read data
>> pd<-read.AnnotatedDataFrame("targets.txt",header=T,row.name="sample")
>> Data<-ReadAffy(filenames=pData(pd)$FileName,phenoData=pd)
>> ##normalisation
>> eset.rma<-rma(Data)
>> ##analysis
>> targs<-factor(pData(pd)$Target)
>> design<-model.matrix(~0+targs)
>> colnames(design)<-levels(targs)
>> fit<-lmFit(eset.rma,design)
>> cont.wt<-makeContrasts("treatment1-control","treatment2-control",levels=design)
>> fit2<-contrasts.fit(fit,cont.wt)
>> fit2.eb<-eBayes(fit2)
>> testconts<-classifyTestsF(fit2.eb,p.value=0.01)
>> topTable(fit2.eb,coef=2,n=300)
>> topTable(fit2.eb,coef=1,n=300)
>>
>> [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> Bioconductor mailing list
>> [EMAIL PROTECTED]
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
> --
> Martin Morgan
> Bioconductor / Computational Biology
> http://bioconductor.org
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595