I've since seen your followup, so a more detailed explanation may help. The path through the code for your argument list does not go where you quoted, and there is a reason for it.
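A minimal sketch of the extraction forms discussed below, assuming the one-column data frame from the quoted example. The helpers signals_error() and strict_col() are hypothetical names introduced only for illustration; they are not part of base R. The matrix-style form foo[, "FileName"] is reported rather than asserted, since its behaviour depends on the R version in use.

```r
## Data frame mirroring the example in the quoted message.
foo <- data.frame(Filename = c("a", "b"))

## Hypothetical helper: TRUE if evaluating 'expr' signals an error.
signals_error <- function(expr) {
    inherits(tryCatch(expr, error = function(e) e), "error")
}

## List-style indexing returns a data frame, so an undefined
## column is an error.
signals_error(foo["FileName"])

## These forms select a single column and give NULL for an
## unknown name, with no warning.
is.null(foo[["FileName"]])
is.null(foo$FileName)

## Matrix-style indexing: the outcome depends on the R version
## (NULL in R 2.5.1, according to the report quoted below).
signals_error(foo[, "FileName"])

## Hypothetical strict accessor: fails loudly on a mis-spelled name.
strict_col <- function(df, name) {
    if (!name %in% names(df))
        stop("no column named '", name, "' in data frame")
    df[[name]]
}
```

Something like strict_col(pData(pd), "FileName") would then stop with an error instead of silently passing NULL on to the next function.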
Generally when you extract in R and ask for a non-existent index you get
NA or NULL as the result (and no warning), e.g.

> y <- list(x=1, y=2)
> y[["z"]]
NULL

Because data frames 'must' have (column) names, they are a partial
exception and when the result is a data frame you get an error if it would
contain undefined columns.

But in the case of foo[, "FileName"], the result is a single column and so
will not have a name: there seems no reason to be different from

> foo[["FileName"]]
NULL
> foo$FileName
NULL

which similarly select a single column.  At one time they were different
in R, for no documented reason.


On Fri, 3 Aug 2007, Prof Brian Ripley wrote:

> You are reading the wrong part of the code for your argument list:
>
> > foo["FileName"]
> Error in `[.data.frame`(foo, "FileName") : undefined columns selected
>
> [.data.frame is one of the most complex functions in R, and does many
> different things depending on which arguments are supplied.
>
>
> On Fri, 3 Aug 2007, Steven McKinney wrote:
>
>> Hi all,
>>
>> What are current methods people use in R to identify
>> mis-spelled column names when selecting columns
>> from a data frame?
>>
>> Alice Johnson recently tackled this issue
>> (see [BioC] posting below).
>>
>> Due to a mis-spelled column name ("FileName"
>> instead of "Filename") which produced no warning,
>> Alice spent a fair amount of time tracking down
>> this bug.  With my fumbling fingers I'll be tracking
>> down such a bug soon too.
>>
>> Is there any options() setting, or debug technique
>> that will flag data frame column extractions that
>> reference a non-existent column?  It seems to me
>> that the "[.data.frame" extractor used to throw an
>> error if given a mis-spelled variable name, and I
>> still see lines of code in "[.data.frame" such as
>>
>> if (any(is.na(cols)))
>>     stop("undefined columns selected")
>>
>>
>> In R 2.5.1 a NULL is silently returned.
>>
>> > foo <- data.frame(Filename = c("a", "b"))
>> > foo[, "FileName"]
>> NULL
>>
>> Has something changed so that the code lines
>> if (any(is.na(cols)))
>>     stop("undefined columns selected")
>> in "[.data.frame" no longer work properly (if
>> I am understanding the intention properly)?
>>
>> If not, could "[.data.frame" check an
>> options() variable setting (say
>> warn.undefined.colnames) and throw a warning
>> if a non-existent column name is referenced?
>>
>>
>> > sessionInfo()
>> R version 2.5.1 (2007-06-27)
>> powerpc-apple-darwin8.9.1
>>
>> locale:
>> en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
>>
>> attached base packages:
>> [1] "stats" "graphics" "grDevices" "utils" "datasets" "methods" "base"
>>
>> other attached packages:
>>     plotrix         lme4       Matrix      lattice
>>     "2.2-3"  "0.99875-4" "0.999375-0"     "0.16-2"
>>
>>
>> Steven McKinney
>>
>> Statistician
>> Molecular Oncology and Breast Cancer Program
>> British Columbia Cancer Research Centre
>>
>> email: smckinney +at+ bccrc +dot+ ca
>>
>> tel: 604-675-8000 x7561
>>
>> BCCRC
>> Molecular Oncology
>> 675 West 10th Ave, Floor 4
>> Vancouver B.C.
>> V5Z 1L3
>> Canada
>>
>>
>> -----Original Message-----
>> From: [EMAIL PROTECTED] on behalf of Johnstone, Alice
>> Sent: Wed 8/1/2007 7:20 PM
>> To: [EMAIL PROTECTED]
>> Subject: Re: [BioC] read.phenoData vs read.AnnotatedDataFrame
>>
>> For interest sake, I have found out why I wasn't getting my expected
>> results when using read.AnnotatedDataFrame
>> Turns out the error was made in the ReadAffy command, where I specified
>> the filenames to be read from my AnnotatedDataFrame object.  There was a
>> typo error with a capital N ($FileName) rather than lowercase n
>> ($Filename) as in my target file..whoops.  However this meant the
>> filename argument was ignored without the error message(!)
>> and instead of using the information in the AnnotatedDataFrame object
>> (which included filenames, but not alphabetically) it read the .cel
>> files in alphabetical order from the working directory - hence the wrong
>> file was given the wrong label (given by the order of Annotated object)
>> and my comparisons were confused without being obvious as to why or where.
>> Our solution: specify that filename is as.character so assignment of
>> file to target is correct(after correcting $Filename) now that using
>> read.AnnotatedDataFrame rather than readphenoData.
>>
>> Data<-ReadAffy(filenames=as.character(pData(pd)$Filename),phenoData=pd)
>>
>> Hurrah!
>>
>> It may be beneficial to others, that if the filename argument isn't
>> specified, that filenames are read from the phenoData object if included
>> here.
>>
>> Thanks!
>>
>> -----Original Message-----
>> From: Martin Morgan [mailto:[EMAIL PROTECTED]
>> Sent: Thursday, 26 July 2007 11:49 a.m.
>> To: Johnstone, Alice
>> Cc: [EMAIL PROTECTED]
>> Subject: Re: [BioC] read.phenoData vs read.AnnotatedDataFrame
>>
>> Hi Alice --
>>
>> "Johnstone, Alice" <[EMAIL PROTECTED]> writes:
>>
>>> Using R2.5.0 and Bioconductor I have been following code to analysis
>>> Affymetrix expression data: 2 treatments vs control.  The original
>>> code was run last year and used the read.phenoData command, however
>>> with the newer version I get the error message Warning messages:
>>> read.phenoData is deprecated, use read.AnnotatedDataFrame instead The
>>> phenoData class is deprecated, use AnnotatedDataFrame (with
>>> ExpressionSet) instead
>>>
>>> I use the read.AnnotatedDataFrame command, but when it comes to the
>>> end of the analysis the comparison of the treatment to the controls
>>> gets mixed up compared to what you get using the original
>>> read.phenoData ie it looks like the 3 groups get labelled wrong and so
>>> the comparisons are different (but they can still be matched up).
>>> My questions are,
>>> 1) do you need to set up your target file differently when using
>>> read.AnnotatedDataFrame - what is the standard format?
>>
>> I can't quite tell where things are going wrong for you, so it would
>> help if you can narrow down where the problem occurs.  I think
>> read.AnnotatedDataFrame should be comparable to read.phenoData.  Does
>>
>> > pData(pd)
>>
>> look right?  What about
>>
>> > pData(Data)
>>
>> and
>>
>> > pData(eset.rma)
>>
>> ?  It's not important but pData(pd)$Target is the same as pd$Target.
>> Since the analysis is on eset.rma, it probably makes sense to use the
>> pData from there to construct your design matrix
>>
>> > targs<-factor(eset.rma$Target)
>> > design<-model.matrix(~0+targs)
>> > colnames(design)<-levels(targs)
>>
>> Does design look right?
>>
>>> I have three columns sample, filename and target.
>>> 2) do you need to use a different model matrix to what I have?
>>> 3) do you use a different command for making the contrasts?
>>
>> Depends on the question!  If you're performing the same analysis as last
>> year, then the model matrix and contrasts have to be the same!
>>
>>> I have included my code below if that is of any assistance.
>>> Many Thanks!
>>> Alice
>>>
>>>
>>> ##Read data
>>> pd<-read.AnnotatedDataFrame("targets.txt",header=T,row.name="sample")
>>> Data<-ReadAffy(filenames=pData(pd)$FileName,phenoData=pd)
>>> ##normalisation
>>> eset.rma<-rma(Data)
>>> ##analysis
>>> targs<-factor(pData(pd)$Target)
>>> design<-model.matrix(~0+targs)
>>> colnames(design)<-levels(targs)
>>> fit<-lmFit(eset.rma,design)
>>> cont.wt<-makeContrasts("treatment1-control","treatment2-control",levels=design)
>>> fit2<-contrasts.fit(fit,cont.wt)
>>> fit2.eb<-eBayes(fit2)
>>> testconts<-classifyTestsF(fit2.eb,p.value=0.01)
>>> topTable(fit2.eb,coef=2,n=300)
>>> topTable(fit2.eb,coef=1,n=300)
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> [EMAIL PROTECTED]
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>> --
>> Martin Morgan
>> Bioconductor / Computational Biology
>> http://bioconductor.org
>>
>> ______________________________________________
>> R-help@stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595