Re: [R] FW: Selecting undefined column of a data frame (was [BioC] read.phenoData vs read.AnnotatedDataFrame)

2007-08-04 Thread Steven McKinney
Resolution:

To avoid bugs in code due to typos of data frame 
column names that can occur when using the 
'$' extractor, 

   > foo <- data.frame(Filename = c("a", "b"))
   > foo$FileName
  NULL

a past alternative was to use 
   foo[, "FileName"]
instead of
   foo$FileName

However, in R 2.5.1 this too silently returns NULL.

   > foo[, "FileName"]
  NULL

A modest and simple modification is to use
TRUE for the row index argument.

   > foo[T, "FileName"]
  Error in `[.data.frame`(foo, T, "FileName") : 
  undefined columns selected

An error is issued, and the misspelled column name
can more easily be found in debugging the issue.

   > all.equal(foo$Filename, foo[T, "Filename"])
  [1] TRUE

The two accessor methods yield the same result
when column names are spelled correctly.

   > all.equal(iris$Species, iris[T, "Species"])
  [1] TRUE

Other solutions no doubt exist.  Currently a single
argument to `[.data.frame` will throw an error
if the argument does not match a column name.

   > foo["FileName"]
  Error in `[.data.frame`(foo, "FileName") : 
  undefined columns selected
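
One such alternative, sketched here only as an illustration: a small
helper that checks the name before extracting.  The name strict_col is
hypothetical (it is not a base R function), and the error text is my
own wording.

```r
# Hypothetical helper (not part of base R): extract one column by exact
# name, stopping with an informative error when the name is absent.
strict_col <- function(df, name) {
  stopifnot(is.data.frame(df), is.character(name), length(name) == 1L)
  if (!name %in% names(df)) {
    stop(sprintf("column '%s' not found; available columns: %s",
                 name, paste(names(df), collapse = ", ")))
  }
  df[[name]]
}

foo <- data.frame(Filename = c("a", "b"))
strict_col(foo, "Filename")   # same value as foo$Filename

# The misspelled name now fails loudly; capture the message to inspect it.
msg <- tryCatch(strict_col(foo, "FileName"),
                error = function(e) conditionMessage(e))
msg
```

Because the check is done against names(df) directly, it does not
depend on which branch of `[.data.frame` a given call happens to take.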


 > sessionInfo()
R version 2.5.1 (2007-06-27) 
i386-pc-mingw32 

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United 
States.1252;LC_MONETARY=English_United 
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods  
[7] base 
 



Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney +at+ bccrc +dot+ ca

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3
Canada





[R] FW: Selecting undefined column of a data frame (was [BioC] read.phenoData vs read.AnnotatedDataFrame)

2007-08-03 Thread Steven McKinney
Hi all,

What are current methods people use in R to identify
mis-spelled column names when selecting columns
from a data frame?

Alice Johnson recently tackled this issue
(see [BioC] posting below).

Due to a mis-spelled column name (FileName
instead of Filename) which produced no warning,
Alice spent a fair amount of time tracking down
this bug.  With my fumbling fingers I'll be tracking
down such a bug soon too.

Is there any options() setting, or debug technique
that will flag data frame column extractions that
reference a non-existent column?  It seems to me
that the `[.data.frame` extractor used to throw an
error if given a mis-spelled variable name, and I
still see lines of code in `[.data.frame` such as

    if (any(is.na(cols))) 
        stop("undefined columns selected")



In R 2.5.1 a NULL is silently returned.

 > foo <- data.frame(Filename = c("a", "b"))
 > foo[, "FileName"]
NULL

Has something changed so that the code lines
    if (any(is.na(cols))) 
        stop("undefined columns selected")
in `[.data.frame` no longer work properly (if
I am understanding the intention properly)?

If not, could `[.data.frame` check an
options() variable setting (say
warn.undefined.colnames) and throw a warning
if a non-existent column name is referenced?
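
To make the suggestion concrete, here is a rough sketch of how such a
setting could behave if it were implemented in a user-level wrapper
rather than in `[.data.frame` itself.  Both the option name
warn.undefined.colnames and the function checked_col are hypothetical;
nothing in base R consults this option.

```r
# Hypothetical wrapper: warn about an unknown column name only when the
# (made-up) option warn.undefined.colnames is set to TRUE.
checked_col <- function(df, name) {
  if (isTRUE(getOption("warn.undefined.colnames")) && !name %in% names(df)) {
    warning(sprintf("undefined column '%s' referenced", name), call. = FALSE)
  }
  df[[name]]
}

foo <- data.frame(Filename = c("a", "b"))

old <- options(warn.undefined.colnames = TRUE)
res <- withCallingHandlers(
  checked_col(foo, "FileName"),
  warning = function(w) {
    message("caught: ", conditionMessage(w))  # report, then carry on
    invokeRestart("muffleWarning")
  }
)
options(old)  # restore the previous (unset) state
```

With the option unset the wrapper behaves like foo[["FileName"]] and
returns NULL without comment, so existing code would be unaffected.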




 > sessionInfo()
R version 2.5.1 (2007-06-27) 
powerpc-apple-darwin8.9.1 

locale:
en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   
base 

other attached packages:
 plotrix lme4   Matrix  lattice 
 2.2-3  0.99875-4 0.999375-0 0.16-2 
 



Steven McKinney





-Original Message-
From: [EMAIL PROTECTED] on behalf of Johnstone, Alice
Sent: Wed 8/1/2007 7:20 PM
To: [EMAIL PROTECTED]
Subject: Re: [BioC] read.phenoData vs read.AnnotatedDataFrame
 
 For interest sake, I have found out why I wasn't getting my expected
results when using read.AnnotatedDataFrame
Turns out the error was made in the ReadAffy command, where I specified
the filenames to be read from my AnnotatedDataFrame object.  There was a
typo error with a capital N ($FileName) rather than lowercase n
($Filename) as in my target file..whoops.  However this meant the
filename argument was ignored without the error message(!) and instead
of using the information in the AnnotatedDataFrame object (which
included filenames, but not alphabetically) it read the .cel files in
alphabetical order from the working directory - hence the wrong file was
given the wrong label (given by the order of Annotated object) and my
comparisons were confused without being obvious as to why or where.
Our solution: specify that filename is as.character so assignment of
file to target is correct (after correcting $Filename) now that we are
using read.AnnotatedDataFrame rather than read.phenoData.

Data <- ReadAffy(filenames=as.character(pData(pd)$Filename),phenoData=pd)

Hurrah!

It may be beneficial to others, that if the filename argument isn't
specified, that filenames are read from the phenoData object if included
here.

Thanks!

-Original Message-
From: Martin Morgan [mailto:[EMAIL PROTECTED] 
Sent: Thursday, 26 July 2007 11:49 a.m.
To: Johnstone, Alice
Cc: [EMAIL PROTECTED]
Subject: Re: [BioC] read.phenoData vs read.AnnotatedDataFrame

Hi Alice --

Johnstone, Alice [EMAIL PROTECTED] writes:

> Using R2.5.0 and Bioconductor I have been following code to analysis 
> Affymetrix expression data: 2 treatments vs control.  The original 
> code was run last year and used the read.phenoData command, however 
> with the newer version I get the error message Warning messages:
> read.phenoData is deprecated, use read.AnnotatedDataFrame instead The 
> phenoData class is deprecated, use AnnotatedDataFrame (with
> ExpressionSet) instead
>  
> I use the read.AnnotatedDataFrame command, but when it comes to the 
> end of the analysis the comparison of the treatment to the controls 
> gets mixed up compared to what you get using the original 
> read.phenoData ie it looks like the 3 groups get labelled wrong and so
> the comparisons are different (but they can still be matched up).
> My questions are,
> 1) do you need to set up your target file differently when using 
> read.AnnotatedDataFrame - what is the standard format?

I can't quite tell where things are going wrong for you, so it would
help if you can narrow down where the problem occurs.  I think
read.AnnotatedDataFrame should be comparable to read.phenoData. Does

> pData(pd)

look right? What about

> pData(Data)

and

> pData(eset.rma)

? It's not important but pData(pd)$Target is the same as pd$Target.
Since the analysis is on eset.rma, it probably makes sense to use the
pData from there to construct your design matrix

 

Re: [R] FW: Selecting undefined column of a data frame (was [BioC] read.phenoData vs read.AnnotatedDataFrame)

2007-08-03 Thread Steven McKinney


I see now that for my example


 > foo <- data.frame(Filename = c("a", "b"))
 > foo[, "FileName"]
NULL

the issue is in this clause of the 
`[.data.frame` extractor.

The lines
    if (drop && length(y) == 1L) 
        return(.subset2(y, 1L)) 
return the NULL result just before the
error check
    cols <- names(y)
    if (any(is.na(cols))) 
        stop("undefined columns selected")
is performed.

Is this intended behaviour, or has a logical
bug crept into the `[.data.frame` extractor?

    if (missing(i)) {
        if (missing(j) && drop && length(x) == 1L) 
            return(.subset2(x, 1L))
        y <- if (missing(j)) 
            x
        else .subset(x, j)
        if (drop && length(y) == 1L) 
            return(.subset2(y, 1L)) ## This returns a result before the
                                    ## undefined columns check is done.
                                    ## Is this intended?
        cols <- names(y)
        if (any(is.na(cols))) 
            stop("undefined columns selected")
        if (any(duplicated(cols))) 
            names(y) <- make.unique(cols)
        nrow <- .row_names_info(x, 2L)
        if (drop && !mdrop && nrow == 1L) 
            return(structure(y, class = NULL, row.names = NULL))
        else return(structure(y, class = oldClass(x),
                              row.names = .row_names_info(x, 0L)))
    }
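
The values involved in that early return can be looked at directly by
calling the two internal helpers the clause uses.  This is just an
illustrative sketch on the same foo; it steps through the pieces by
hand rather than running `[.data.frame` itself.

```r
foo <- data.frame(Filename = c("a", "b"))

# .subset() is list-style `[` without method dispatch: an unknown name
# yields a one-element list whose name is NA.
y <- .subset(foo, "FileName")
length(y)         # 1, so a `drop && length(y) == 1L` test succeeds
names(y)          # NA: this is what the is.na(cols) check looks for

# .subset2() is list-style `[[`: the single element is NULL.
.subset2(y, 1L)   # NULL

# The check that would have flagged the typo, had it been reached:
any(is.na(names(y)))   # TRUE
```

So the information needed to raise the error is already present in y
at the point where the NULL is returned.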




 > sessionInfo()
R version 2.5.1 (2007-06-27) 
powerpc-apple-darwin8.9.1 

locale:
en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   
base 

other attached packages:
 plotrix lme4   Matrix  lattice 
 2.2-3  0.99875-4 0.999375-0 0.16-2 

Should this discussion move to R-devel?

Steven McKinney






Re: [R] FW: Selecting undefined column of a data frame (was [BioC] read.phenoData vs read.AnnotatedDataFrame)

2007-08-03 Thread Prof Brian Ripley
You are reading the wrong part of the code for your argument list:

 > foo["FileName"]
Error in `[.data.frame`(foo, "FileName") : undefined columns selected

`[.data.frame` is one of the most complex functions in R, and does many 
different things depending on which arguments are supplied.



Re: [R] FW: Selecting undefined column of a data frame (was [BioC] read.phenoData vs read.AnnotatedDataFrame)

2007-08-03 Thread Steven McKinney
Thanks Prof Ripley,

I used double indexing (if I understand the doc correctly)
so my call was

 > foo[, "FileName"]

I traced through each line of `[.data.frame`
following the sequence of commands executed
for my call.

In the code section

    if (missing(i)) {
        if (missing(j) && drop && length(x) == 1L)
            return(.subset2(x, 1L))
        y <- if (missing(j))
            x
        else .subset(x, j)
        if (drop && length(y) == 1L)
            return(.subset2(y, 1L)) ## This returns a result before the
                                    ## undefined columns check is done.
                                    ## Is this intended?
        cols <- names(y)
        if (any(is.na(cols)))
            stop("undefined columns selected")
        if (any(duplicated(cols)))
            names(y) <- make.unique(cols)
        nrow <- .row_names_info(x, 2L)
        if (drop && !mdrop && nrow == 1L)
            return(structure(y, class = NULL, row.names = NULL))
        else return(structure(y, class = oldClass(x),
                              row.names = .row_names_info(x, 0L)))
    }

the return happened after execution of
    if (drop && length(y) == 1L)
        return(.subset2(y, 1L))
before the check on column names.

Shouldn't the check on column names
    cols <- names(y)
    if (any(is.na(cols)))
        stop("undefined columns selected")
occur before
    if (drop && length(y) == 1L)
        return(.subset2(y, 1L))
rather than after?
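
A minimal sketch of the reordering I am asking about.  select_cols is
a hypothetical stand-in for the missing(i) branch only, not a patch to
base R; it leaves out the duplicated-names handling and the mdrop
logic to keep the ordering question in view.

```r
# Hypothetical stand-in for the missing(i) branch of `[.data.frame`,
# with the undefined-column check moved ahead of the drop shortcut.
select_cols <- function(x, j, drop = TRUE) {
  y <- if (missing(j)) x else .subset(x, j)
  cols <- names(y)
  if (any(is.na(cols)))
    stop("undefined columns selected")   # now runs before any return
  if (drop && length(y) == 1L)
    return(.subset2(y, 1L))
  structure(y, class = oldClass(x), row.names = .row_names_info(x, 0L))
}

foo <- data.frame(Filename = c("a", "b"), Size = c(1, 2))
select_cols(foo, "Filename")              # the column, dropped to a vector
select_cols(foo, c("Filename", "Size"))   # still a data frame

err <- tryCatch(select_cols(foo, "FileName"),
                error = function(e) conditionMessage(e))
err
```

In this sketch a correctly spelled single column is returned exactly
as before, and only the misspelled case changes from NULL to an error.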





Re: [R] FW: Selecting undefined column of a data frame (was [BioC] read.phenoData vs read.AnnotatedDataFrame)

2007-08-03 Thread Prof Brian Ripley
I've since seen your followup; a more detailed explanation may help.
The path through the code for your argument list does not go where you 
quoted, and there is a reason for it.

Generally when you extract in R and ask for a non-existent index you get 
NA or NULL as the result (and no warning), e.g.

> y <- list(x=1, y=2)
> y[["z"]]
NULL

Because data frames 'must' have (column) names, they are a partial 
exception and when the result is a data frame you get an error if it would 
contain undefined columns.

But in the case of foo[, "FileName"], the result is a single column and so 
will not have a name: there seems no reason to be different from

> foo[["FileName"]]
NULL
> foo$FileName
NULL

which similarly select a single column.  At one time they were different 
in R, for no documented reason.
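
The contrast can be put side by side in a short script.  This is an
illustrative sketch only: foo is the example data frame from earlier
in the thread, and the single-bracket call is wrapped in tryCatch so
that the script runs to completion.

```r
y <- list(x = 1, y = 2)
y[["z"]]            # NULL: no such element, and no warning

v <- c(x = 1, y = 2)
v["z"]              # NA: atomic vectors give NA rather than NULL

foo <- data.frame(Filename = c("a", "b"))
foo[["FileName"]]   # NULL, like the list case
foo$FileName        # NULL

# Single-bracket selection returns a data frame, so the name must exist.
err <- tryCatch(foo["FileName"], error = function(e) conditionMessage(e))
err
```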



Re: [R] FW: Selecting undefined column of a data frame (was [BioC] read.phenoData vs read.AnnotatedDataFrame)

2007-08-03 Thread Steven McKinney

> -Original Message-
> From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]
> Sent: Fri 8/3/2007 1:05 PM
> To: Steven McKinney
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] FW: Selecting undefined column of a data frame (was [BioC] 
> read.phenoData vs read.AnnotatedDataFrame)
>  
> I've since seen your followup; a more detailed explanation may help.
> The path through the code for your argument list does not go where you 
> quoted, and there is a reason for it.
 

Using a copy of `[.data.frame` with browser() I have traced
the flow of execution. (My copy with the browser command is at the end
of this email)


   > foo[, "FileName"]
  Called from: `[.data.frame`(foo, , "FileName")
  Browse[1]> n
  debug: mdrop <- missing(drop)
  Browse[1]> n
  debug: Narg <- nargs() - (!mdrop)
  Browse[1]> n
  debug: if (Narg < 3) {
      if (!mdrop) 
          warning("drop argument will be ignored")
      if (missing(i)) 
          return(x)
      if (is.matrix(i)) 
          return(as.matrix(x)[i])
      y <- NextMethod("[")
      cols <- names(y)
      if (!is.null(cols) && any(is.na(cols))) 
          stop("undefined columns selected")
      if (any(duplicated(cols))) 
          names(y) <- make.unique(cols)
      return(structure(y, class = oldClass(x),
                       row.names = .row_names_info(x, 0L)))
  }
  Browse[1]> n
  debug: if (missing(i)) {
      if (missing(j) && drop && length(x) == 1L) 
          return(.subset2(x, 1L))
      y <- if (missing(j)) 
          x
      else .subset(x, j)
      if (drop && length(y) == 1L) 
          return(.subset2(y, 1L))
      cols <- names(y)
      if (any(is.na(cols))) 
          stop("undefined columns selected")
      if (any(duplicated(cols))) 
          names(y) <- make.unique(cols)
      nrow <- .row_names_info(x, 2L)
      if (drop && !mdrop && nrow == 1L) 
          return(structure(y, class = NULL, row.names = NULL))
      else return(structure(y, class = oldClass(x),
                            row.names = .row_names_info(x, 0L)))
  }
  Browse[1]> n
  debug: if (missing(j) && drop && length(x) == 1L) return(.subset2(x, 1L))
  Browse[1]> n
  debug: y <- if (missing(j)) x else .subset(x, j)
  Browse[1]> n
  debug: if (drop && length(y) == 1L) return(.subset2(y, 1L))
  Browse[1]> n
  NULL
   

So `[.data.frame` is exiting after executing 
    if (drop && length(y) == 1L) 
        return(.subset2(y, 1L)) ## This returns a result before the
                                ## undefined columns check is done.
                                ## Is this intended?

Couldn't the error check
    cols <- names(y)
    if (any(is.na(cols))) 
        stop("undefined columns selected")
be done before the above return()?

What would break if the error check on column names was done
before returning a NULL result due to incorrect column name spelling?
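
For concreteness, the reordering asked about above could look roughly like
the sketch below. It mimics only the column-only branch of `[.data.frame`
shown in the trace. The name select_cols_strict is hypothetical and not part
of R; this is an illustration under those assumptions, not a proposed patch.

```r
# Hypothetical helper (not part of R): column-only selection in which the
# undefined-columns check runs before the early return for a single column.
select_cols_strict <- function(x, j, drop = TRUE) {
  y <- .subset(x, j)                  # list-style subsetting, no dispatch
  cols <- names(y)
  if (any(is.na(cols)))               # an unmatched name shows up as NA
    stop("undefined columns selected")
  if (drop && length(y) == 1L)
    return(.subset2(y, 1L))           # single column: return the bare vector
  structure(y, class = oldClass(x),
            row.names = .row_names_info(x, 0L))
}

foo <- data.frame(Filename = c("a", "b"))
ok  <- select_cols_strict(foo, "Filename")
bad <- tryCatch(select_cols_strict(foo, "FileName"),
                error = function(e) conditionMessage(e))
```

With the check moved up, the misspelled name stops with an error while the
correctly spelled name still returns the bare column.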

Why should

> foo[, "FileName"]
NULL

differ from

> foo[seq(nrow(foo)), "FileName"]
Error in `[.data.frame`(foo, seq(nrow(foo)), "FileName") : 
undefined columns selected
 

Thank you for your explanations.



 Generally when you extract in R and ask for a non-existent index you get 
 NA or NULL as the result (and no warning), e.g.
 
 > y <- list(x=1, y=2)
 > y[["z"]]
 NULL
 
 Because data frames 'must' have (column) names, they are a partial 
 exception and when the result is a data frame you get an error if it would 
 contain undefined columns.
 
 But in the case of foo[, "FileName"], the result is a single column and so 
 will not have a name: there seems no reason to be different from
 
 > foo[["FileName"]]
 NULL
 > foo$FileName
 NULL
 
 which similarly select a single column.  At one time they were different 
 in R, for no documented reason.
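
The list-style behaviour described above can be checked directly. A minimal
sketch, reusing the y and foo objects from this thread; the r1, r2 and r3
names are only for illustration:

```r
y   <- list(x = 1, y = 2)
foo <- data.frame(Filename = c("a", "b"))

r1 <- is.null(y[["z"]])            # no element named "z" in the list
r2 <- is.null(foo[["FileName"]])   # list-style extraction from a data frame
r3 <- is.null(foo$FileName)        # `$` finds no match for the misspelling
```

In each case a name that matches nothing yields NULL rather than an error,
which is why a typo in a column name can pass unnoticed.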
 
 
 On Fri, 3 Aug 2007, Prof Brian Ripley wrote:
 
  You are reading the wrong part of the code for your argument list:
 
  > foo["FileName"]
  Error in `[.data.frame`(foo, "FileName") : undefined columns selected
 
  [.data.frame is one of the most complex functions in R, and does many 
  different things depending on which arguments are supplied.
 
 
  On Fri, 3 Aug 2007, Steven McKinney wrote:
 
  Hi all,
  
  What are current methods people use in R to identify
  mis-spelled column names when selecting columns
  from a data frame?
  
  Alice Johnstone recently tackled this issue
  (see [BioC] posting below).
  
  Due to a mis-spelled column name (FileName
  instead of Filename) which produced no warning,
  Alice spent a fair amount of time tracking down
  this bug.  With my fumbling fingers I'll be tracking
  down such a bug soon too.
  
  Is there any options() setting, or debug technique
  that will flag data frame column extractions that
  reference a non-existent column?  It seems to me
  that the [.data.frame extractor used to throw an
  error if given a mis-spelled variable name, and I
  still see lines of code in [.data.frame such as
  
  if (any(is.na(cols)))
      stop("undefined columns selected")
  
  
  
  In R 2.5.1 a NULL is silently returned.
  
  > foo <- data.frame(Filename = c("a", "b"))
  > foo[, "FileName"]
  NULL
  
  Has something changed so that the code lines
  if (any(is.na(cols

Re: [R] FW: Selecting undefined column of a data frame (was [BioC] read.phenoData vs read.AnnotatedDataFrame)

2007-08-03 Thread Steven McKinney

 What would break is that three methods for doing the same thing would
 give different answers.
 
 Please do have the courtesy to actually read the detailed explanation you
 are given.

Sorry Prof. Ripley, I am attempting to read carefully, as this
issue has deeper coding/debugging implications, and as you
point out, 
  [.data.frame is one of the most complex functions in R
so please bear with me.  This change in behaviour has 
taken away a side-effect debugging tool, discussed below.


 
 
 On Fri, 3 Aug 2007, Steven McKinney wrote:
 
 
  -Original Message-
  From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]
  Sent: Fri 8/3/2007 1:05 PM
  To: Steven McKinney
  Cc: r-help@stat.math.ethz.ch
  Subject: Re: [R] FW: Selecting undefined column of a data frame (was 
  [BioC] read.phenoData vs read.AnnotatedDataFrame)
 
  I've since seen your followup; a more detailed explanation may help.
  The path through the code for your argument list does not go where you
  quoted, and there is a reason for it.
 
 
  Generally when you extract in R and ask for a non-existent index you get
  NA or NULL as the result (and no warning), e.g.
 
  > y <- list(x=1, y=2)
  > y[["z"]]
  NULL
 
  Because data frames 'must' have (column) names, they are a partial
  exception and when the result is a data frame you get an error if it would
  contain undefined columns.
 
  But in the case of foo[, "FileName"], the result is a single column and so
  will not have a name: there seems no reason to be different from
 
  > foo[["FileName"]]
  NULL
  > foo$FileName
  NULL
 
  which similarly select a single column.  At one time they were different
  in R, for no documented reason.


This difference provided a side-effect debugging tool, in that where

   > bar <- foo[, "FileName"]

used to throw an error, alerting me to a typo, it now does not.

Having been burned by NULL results due to typos in code lines using
the $ extractor such as
 
   > bar <- foo$FileName

I learned to use
   > bar <- foo[, "FileName"]
to help cut down on typo bugs.  With the ubiquity of
camelCase object names, this is a constant typing bug hazard.


I am wondering what to do now to double check spelling
when accessing columns of a dataframe.

If [.data.frame stays as is, can a debug mechanism
be implemented in R that forces strict adherence
to existing list names in debug mode?  This would also help debug
typos in camelCase names when using the $ and [[
extractors and accessors.

Are there other debugging tools already in R that
can help point out such camelCase list element
name typos?
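
Until something like that exists, one possible stopgap is a small wrapper
that insists on an exact name match. This is only a sketch: strict_get is a
hypothetical name, not an existing R function, and it covers only single
element extraction from lists and data frames.

```r
# Hypothetical helper (not part of R): extract one element by exact name,
# stopping with an informative message instead of returning NULL.
strict_get <- function(x, name) {
  stopifnot(is.character(name), length(name) == 1L)
  if (!name %in% names(x))
    stop(sprintf("no element named '%s'; available names: %s",
                 name, paste(names(x), collapse = ", ")))
  x[[name]]
}

foo <- data.frame(Filename = c("a", "b"))
val <- strict_get(foo, "Filename")            # returns the column
msg <- tryCatch(strict_get(foo, "FileName"),  # typo: stops with an error
                error = function(e) conditionMessage(e))
```

Listing the available names in the message makes a camelCase slip such as
FileName versus Filename easy to spot.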



 
 
  On Fri, 3 Aug 2007, Prof Brian Ripley wrote:
 
  You are reading the wrong part of the code for your argument list:
 
  > foo["FileName"]
  Error in `[.data.frame`(foo, "FileName") : undefined columns selected
 
  [.data.frame is one of the most complex functions in R, and does many
  different things depending on which arguments are supplied.
 
 
  On Fri, 3 Aug 2007, Steven McKinney wrote:
 
  Hi all,
 
  What are current methods people use in R to identify
  mis-spelled column names when selecting columns
  from a data frame?
 
  Alice Johnstone recently tackled this issue
  (see [BioC] posting below).
 
  Due to a mis-spelled column name (FileName
  instead of Filename) which produced no warning,
  Alice spent a fair amount of time tracking down
  this bug.  With my fumbling fingers I'll be tracking
  down such a bug soon too.
 
  Is there any options() setting, or debug technique
  that will flag data frame column extractions that
  reference a non-existent column?  It seems to me
  that the [.data.frame extractor used to throw an
  error if given a mis-spelled variable name, and I
  still see lines of code in [.data.frame such as
 
  if (any(is.na(cols)))
      stop("undefined columns selected")
 
 
 
  In R 2.5.1 a NULL is silently returned.
 
  > foo <- data.frame(Filename = c("a", "b"))
  > foo[, "FileName"]
  NULL
 
  Has something changed so that the code lines
  if (any(is.na(cols)))
      stop("undefined columns selected")
  in [.data.frame no longer work properly (if
  I am understanding the intention properly)?
 
  If not, could  [.data.frame check an
  options() variable setting (say
  warn.undefined.colnames) and throw a warning
  if a non-existent column name is referenced?
 
 
 
 
  sessionInfo()
  R version 2.5.1 (2007-06-27)
  powerpc-apple-darwin8.9.1
 
  locale:
  en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
 
  attached base packages:
  [1] stats graphics  grDevices utils datasets  methods
  base
 
  other attached packages:
  plotrix lme4   Matrix  lattice
  2.2-3  0.99875-4 0.999375-0 0.16-2
 
 
 
 
  Steven McKinney
 
  Statistician
  Molecular Oncology and Breast Cancer Program
  British Columbia Cancer Research Centre
 
  email: smckinney +at+ bccrc +dot+ ca
 
  tel: 604-675-8000 x7561
 
  BCCRC
  Molecular Oncology
  675 West 10th Ave, Floor 4
  Vancouver B.C.
  V5Z 1L3
  Canada
 
 
 
 
  --
  Brian D. Ripley

Re: [R] FW: Selecting undefined column of a data frame (was [BioC]read.phenoData vs read.AnnotatedDataFrame)

2007-08-03 Thread Bert Gunter
I suspect you'll get some creative answers, but if all you're worried about
is whether a column exists before you do something with it, what's wrong
with:

nm <- ... ## a character vector of names
if(!all(nm %in% names(yourdata))) ## complain
else ## do something


I think this is called defensive programming.
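
The outline above can be filled in along these lines. This is a minimal
sketch: check_columns and the example data frame are hypothetical, chosen
only to make the idea runnable.

```r
# Hypothetical helper: stop early if any requested column name is absent.
check_columns <- function(df, nm) {
  missing_nm <- setdiff(nm, names(df))
  if (length(missing_nm) > 0L)
    stop("columns not found: ", paste(missing_nm, collapse = ", "))
  invisible(TRUE)
}

yourdata <- data.frame(Filename = c("a", "b"), Group = c("ctl", "trt"))
check_columns(yourdata, c("Filename", "Group"))   # passes silently
res <- tryCatch(check_columns(yourdata, c("FileName", "Group")),
                error = function(e) conditionMessage(e))
```

Reporting exactly which names failed to match saves a second round of
hunting once the check has fired.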

Bert Gunter
Genentech


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Steven McKinney
Sent: Friday, August 03, 2007 10:38 AM
To: r-help@stat.math.ethz.ch
Subject: [R] FW: Selecting undefined column of a data frame (was
[BioC]read.phenoData vs read.AnnotatedDataFrame)

Hi all,

What are current methods people use in R to identify
mis-spelled column names when selecting columns
from a data frame?

Alice Johnstone recently tackled this issue
(see [BioC] posting below).

Due to a mis-spelled column name (FileName
instead of Filename) which produced no warning,
Alice spent a fair amount of time tracking down
this bug.  With my fumbling fingers I'll be tracking
down such a bug soon too.

Is there any options() setting, or debug technique
that will flag data frame column extractions that
reference a non-existent column?  It seems to me
that the [.data.frame extractor used to throw an
error if given a mis-spelled variable name, and I
still see lines of code in [.data.frame such as

if (any(is.na(cols))) 
    stop("undefined columns selected")



In R 2.5.1 a NULL is silently returned.

> foo <- data.frame(Filename = c("a", "b"))
> foo[, "FileName"]
NULL

Has something changed so that the code lines
if (any(is.na(cols))) 
    stop("undefined columns selected")
in [.data.frame no longer work properly (if
I am understanding the intention properly)?

If not, could  [.data.frame check an
options() variable setting (say
warn.undefined.colnames) and throw a warning
if a non-existent column name is referenced?




 sessionInfo()
R version 2.5.1 (2007-06-27) 
powerpc-apple-darwin8.9.1 

locale:
en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods
base 

other attached packages:
 plotrix lme4   Matrix  lattice 
 2.2-3  0.99875-4 0.999375-0 0.16-2 
 



Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney +at+ bccrc +dot+ ca

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3
Canada




-Original Message-
From: [EMAIL PROTECTED] on behalf of Johnstone, Alice
Sent: Wed 8/1/2007 7:20 PM
To: [EMAIL PROTECTED]
Subject: Re: [BioC] read.phenoData vs read.AnnotatedDataFrame
 
 For interest sake, I have found out why I wasn't getting my expected
results when using read.AnnotatedDataFrame
Turns out the error was made in the ReadAffy command, where I specified
the filenames to be read from my AnnotatedDataFrame object.  There was a
typo error with a capital N ($FileName) rather than lowercase n
($Filename) as in my target file..whoops.  However this meant the
filename argument was ignored without the error message(!) and instead
of using the information in the AnnotatedDataFrame object (which
included filenames, but not alphabetically) it read the .cel files in
alphabetical order from the working directory - hence the wrong file was
given the wrong label (given by the order of Annotated object) and my
comparisons were confused without being obvious as to why or where.
Our solution: specify that the filename is as.character so assignment of
file to target is correct (after correcting $Filename), now that we are
using read.AnnotatedDataFrame rather than read.phenoData.

Data <- ReadAffy(filenames=as.character(pData(pd)$Filename), phenoData=pd)

Hurrah!

It may be beneficial to others, that if the filename argument isn't
specified, that filenames are read from the phenoData object if included
here.

Thanks!

-Original Message-
From: Martin Morgan [mailto:[EMAIL PROTECTED] 
Sent: Thursday, 26 July 2007 11:49 a.m.
To: Johnstone, Alice
Cc: [EMAIL PROTECTED]
Subject: Re: [BioC] read.phenoData vs read.AnnotatedDataFrame

Hi Alice --

Johnstone, Alice [EMAIL PROTECTED] writes:

 Using R 2.5.0 and Bioconductor I have been following code to analyse 
 Affymetrix expression data: 2 treatments vs control.  The original 
 code was run last year and used the read.phenoData command, however 
 with the newer version I get the messages
 Warning messages:
 read.phenoData is deprecated, use read.AnnotatedDataFrame instead
 The phenoData class is deprecated, use AnnotatedDataFrame (with
 ExpressionSet) instead
  
 I use the read.AnnotatedDataFrame command, but when it comes to the 
 end of the analysis the comparison of the treatment to the controls 
 gets mixed up compared to what you get using the original 
 read.phenoData ie it looks like the 3 groups get labelled wrong and so

 the comparisons are different (but they can still be matched up

Re: [R] FW: Selecting undefined column of a data frame (was [BioC]read.phenoData vs read.AnnotatedDataFrame)

2007-08-03 Thread Steven McKinney

Hi Bert,

 -Original Message-
 From: Bert Gunter [mailto:[EMAIL PROTECTED]
 Sent: Fri 8/3/2007 3:19 PM
 To: Steven McKinney; r-help@stat.math.ethz.ch
 Subject: RE: [R] FW: Selecting undefined column of a data frame (was 
 [BioC]read.phenoData vs read.AnnotatedDataFrame)
  
 I suspect you'll get some creative answers, but if all you're worried about
 is whether a column exists before you do something with it, what's wrong
 with:
 
 nm <- ... ## a character vector of names
 if(!all(nm %in% names(yourdata))) ## complain
 else ## do something
 
 
 I think this is called defensive programming.

This is a good example of good defensive programming.
I do indeed check variable/object names whenever
obtaining them from an external source (user input,
file input, a list in code).


I was able to practice a defensive programming style in the past
by using
  bar <- foo[, "FileName"]
instead of
  bar <- foo$FileName

but this has changed recently, so I need to figure out
some other mechanisms.

R is such a productive language, but this change will
lead many of us to chase elusive typos that used to
get revealed.

I'm hoping that some kind of explicit data frame variable
checking mechanism might be introduced since we've
lost this one.  

It would also be great to have such a
mechanism to help catch list access and extraction
errors.  Why should
foo$FileName
always quietly return NULL?

I'm not sure why the following incongruity is okay.

> foo <- matrix(1:4, nrow = 2)
> dimnames(foo) <- list(NULL, c("a", "b"))
> bar <- foo[, "A"]
Error: subscript out of bounds

> foo.df <- as.data.frame(foo)
> foo.df
  a b
1 1 3
2 2 4
> bar <- foo.df[, "A"]
> bar
NULL
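
One way to get the matrix-style strictness for a data frame, whatever
`[.data.frame` itself does with an unknown name, is to resolve the name to
an integer position first. A sketch under that assumption; col_index is a
hypothetical helper, not part of R:

```r
foo    <- matrix(1:4, nrow = 2, dimnames = list(NULL, c("a", "b")))
foo.df <- as.data.frame(foo)

# Matrix indexing by an unknown column name is an error.
m_err <- tryCatch(foo[, "A"], error = function(e) conditionMessage(e))

# Hypothetical helper: map names to column positions, failing on no match.
col_index <- function(df, name) {
  idx <- match(name, names(df))
  if (any(is.na(idx)))
    stop("subscript out of bounds: ", name[is.na(idx)][1L])
  idx
}

bar   <- foo.df[, col_index(foo.df, "a")]
d_err <- tryCatch(foo.df[, col_index(foo.df, "A")],
                  error = function(e) conditionMessage(e))
```

Because the lookup fails before the data frame is indexed, the typo is
reported regardless of how the extractor treats missing names.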
 


It is a lot of extra typing to wrap every command in
extra code, but more of that will need to happen
going forward.

Steve McKinney


 
 Bert Gunter
 Genentech
 
 
 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Steven McKinney
 Sent: Friday, August 03, 2007 10:38 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] FW: Selecting undefined column of a data frame (was
 [BioC]read.phenoData vs read.AnnotatedDataFrame)
 
 Hi all,
 
 What are current methods people use in R to identify
 mis-spelled column names when selecting columns
 from a data frame?
 
 Alice Johnstone recently tackled this issue
 (see [BioC] posting below).
 
 Due to a mis-spelled column name (FileName
 instead of Filename) which produced no warning,
 Alice spent a fair amount of time tracking down
 this bug.  With my fumbling fingers I'll be tracking
 down such a bug soon too.
 
 Is there any options() setting, or debug technique
 that will flag data frame column extractions that
 reference a non-existent column?  It seems to me
 that the [.data.frame extractor used to throw an
 error if given a mis-spelled variable name, and I
 still see lines of code in [.data.frame such as
 
 if (any(is.na(cols))) 
     stop("undefined columns selected")
 
 
 
 In R 2.5.1 a NULL is silently returned.
 
 > foo <- data.frame(Filename = c("a", "b"))
 > foo[, "FileName"]
 NULL
 
 Has something changed so that the code lines
 if (any(is.na(cols))) 
     stop("undefined columns selected")
 in [.data.frame no longer work properly (if
 I am understanding the intention properly)?
 
 If not, could  [.data.frame check an
 options() variable setting (say
 warn.undefined.colnames) and throw a warning
 if a non-existent column name is referenced?
 
 
 
 
  sessionInfo()
 R version 2.5.1 (2007-06-27) 
 powerpc-apple-darwin8.9.1 
 
 locale:
 en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
 
 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods
 base 
 
 other attached packages:
  plotrix lme4   Matrix  lattice 
  2.2-3  0.99875-4 0.999375-0 0.16-2 
  
 
 
 
 Steven McKinney
 
 Statistician
 Molecular Oncology and Breast Cancer Program
 British Columbia Cancer Research Centre
 
 email: smckinney +at+ bccrc +dot+ ca
 
 tel: 604-675-8000 x7561
 
 BCCRC
 Molecular Oncology
 675 West 10th Ave, Floor 4
 Vancouver B.C. 
 V5Z 1L3
 Canada
 
 
 
 
 -Original Message-
 From: [EMAIL PROTECTED] on behalf of Johnstone, Alice
 Sent: Wed 8/1/2007 7:20 PM
 To: [EMAIL PROTECTED]
 Subject: Re: [BioC] read.phenoData vs read.AnnotatedDataFrame
  
  For interest sake, I have found out why I wasn't getting my expected
 results when using read.AnnotatedDataFrame
 Turns out the error was made in the ReadAffy command, where I specified
 the filenames to be read from my AnnotatedDataFrame object.  There was a
 typo error with a capital N ($FileName) rather than lowercase n
 ($Filename) as in my target file..whoops.  However this meant the
 filename argument was ignored without the error message(!) and instead
 of using the information in the AnnotatedDataFrame object (which
 included filenames, but not alphabetically) it read the .cel files in
 alphabetical order from the working directory - hence the wrong file was
 given the wrong label