Option (2) sounds nice. Even if you used (3), I would end up naming most
of my columns by seq_len anyway. I usually don't have very good
candidates for column names for my counting functions; the inputs to the
functions are BAM/other file paths that can get quite long, and I don't
like having spaced-out columns when I look at my assays.
I concede that the example of se[,2] <- se[,1] wasn't the most
realistic; it came about from some unit tests I was using to check
subset replacement behaviour, and it was failing once I threw in column
names.

On 05/12/15 18:35, Morgan, Martin wrote:
The philosophy motivating the check is that names make the relationship between
samples and data explicit, rather than relying on fragile positional
information. With this in mind, I wonder why your upstream work flow does not
include dimnames on the matrix?
That said, the check was introduced in
------------------------------------------------------------------------
r68053 | [email protected] | 2012-07-27 03:35:55 -0400 (Fri, 27 Jul 2012) | 2 lines
SummarizedExperiment uses rowData=GRangesList() as defult
------------------------------------------------------------------------
To the observations you mention below one could also add that the rownames()
can be NULL, so there is an uncomfortable asymmetry.
I could (1) remove the check (but use the DataFrame() constructor in an
admittedly hackish way, not wanting to rely on the internal new() function). I
could also (2) construct row / column names as seq_len(nrow()) /
seq_len(ncol()).
Or (3) the code could be tightened to more closely adhere to the philosophy
above (for instance, I think duplication of columns implied by se[,2] = se[,1]
is worth stop()ing over, and allowing colnames(se) = NULL only enables bad
practice). Likely this would be disruptive.
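For illustration only, option (2) might look roughly like the sketch below
when applied to a plain matrix. The helper name default_dimnames is
hypothetical and not part of SummarizedExperiment; the sketch assumes that
missing names are simply replaced by positional indices and that existing
names are left alone.

```r
# Hypothetical helper (not part of SummarizedExperiment): fill in missing
# row / column names with positional indices; existing names are kept.
default_dimnames <- function(m) {
    if (is.null(rownames(m))) {
        rownames(m) <- as.character(seq_len(nrow(m)))
    }
    if (is.null(colnames(m))) {
        colnames(m) <- as.character(seq_len(ncol(m)))
    }
    m
}

# A matrix without dimnames gets positional names on both dimensions.
m <- default_dimnames(matrix(0, 10, 5))
dimnames(m)

# A matrix that already has column names keeps them; only rows are filled in.
named <- matrix(0, 2, 2, dimnames = list(NULL, c("a", "b")))
dimnames(default_dimnames(named))
```

Names generated this way still carry only positional information, so they
satisfy the check without making the sample / data relationship any more
explicit.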
For what it's worth, we have
library(Biobase)
eset = ExpressionSet(matrix(0, 1, 2))
dimnames(eset)
[[1]]
[1] "1"
[[2]]
[1] "1" "2"
colnames(eset) = NULL
Error in `sampleNames<-`(`*tmp*`, value = NULL) :
'value' length (0) must equal sample number in AssayData (2)
so dimnames are being imposed.
(2) would be my current compromise preference.
Martin
________________________________________
From: Bioc-devel [[email protected]] on behalf of Aaron Lun
[[email protected]]
Sent: Saturday, December 05, 2015 7:36 AM
To: bioc-devel
Subject: Re: [Bioc-devel] do SummarizedExperiments really need colnames?
Hello all,
At the start of the SummarizedExperiment constructor, there's a code
block that throws an error if 'colData' is not specified and the assay
matrices don't have column names.
Is this really necessary? In many cases, I just want to get a matrix
into the SE0 object without having to worry about column names. It
doesn't seem like there's a requirement for this in the SE0 class,
either; it seems happy with 'colnames(se0) <- NULL', and setting
'colData' to a 'DataFrame' with 'NULL' row names doesn't break the
constructor.
The requirement for column names causes issues for some manipulations -
for example:
out <- SummarizedExperiment(matrix(0, 10, 5),
    colData=DataFrame(row.names=1:5))
out[,1] <- out[,2]
## Error in `rownames<-`(`*tmp*`, value = c("2", "2", "3", "4", "5")) :
## duplicate rownames not allowed
While this is fair enough, it's a bit annoying if I didn't want or need
the names in the first place.
The error mentioned above precedes the construction of the missing
'colData', so if column names are missing, then a more general way to
construct the 'colData' would be to do 'new("DataFrame", nrows=ncol(assays))'.
Cheers,
Aaron
_______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel