One more thing. See below...

On 09/13/2017 02:54 PM, Ludwig Geistlinger wrote:
The reverse coercion, i.e. from SummarizedExperiment to ExpressionSet,
which is defined in

SummarizedExperiment/R/makeSummarizedExperimentFromExpressionSet.R

as follows:

setAs("SummarizedExperiment", "ExpressionSet", function(from)
     as(as(from, "RangedSummarizedExperiment"), "ExpressionSet")
)

also seems to be a bit problematic, as it makes you lose your rowData/fData.



Here is an example:

## Constructing the SE similar to examples of ?SummarizedExperiment
nrows <- 200; ncols <- 6
counts <- matrix(runif(nrows * ncols, 1, 1e4), nrows)
colData <- DataFrame(Treatment=rep(c("ChIP", "Input"), 3),
                           row.names=LETTERS[1:6])


## some rowData with simulated gene IDs
rowData <- DataFrame(EntrezID=sample(1000, 200),
                     row.names=paste0("g", 1:200))
se <- SummarizedExperiment(assays=SimpleList(exprs=counts),
                             colData=colData, rowData=rowData)

# this is how it looks
rowData(se)
DataFrame with 200 rows and 1 column
      EntrezID
     <integer>
1         289
2         476
3         608
4         998
5         684
...       ...
196       331
197       590
198       445
199        95
200       129

(Why did I actually lose the rownames g1-g200 here?)

Your rownames were moved to the names of the object:

> head(names(se))
[1] "g1" "g2" "g3" "g4" "g5" "g6"

The rowData() accessor (like the mcols() accessor; note that rowData()
is just an alias for mcols()) does not restore them by default, unless
you use 'use.names=TRUE'.

> rowData(se, use.names=TRUE)
DataFrame with 200 rows and 1 column
      EntrezID
     <integer>
g1         616
g2          45
g3         944
g4         632
g5         270
...        ...
g196       827
g197       943
g198       291
g199       432
g200       106

All Vector derivatives do that (e.g. GRanges), not just
SummarizedExperiment.

The reason for this design is that the rownames must be unique
(this is a base R requirement). By moving them from the DataFrame
containing the metadata columns to the names of the object, Vector
derivatives can be subsetted in a way that repeats some of their
elements. If the rownames were on the DataFrame containing the
metadata columns, these subsetting operations wouldn't be
possible.
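
The uniqueness constraint can be seen in plain base R. This is only a
minimal sketch with a base data.frame and a named vector, not
SummarizedExperiment code; the toy objects 'df' and 'x' and their values
are made up for illustration:

```r
# Toy metadata with rownames, mimicking the rowData above (values are made up)
df <- data.frame(EntrezID = c(289L, 476L, 608L),
                 row.names = c("g1", "g2", "g3"))

# Repeating a row forces base R to invent new, unique rownames
df_rep <- df[c(1, 1, 2), , drop = FALSE]
rownames(df_rep)
# [1] "g1"   "g1.1" "g2"

# Names on a plain vector carry no uniqueness requirement,
# so repeated elements keep their original names
x <- c(g1 = 289L, g2 = 476L, g3 = 608L)
x_rep <- x[c(1, 1, 2)]
names(x_rep)
# [1] "g1" "g1" "g2"
```

Keeping the names on the object rather than on the metadata columns
sidesteps the renaming shown in the first case.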

Hope this makes sense,
H.



## Coercing to ExpressionSet makes me lose the rowData/fData
eset <- as(se, "ExpressionSet")
fData(eset)
data frame with 0 columns and 200 rows


## So where is the problem?
## Apparently in the coercion
##    from SummarizedExperiment to RangedSummarizedExperiment
rse <- as(se, "RangedSummarizedExperiment")
rowData(rse)
DataFrame with 200 rows and 0 columns



--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
