Hi Felix,
Nice catch. This can actually be reproduced with just:
> example(SummarizedExperiment)
> metadata(se0) <- list(aa="aa")
> se0[1 , ] <- se0[1 , ]
> metadata(se0)
$aa
[1] "aa"
$aa
[1] "aa"
The culprit is this line:
ans_metadata <- c(metadata(x), metadata(value))
in the "[<-" method for SummarizedExperiment objects.
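As a side note on why a call like colData(se[, 1])$xyz <- "abc" reaches
this method at all: R evaluates a nested replacement by extracting
se[, 1], modifying that copy, and writing it back into se with [<-.
Below is a minimal base-R sketch of that mechanism. The "toy" class and
the `tag<-` function are made-up stand-ins for illustration and are not
part of SummarizedExperiment:

```r
# Log of which replacement methods get called.
calls <- character(0)

# Hypothetical toy class standing in for a SummarizedExperiment.
toy <- structure(list(1, 2, 3), class = "toy")

# Subsetting keeps the class, as se[, 1] does.
`[.toy` <- function(x, i) {
  structure(unclass(x)[i], class = "toy")
}

# Subset replacement: record the call, then write 'value' back into 'x'.
`[<-.toy` <- function(x, i, value) {
  calls <<- c(calls, "[<-")
  y <- unclass(x)
  y[i] <- unclass(value)
  structure(y, class = "toy")
}

# Hypothetical replacement function playing the role of `colData<-`.
`tag<-` <- function(x, value) {
  attr(x, "tag") <- value
  x
}

# Nested replacement: R extracts toy[1], applies `tag<-` to it,
# then writes the result back with `[<-`.
tag(toy[1]) <- "abc"

calls  # "[<-"
```

So even though only a colData-like setter appears in the user's code,
the write-back step goes through [<-, with the modified subset as
'value'.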
So it looks like it was a deliberate decision to have
[<- combine the metadata of 'x' and 'value'. The problem is that
this breaks the more-than-reasonable expectation that something
like x[i , j] <- x[i , j] should be a no-op.
I replaced the above line with:
ans_metadata <- metadata(x)
in SummarizedExperiment 1.9.5 (devel). With this change [<-
leaves metadata(x) intact and x[i , j] <- x[i , j] behaves like
a no-op:
https://github.com/Bioconductor/SummarizedExperiment/commit/e4fcb99c442e2f17b0ccddfb05df9f160e0bbe40
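To illustrate the difference between the two rules without needing the
package, here is a rough base-R sketch. The toy list object and the
helpers replace_subset() and subset_toy() are hypothetical stand-ins
for a SummarizedExperiment, its [<- method, and its [ method. They only
mimic how the metadata is handled:

```r
# Hypothetical toy object: a data vector plus a metadata list.
toy <- list(data = 1:6, metadata = list(meep = "meep"))

# Stand-in for x[i]: the subset carries the metadata of x along.
subset_toy <- function(x, i) {
  list(data = x$data[i], metadata = x$metadata)
}

# Stand-in for "[<-". combine = TRUE mimics the old rule
# c(metadata(x), metadata(value)); combine = FALSE mimics metadata(x).
replace_subset <- function(x, i, value, combine) {
  x$data[i] <- value$data
  x$metadata <- if (combine) c(x$metadata, value$metadata) else x$metadata
  x
}

old <- toy
new <- toy
old_lengths <- integer(0)
for (k in 1:3) {
  # Equivalent of x[1] <- x[1] under each rule.
  old <- replace_subset(old, 1, subset_toy(old, 1), combine = TRUE)
  new <- replace_subset(new, 1, subset_toy(new, 1), combine = FALSE)
  old_lengths <- c(old_lengths, length(old$metadata))
}

old_lengths          # 2 4 8
identical(new, toy)  # TRUE
```

Under the old rule the metadata doubles on every self-assignment, which
matches the 2, 4, 8 entries in the original report. Under the new rule
the object is unchanged.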
Will port to release soon.
Cheers,
H.
On 12/12/2017 01:05 AM, Felix Ernst wrote:
Hi all,
I ran into some odd behaviour with SummarizedExperiment objects in Bioc 3.6
and 3.7. I suppose it is a bug, but I might be wrong, since accessing parts
of a SummarizedExperiment object is not really straightforward. Any suggestions?
library(GenomicRanges)
library(SummarizedExperiment)
nrows <- 200; ncols <- 6
counts <- matrix(runif(nrows * ncols, 1, 1e4), nrows)
colnames(counts) <- LETTERS[1:6]
rownames(counts) <- 1:nrows
counts2 <- counts-floor(counts)
rowRanges <- GRanges(rep(c("chr1", "chr2"), c(50, 150)),
                     IRanges(floor(runif(200, 1e5, 1e6)), width=100),
                     strand=sample(c("+", "-"), 200, TRUE),
                     feature_id=sprintf("ID%03d", 1:200))
colData <- DataFrame(Treatment=rep(c("ChIP", "Input"), 3),
                     row.names=LETTERS[1:6])
se <- SummarizedExperiment(assays=list(counts=counts),
                           rowRanges=rowRanges,
                           colData=colData)
colData(se)$xyz <- rep("",ncol(se))
metadata(se) <- list("meep" = "meep")
str(metadata(se))
colData(se[, 1])$xyz <- "abc"
str(metadata(se))
The first str(metadata(se)) call shows a list of length 1 with the correct
data. The second call shows a list of length 2 with a duplicated entry, and
every further colData modification (replacing data) doubles the entries in
the metadata again.
str(metadata(se))
List of 1
$ meep: chr "meep"
colData(se[, 1])$xyz <- "abc"
str(metadata(se))
List of 2
$ meep: chr "meep"
$ meep: chr "meep"
colData(se[, 2])$xyz <- "abc"
str(metadata(se))
List of 4
$ meep: chr "meep"
$ meep: chr "meep"
$ meep: chr "meep"
$ meep: chr "meep"
colData(se[, 2])$xyz <- "abc"
str(metadata(se))
List of 8
$ meep: chr "meep"
$ meep: chr "meep"
$ meep: chr "meep"
$ meep: chr "meep"
$ meep: chr "meep"
$ meep: chr "meep"
$ meep: chr "meep"
$ meep: chr "meep"
Thanks for any advice and suggestions.
Felix
---
Felix Ernst, PhD
Université Libre de Bruxelles
RNA MOLECULAR BIOLOGY
BIOPARK Charleroi Brussels-South CAMPUS
Rue Profs Jeener & Brachet, 12
B-6041 Charleroi - Gosselies
BELGIUM
+32(2)650 9774 (office phone)
felix.er...@ulb.ac.be
_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpa...@fredhutch.org
Phone: (206) 667-5791
Fax: (206) 667-1319