On 05/07/2014 12:06 PM, Michael Love wrote:
hi,
Is there a way that I can change the names of the assays slot of a
SummarizedExperiment, without making a new copy of the data contained
within? Assume I get an SE which has already been constructed, but with no
names on the assays() SimpleList.
Hi Mike --
names(assays(se)) = "counts"
extracts the assays from se, then applies the names to the SimpleList, then
re-assigns the SimpleList to the SummarizedExperiment. The memory copy (of big
data) is actually in the extraction assays(se)
m = matrix(0, 0, 0); tracemem(m)
[1] 0x3449b4e8
se = SummarizedExperiment(m)
a = assays(se)
tracemem[0x3449b4e8 - 0x34ef64f0]: lapply lapply lapply lapply endoapply
endoapply assays assays
which can actually be avoided by asking for the assays without their dimnames
a = assays(se, withDimnames=FALSE)
and from there
names(a) = "counts"
assays(se) = a
verifying that we haven't actually copied the matrix
.Internal(inspect(assays(se, withDimnames=FALSE)[[1]]))
@3449b4e8 14 REALSXP g0c0 [NAM(2),TR,ATT] (len=0, tl=0)
ATTRIB:
@3449b4b0 02 LISTSXP g0c0 []
TAG: @b9c778 01 SYMSXP g0c0 [LCK,gp=0x4000] dim (has value)
@3449a118 13 INTSXP g0c1 [NAM(2)] (len=2, tl=0) 0,0
.Internal(inspect(m))
@3449b4e8 14 REALSXP g0c0 [NAM(2),TR,ATT] (len=0, tl=0)
ATTRIB:
@3449b4b0 02 LISTSXP g0c0 []
TAG: @b9c778 01 SYMSXP g0c0 [LCK,gp=0x4000] dim (has value)
@3449a118 13 INTSXP g0c1 [NAM(2)] (len=2, tl=0) 0,0
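The three steps above can be wrapped into a small helper. This is only a sketch: `rename_assays` is a made-up name, and it assumes a current Bioconductor where the class and its accessors live in the SummarizedExperiment package (in the versions discussed in this thread they were in GenomicRanges). The example is guarded so that it is skipped when the package is not installed.

```r
# Hypothetical helper (not part of any package): rename the assays of a
# SummarizedExperiment via the withDimnames=FALSE accessors, which is the
# route described above for avoiding a copy of the assay data.
rename_assays <- function(se, new_names) {
  a <- SummarizedExperiment::assays(se, withDimnames = FALSE)  # extract
  names(a) <- new_names                                        # rename
  # re-assign, calling the replacement function explicitly
  SummarizedExperiment::`assays<-`(se, withDimnames = FALSE, value = a)
}

if (requireNamespace("SummarizedExperiment", quietly = TRUE)) {
  m <- matrix(1:20, ncol = 10)
  se <- SummarizedExperiment::SummarizedExperiment(list(m))  # unnamed assay
  se <- rename_assays(se, "counts")
  print(SummarizedExperiment::assayNames(se))
}
```

Whether a copy is really avoided can be checked as above, with tracemem() on the matrix or by comparing the addresses reported by .Internal(inspect()).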
One would hope (a) that I'd followed through on a previous promise to just apply
the dimnames up-front, so that there is no need to use withDimnames=FALSE to
avoid the copying (there might have been a price on the way in) and (b) that the
following would work
names(assays(se, withDimnames=FALSE)) = "counts"
it didn't
names(assays(se, withDimnames=FALSE)) = "counts"
Error in slot(x, nm) :
no slot of name "withDimnames" for this object of class "SummarizedExperiment"
but does in 1.17.13
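For anyone unsure why a nested replacement such as names(assays(se)) = "counts" involves an extraction at all, the extract / rename / re-assign sequence can be reproduced in base R. The accessor pair `payload()` / `payload<-()` below is a toy stand-in invented for this illustration, not anything from GenomicRanges.

```r
# Toy getter/setter pair (hypothetical), playing the role of assays() / assays<-()
payload <- function(x) x$payload
`payload<-` <- function(x, value) {
  x$payload <- value
  x
}

obj <- list(payload = list(matrix(0, 2, 2)))

# Nested replacement form
obj1 <- obj
names(payload(obj1)) <- "counts"

# The same thing written out as three explicit steps
obj2 <- obj
tmp <- payload(obj2)      # 1. extract (the getter runs here)
names(tmp) <- "counts"    # 2. apply the names to the extracted list
payload(obj2) <- tmp      # 3. re-assign to the container

identical(obj1, obj2)
```

Because the getter always runs in step 1, any work it does (such as attaching dimnames to each assay) is paid for even though only the names change.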
Martin
thanks,
Mike
library(GenomicRanges)
gc()
          used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 1291106 69.0    1710298  91.4  1590760  85.0
Vcells 1178619  9.0    1925843  14.7  1724123  13.2
m <- matrix(1:2e7, ncol=10)
gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells      129 69.0    1967602 105.1  1590760  85.0
Vcells 11178604 85.3   22482701 171.6 21178631 161.6
# made a ~75 Mb matrix
colnames(m) <- letters[1:10]
gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  1291149 69.0    1967602 105.1  1590760  85.0
Vcells 11178679 85.3   22482701 171.6 21179851 161.6
se <- SummarizedExperiment(m)
gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  1302603 69.6    1967602 105.1  1623929  86.8
Vcells 12189777 93.1   22482701 171.6 21179851 161.6
# so far no copying
names(assays(se)) <- "counts"
gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  1303174  69.6    1967602 105.1  1623929  86.8
Vcells 22190847 169.4   23686836 180.8 22203423 169.4
# last step made a copy
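The before/after gc() comparison used above can be made less manual with a small wrapper. This is a rough sketch in base R only; `vcells_growth_mb` is a hypothetical helper name, and the numbers it reports are as coarse as the gc() table itself (one decimal place, and affected by anything else allocated in between).

```r
# Hypothetical helper (not part of any package): evaluate `expr` and report
# the change in "used" Vcells memory, in Mb, as seen by gc().
vcells_growth_mb <- function(expr) {
  before <- gc()["Vcells", 2]   # column 2 is the "(Mb)" next to "used"
  force(expr)                   # evaluated in the caller's environment
  after <- gc()["Vcells", 2]
  unname(after - before)
}

# Allocating 1e6 doubles (8 bytes each) should show up as several Mb.
growth <- vcells_growth_mb(big <- numeric(1e6))
growth
```

Wrapping the renaming step from this session in the same helper would give a single number for the copy, rather than two tables to compare by eye.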
sessionInfo()
R Under development (unstable) (2014-05-07 r65539)
Platform: x86_64-apple-darwin12.5.0 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] GenomicRanges_1.17.12 GenomeInfoDb_1.1.3 IRanges_1.99.13
[4] S4Vectors_0.0.6 BiocGenerics_0.11.2
loaded via a namespace (and not attached):
[1] RCurl_1.95-4.1 stats4_3.2.0 XVector_0.5.6
___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793