See the information on the compact() method in the XStringSet-class help
page (package Biostrings).

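To make sense of this, think about the difference between a view and a
concrete instance. Subsetting gives you a view onto the original sequence
data rather than a copy, so the subset still references the full underlying
data. Typically you do not want a copy to be made for each view.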
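
A minimal sketch of what I mean, assuming a current Biostrings where
compact() is available once the package is attached. The random reads are
a made-up stand-in for the sequences returned by scanBam(), and the names
reads, one and one_compact are only for illustration:

```r
library(Biostrings)

# Hypothetical stand-in for bam[[1]]$seq: 10000 random 100-letter reads.
set.seed(1)
reads <- DNAStringSet(vapply(seq_len(10000), function(i) {
    paste(sample(c("A", "C", "G", "T"), 100, replace = TRUE), collapse = "")
}, character(1)))

# [[ returns a view: the DNAString still points into the shared data.
one <- reads[[5]]
print(object.size(one), units = "Kb")

# compact() copies only the letters the view covers into a fresh object.
one_compact <- compact(one)
print(object.size(one_compact), units = "Kb")

# The sequence itself is unchanged by compaction.
identical(as.character(one), as.character(one_compact))
```

Compacting before save() or saveRDS() should also keep the file on disk
small, since only the compacted data gets serialized.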

On Thu, May 27, 2010 at 10:21 AM, Hans-Ulrich Klein <[email protected]> wrote:

> Hi all,
>
> I observed that some DNAString (and also DNAStringSet) objects are too
> large after subsetting:
>
> > library("Rsamtools")
> > parameters = ScanBamParam()
> > bam = scanBam("data/N01.bam", param=parameters)
> > ss = bam[[1]]$seq
> > ss
>  A DNAStringSet instance of length 230980
>  [...]
> > print(object.size(ss), units="Mb")
> 83.3 Mb
> > dnaS = ss[[5]]
> > dnaS
>  128-letter "DNAString" instance
> seq:
> TAGCGTGGATACAGAGGGACATCTATTGACCAGCTA...AAAGTTGTGCTTTATTTGATGAATAAGTATTGAACA
> > print(object.size(dnaS), units="Mb")
> 80.7 Mb
> > print(object.size(as.character(dnaS)), units="Kb")
> 0.2 Kb
>
> When I write the 128-letter DNAString to disk, it remains quite large
> (~20 Mb).
>
> Best wishes,
> Hans-Ulrich
>
>
>
>
> > sessionInfo()
> R version 2.11.0 (2010-04-22)
> x86_64-pc-linux-gnu
>
> locale:
> [1] C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] Rsamtools_1.0.1     Biostrings_2.16.2   GenomicRanges_1.0.1
> [4] IRanges_1.6.4
>
> loaded via a namespace (and not attached):
> [1] Biobase_2.8.0
>
>
> --
> Hans-Ulrich Klein
> Department of Medical Informatics and Biomathematics
> University of Münster
> Domagkstrasse 9
> 48149 Münster, Germany
> Tel.: +49 (0)251 83-58405
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> [email protected]
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>
