See the documentation of the compact() method on the XStringSet-class help page (package Biostrings, ?"XStringSet-class").
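A minimal sketch of what this looks like in practice. It assumes a current Biostrings installation in which compact() is available once the package is attached; the `ss` object below is a small synthetic stand-in I made up for illustration, not the BAM data from the original message, and the exact sizes reported will depend on your R and Biostrings versions.

```r
library(Biostrings)

# Hypothetical stand-in for the reads: 1000 random 128-letter sequences
set.seed(1)
random_read <- function(i) {
    paste(sample(c("A", "C", "G", "T"), 128, replace = TRUE), collapse = "")
}
ss <- DNAStringSet(vapply(1:1000, random_read, character(1)))

# Extracting one element gives a DNAString that may still reference
# the memory shared with the full set
dnaS <- ss[[5]]
print(object.size(dnaS), units = "Kb")

# compact() is meant to return an equivalent object that only keeps
# the data it actually needs
dnaS2 <- compact(dnaS)
print(object.size(dnaS2), units = "Kb")

# The sequence itself is unchanged
identical(as.character(dnaS), as.character(dnaS2))
```

Compacting before save() should likewise avoid writing the unused part of the original data to disk.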
To rationalize this, you need to think about the difference between a view and a concrete instance. The subset is a view that still references the data of the full object; typically you do not want a copy to be made for each view.

On Thu, May 27, 2010 at 10:21 AM, Hans-Ulrich Klein <[email protected]> wrote:

> Hi all,
>
> I observed that some DNAString (and also DNAStringSet) objects are too
> large after subsetting:
>
> > library("Rsamtools")
> > parameters = ScanBamParam()
> > bam = scanBam("data/N01.bam", param=parameters)
> > ss = bam[[1]]$seq
> > ss
>   A DNAStringSet instance of length 230980
> [...]
> > print(object.size(ss), units="Mb")
> 83.3 Mb
> > dnaS = ss[[5]]
> > dnaS
>   128-letter "DNAString" instance
> seq: TAGCGTGGATACAGAGGGACATCTATTGACCAGCTA...AAAGTTGTGCTTTATTTGATGAATAAGTATTGAACA
> > print(object.size(dnaS), units="Mb")
> 80.7 Mb
> > print(object.size(as.character(dnaS)), units="Kb")
> 0.2 Kb
>
> When I write the 128-letter DNAString to disk, it remains quite large
> (~20Mb).
>
> Best wishes,
> Hans-Ulrich
>
>
> > sessionInfo()
> R version 2.11.0 (2010-04-22)
> x86_64-pc-linux-gnu
>
> locale:
> [1] C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] Rsamtools_1.0.1     Biostrings_2.16.2   GenomicRanges_1.0.1
> [4] IRanges_1.6.4
>
> loaded via a namespace (and not attached):
> [1] Biobase_2.8.0
>
> --
> Hans-Ulrich Klein
> Department of Medical Informatics and Biomathematics
> University of Münster
> Domagkstrasse 9
> 48149 Münster, Germany
> Tel.: +49 (0)251 83-58405
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> [email protected]
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
