Hi all,

I observed that some DNAString (and also DNAStringSet) objects are too large after subsetting:

> library("Rsamtools")
> parameters = ScanBamParam()
> bam = scanBam("data/N01.bam", param=parameters)
> ss = bam[[1]]$seq
> ss
  A DNAStringSet instance of length 230980
  [...]
> print(object.size(ss), units="Mb")
83.3 Mb
> dnaS = ss[[5]]
> dnaS
  128-letter "DNAString" instance
seq: TAGCGTGGATACAGAGGGACATCTATTGACCAGCTA...AAAGTTGTGCTTTATTTGATGAATAAGTATTGAACA
> print(object.size(dnaS), units="Mb")
80.7 Mb
> print(object.size(as.character(dnaS)), units="Kb")
0.2 Kb

When I write the 128-letter DNAString to disk, it remains quite large (~20 Mb).
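
One possible workaround is sketched below. It assumes (not confirmed here) that the extracted DNAString still references the sequence data of the whole DNAStringSet, so that building a new object from the plain character string yields an independent, small copy. The three short sequences are hypothetical stand-ins for the reads returned by scanBam(), and dnaS_small is just an illustrative name.

```r
library(Biostrings)

# Hypothetical stand-in for the reads in bam[[1]]$seq
ss <- DNAStringSet(c("ACGTACGT", "TTGACCA", "GGGCCCAAATTT"))

# Extract a single sequence, as in the transcript above
dnaS <- ss[[2]]

# Round-trip through a character string to build a fresh DNAString
# (assumption: this drops any reference to the data of the full set)
dnaS_small <- DNAString(as.character(dnaS))

# Compare the reported sizes of the two objects
print(object.size(dnaS), units = "Kb")
print(object.size(dnaS_small), units = "Kb")
```

With the real data, the same round trip would be applied to ss[[5]] before saving it; whether the reported size actually drops may depend on the Biostrings version.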

Best wishes,
Hans-Ulrich




> sessionInfo()
R version 2.11.0 (2010-04-22)
x86_64-pc-linux-gnu

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] Rsamtools_1.0.1     Biostrings_2.16.2   GenomicRanges_1.0.1
[4] IRanges_1.6.4

loaded via a namespace (and not attached):
[1] Biobase_2.8.0


--
Hans-Ulrich Klein
Department of Medical Informatics and Biomathematics
University of Münster
Domagkstrasse 9
48149 Münster, Germany
Tel.: +49 (0)251 83-58405

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
