Thank you! That helped me a lot.

A further question: Is there any way to access the complete DNAStringSet "dnaS" after I removed it using the rm() function? If not, keeping the complete DNAStringSet in memory does not make much sense to me.

Thank you,
Hans-Ulrich



Vincent Carey wrote:
See the information on the compact() method in the XStringSet-class help page (package:Biostrings R Documentation).

To rationalize this, you need to think about the difference between a view and a concrete instance. Typically you do not want a copy of the underlying data to be made for each view.
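
For illustration, here is a minimal sketch of the compact() approach mentioned above. The reads are randomly generated stand-ins (an assumption, not the BAM data from the original post), and the helper random_read() is hypothetical. compact() is assumed to be available once Biostrings is loaded; in recent Bioconductor releases it is provided by the XVector package, which Biostrings attaches. Reported sizes depend on the R and Bioconductor versions, so none are shown here.

```r
library(Biostrings)

set.seed(1)

# Hypothetical helper: build one random read of n letters.
random_read <- function(n) {
  paste(sample(c("A", "C", "G", "T"), n, replace = TRUE), collapse = "")
}

# Stand-in for bam[[1]]$seq: 1000 random 128-letter reads.
ss <- DNAStringSet(vapply(rep(128L, 1000L), random_read, character(1)))

# Extracting one element yields an object that still refers to the
# storage shared with 'ss'.
dnaS <- ss[[5]]

# compact() returns an equivalent object whose underlying storage is
# reduced to what that object needs.
dnaS_compact <- compact(dnaS)

# The sequence itself is unchanged.
stopifnot(identical(as.character(dnaS), as.character(dnaS_compact)))

# Size in bytes of each object when serialized.
length(serialize(dnaS, NULL))
length(serialize(dnaS_compact, NULL))
```

Applied to the session quoted below, this would amount to something like dnaS <- compact(ss[[5]]) before saving the object.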

On Thu, May 27, 2010 at 10:21 AM, Hans-Ulrich Klein <[email protected]> wrote:

    Hi all,

    I observed that some DNAString (and also DNAStringSet) objects
    are too large after subsetting:

    > library("Rsamtools")
    > parameters = ScanBamParam()
    > bam = scanBam("data/N01.bam", param=parameters)
    > ss = bam[[1]]$seq
    > ss
     A DNAStringSet instance of length 230980
     [...]
    > print(object.size(ss), units="Mb")
    83.3 Mb
    > dnaS = ss[[5]]
    > dnaS
     128-letter "DNAString" instance
    seq:
    TAGCGTGGATACAGAGGGACATCTATTGACCAGCTA...AAAGTTGTGCTTTATTTGATGAATAAGTATTGAACA
    > print(object.size(dnaS), units="Mb")
    80.7 Mb
    > print(object.size(as.character(dnaS)), units="Kb")
    0.2 Kb

    When I write the 128-letter DNAString to disk, it remains quite
    large (~ 20Mb).

    Best wishes,
    Hans-Ulrich




    > sessionInfo()
    R version 2.11.0 (2010-04-22)
    x86_64-pc-linux-gnu

    locale:
    [1] C

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base

    other attached packages:
    [1] Rsamtools_1.0.1     Biostrings_2.16.2   GenomicRanges_1.0.1
    [4] IRanges_1.6.4

    loaded via a namespace (and not attached):
    [1] Biobase_2.8.0


    --
    Hans-Ulrich Klein
    Department of Medical Informatics and Biomathematics
    University of Münster
    Domagkstrasse 9
    48149 Münster, Germany
    Tel.: +49 (0)251 83-58405

    _______________________________________________
    Bioc-sig-sequencing mailing list
    [email protected]
    https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing




--
Hans-Ulrich Klein
Department of Medical Informatics and Biomathematics
University of Münster
Domagkstrasse 9
48149 Münster, Germany
Tel.: +49 (0)251 83-58405
