Great Herve, thanks a lot!
Florian

--
On 11/7/12 3:06 AM, "Hervé Pagès" <hpa...@fhcrc.org> wrote:

>Hi Florian,
>
>I just removed the 'substitutionArray' slot from PairwiseAlignments
>objects in Biostrings 2.27.7. The slot didn't seem to be used/needed
>by any downstream method.
>
>   > packageVersion("Biostrings")
>   [1] ‘2.27.7’
>   > x <- "xxxabcdefghijklmnopqyyy"
>   > y <- "abcdhijkzzzzlmnpqr"
>   > pa <- pairwiseAlignment(x, y)
>   > slotNames(pa)
>   [1] "pattern"      "subject"      "type"         "score"        "gapOpening"
>   [6] "gapExtension"
>   > validObject(pa)
>   [1] TRUE
>   > object.size(pa)
>   35528 bytes
>
>... instead of 35308996 bytes! 3 orders of magnitude smaller :-)
>
>Cheers,
>H.
>
>
>On 11/05/2012 03:45 AM, Hahne, Florian wrote:
>> Indeed. I did not look that far into the implementation, it just seemed
>> odd to me that the objects got that inflated. scoreOnly is not really
>> that helpful if you want to deal with the actual alignments. The only
>> reasonable application I see for it is if you want to rank a bunch of
>> sequences by pairwise similarity. This gigantic memory footprint is
>> really breaking things once you start doing a lot of these pairwise
>> alignment operations in parallel. mclapply complains about not being
>> able to turn such large objects into a raw vector, and serializing to
>> disk quickly fills your hard drive. You also lose a lot of the time
>> gained by parallel processing just by writing and loading gigabytes of
>> data...
>> I don't know enough about the internals of the PairwiseAlignments
>> classes, but it seems that there must be a way to avoid having this
>> huge array as part of the object. As a quick and dirty fix for now I
>> just replaced the substitutionArray slot with an empty matrix and all
>> the downstream operations that I wanted to do still work. Would be
>> great if you could take a look into this, Herve.
>> Thanks,
>> Florian
>>
>
>--
>Hervé Pagès
>
>Program in Computational Biology
>Division of Public Health Sciences
>Fred Hutchinson Cancer Research Center
>1100 Fairview Ave. N, M1-B514
>P.O. Box 19024
>Seattle, WA 98109-1024
>
>E-mail: hpa...@fhcrc.org
>Phone:  (206) 667-5791
>Fax:    (206) 667-1319

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel