Hi Thomas,

This is fixed in release (Biostrings 2.14.8 / IRanges 1.4.8) and
devel (Biostrings 2.15.9 / IRanges 1.5.10).
In addition to the methods you reported below, I found a few more
methods that still did not support XStringSet objects with
a pool of length > 1: compact() and the coercion methods from one
XStringSet subtype (B/DNA/RNA/AA) to another.

The new versions of Biostrings / IRanges should become available
through biocLite() in the next 24 hours.
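
To check whether an installation already carries the fix, the version
numbers quoted above can be compared in R. This is only a sketch: has_fix()
and the two wrappers are hypothetical helpers, and the devel branch starting
points (2.15.0 and 1.5.0) are assumptions, not something stated in this thread.

```r
# Hypothetical helper: TRUE when `version` is at or past the fixed version
# of its branch. `devel_start` marks where the devel branch is assumed to begin.
has_fix <- function(version, release_fix, devel_start, devel_fix) {
  v <- package_version(version)
  in_fixed_release <- v >= release_fix && v < devel_start
  in_fixed_devel <- v >= devel_fix
  in_fixed_release || in_fixed_devel
}

# Fixed versions taken from this message; branch starts are assumed.
biostrings_ok <- function(v) has_fix(v, "2.14.8", "2.15.0", "2.15.9")
iranges_ok <- function(v) has_fix(v, "1.4.8", "1.5.0", "1.5.10")

biostrings_ok("2.14.1")  # FALSE: the version in the original report
biostrings_ok("2.14.8")  # TRUE
iranges_ok("1.5.9")      # FALSE: devel, but older than 1.5.10
```

On a real installation the argument would come from
as.character(packageVersion("Biostrings")) and
as.character(packageVersion("IRanges")).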

Cheers,
H.


Thomas Girke wrote:
Hi Hervé,

Thanks for the clarification. Right now this is just a slight inconvenience,
whereas the support for larger object sizes is a very welcome major improvement.

Thanks for doing this.

Thomas


On Mon, Nov 23, 2009 at 05:10:44PM -0800, Hervé Pagès wrote:
Hi Thomas,

The internals of the XStringSet container changed in BioC 2.5 for two
reasons. The first is to support bigger objects: an object can now hold
more than 2^31 letters in total, the limits being 2^31 letters per element
and 2^31 elements, very much like for standard character vectors. The
second is to support more efficient combining through c() or append(),
which is now achieved with no copying of the sequence data.

This change is why reverseComplement(), reverse(), complement() and
chartr() are currently broken on XStringSet objects that have gone
through combining. Most methods that operate on XStringSet objects were
adapted, except those 4 because of lack of time. I'm working on this
right now and will post again here when it's fixed. Thanks for the
reminder and sorry for the inconvenience.
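
As a rough illustration of why combining without copying can break methods
that expect a single block of sequence data, here is a toy model in base R.
It is not the Biostrings implementation: make_set(), combine_sets(),
get_seq() and the pool/ranges layout are all hypothetical names chosen for
this sketch.

```r
# Toy model only: a "set" is a pool of sequence buffers plus a table of
# ranges pointing into the pool. These are not Biostrings internals.
make_set <- function(seqs) {
  buffer <- paste(seqs, collapse = "")          # one buffer holds all letters
  widths <- nchar(seqs)
  starts <- cumsum(c(1L, head(widths, -1L)))    # 1-based start of each element
  list(pool = list(buffer),
       ranges = data.frame(group = 1L, start = starts, width = widths))
}

# Combining appends the pools and re-points the second set's ranges at
# their new pool positions; the buffers themselves are not rebuilt.
combine_sets <- function(a, b) {
  rb <- b$ranges
  rb$group <- rb$group + length(a$pool)
  list(pool = c(a$pool, b$pool), ranges = rbind(a$ranges, rb))
}

# Extract element i by following its range into the right buffer.
get_seq <- function(x, i) {
  r <- x$ranges[i, ]
  substr(x$pool[[r$group]], r$start, r$start + r$width - 1L)
}

s1 <- make_set(c("GCATATTAC", "AATCGATCC", "GCATATTAC"))
s2 <- make_set(c("CCGCATATTAC", "AAAATCGATCC", "GCATATAATAC"))
s3 <- combine_sets(s1, s2)

length(s3$pool)   # 2: code written for a single buffer no longer applies
get_seq(s3, 4)    # "CCGCATATTAC"
```

In this model, any function that reads only the first buffer works on s1 and
s2 but not on s3, which mirrors the kind of failure reported below.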

Cheers,
H.


Thomas Girke wrote:
Dear List,

Is there an explanation for the difference in behavior between XStringSet
objects that have gone through an append() or c() step and those
that haven't? I did not observe this problem in the previous R/BioC release.

Below is a simple example to reproduce this error.

Thanks in advance for your help.

Thomas

## Example
library(Biostrings)
dset1 <- DNAStringSet(c("GCATATTAC", "AATCGATCC", "GCATATTAC"))
dset2 <- DNAStringSet(c("CCGCATATTAC", "AAAATCGATCC", "GCATATAATAC"))
dset3 <- c(dset1, dset2) # using append() doesn't fix the problem
reverseComplement(dset3)
Error in .local(x, ...) : IRanges internal error: length(x) != 1

DNAStringSet(dset3, start=1, end=4)
Error in super(x) : Biostrings internal error: length(x...@pool) != 1

## The problem goes away by doing the following
dset3fix <- DNAStringSet(unlist(strsplit(toString(dset3), ", ")))
reverseComplement(dset3fix)
 A DNAStringSet instance of length 6
   width seq
[1]     9 GTAATATGC
[2]     9 GGATCGATT
[3]     9 GTAATATGC
[4]    11 GTAATATGCGG
[5]    11 GGATCGATTTT
[6]    11 GTATTATATGC


DNAStringSet(dset3fix, start=1, end=4)
 A DNAStringSet instance of length 6
   width seq
[1]     4 GCAT
[2]     4 AATC
[3]     4 GCAT
[4]     4 CCGC
[5]     4 AAAA
[6]     4 GCAT


sessionInfo()
R version 2.10.0 (2009-10-26)
x86_64-unknown-linux-gnu

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] Biostrings_2.14.1 IRanges_1.4.3

loaded via a namespace (and not attached):
[1] Biobase_2.6.0

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: [email protected]
Phone:  (206) 667-5791
Fax:    (206) 667-1319

