On 09/02/2016 12:25 PM, Hervé Pagès wrote:
Hi,
On 09/01/2016 12:00 AM, Dario Strbenac wrote:
Good day,
According to the documentation, I wouldn't think that substr or
strsplit would work on a BStringSet, but substr does.
> IDs
  A BStringSet instance of length 5
    width seq
[1]    61 D00626:168:C9CWMANXX:1:1105:1816:1998 1:N:0:TCCGGAGA+ATAGAGGC
[2]    61 D00626:168:C9CWMANXX:1:1105:2113:1989 1:N:0:TCCGGAGA+ATAGAGGC
[3]    61 D00626:168:C9CWMANXX:1:1105:2703:1986 1:N:0:TCCGGAGA+ATAGAGGC
[4]    61 D00626:168:C9CWMANXX:1:1105:3255:1979 1:N:0:TCCGGAGA+ATAGAGGC
[5]    61 D00626:168:C9CWMANXX:1:1105:4525:1995 1:N:0:TCCGGAGA+ATAGAGGC
> substr(IDs, 1, 37)
[1] "D00626:168:C9CWMANXX:1:1105:1816:1998"
[2] "D00626:168:C9CWMANXX:1:1105:2113:1989"
[3] "D00626:168:C9CWMANXX:1:1105:2703:1986"
[4] "D00626:168:C9CWMANXX:1:1105:3255:1979"
[5] "D00626:168:C9CWMANXX:1:1105:4525:1995"
> strsplit(IDs, ' ')
Error in strsplit(IDs, " ") : non-character argument
I think that either both of these functions should work or neither
should, to be consistent.
Why? Because they both have "str" in their name?
It sounds like you are expecting every string manipulation function
defined in base R to work on a BStringSet object. Well, that's not
the case and I don't think that's ever going to happen. Some of them
work and some of them don't. We can add more if needed (e.g. strsplit)
but there are things like the grep family that BStringSet objects will
probably never support.
If you need to strsplit() an XStringSet object, you can use this:
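  library(Biostrings)

  strsplitXStringSet <- function(x, split)
  {
      ## Find all the occurrences of 'split' in each sequence of 'x'.
      m <- vmatchPattern(split, x)
      ## The pieces to extract are the gaps between those occurrences.
      at <- gaps(IRangesList(start=start(m), end=end(m)),
                 start=1L, end=width(x))
      extractAt(x, at)
  }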
It's going to behave like strsplit(x, split, fixed=TRUE) except when
there is a match at the beginning or end of one of the sequences (in
which case strsplit() has questionable behavior). Also, unlike
strsplit(), strsplitXStringSet() doesn't support an empty split
pattern.
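
For reference, the base R behavior mentioned above can be checked on plain character vectors, without Biostrings. This is only a small sketch with made-up input strings: with fixed=TRUE, strsplit() keeps an empty leading token when the string starts with a match, drops the empty trailing token when it ends with one, and splits into single characters when the split pattern is empty.

```r
# Base R only; the input strings are made up for illustration.
lead  <- strsplit(" a b", " ", fixed=TRUE)[[1]]   # match at the beginning
trail <- strsplit("a b ", " ", fixed=TRUE)[[1]]   # match at the end
chars <- strsplit("abc", "", fixed=TRUE)[[1]]     # empty split pattern

print(lead)   # "" "a" "b"   (the empty leading token is kept)
print(trail)  # "a" "b"      (the empty trailing token is dropped)
print(chars)  # "a" "b" "c"
```
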
Another difference between strsplit() and strsplitXStringSet() arises
when some matches are adjacent or overlapping. This will be explained
in the man page of strsplit,XStringSet-method.
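
To make the adjacent/overlapping case concrete, here is what base strsplit() does on plain character vectors (made-up inputs again). Since strsplitXStringSet() extracts the gaps between all the matches, it presumably returns no empty piece in the first case and skips every overlapping match in the second, but that is an assumption until the man page spells it out.

```r
# Base R only; the input strings are made up for illustration.
adjacent    <- strsplit("a  b", " ", fixed=TRUE)[[1]]    # two adjacent matches
overlapping <- strsplit("xaaay", "aa", fixed=TRUE)[[1]]  # "aa" fits at 2-3 and at 3-4

print(adjacent)     # "a" "" "b"   (empty token between the adjacent matches)
print(overlapping)  # "x" "ay"     (only the leftmost match is consumed)
```
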
H.
Note that BStringSet objects have supported the reverse operation
for a while. See ?unstrsplit
I'll add strsplitXStringSet() to Biostrings, as the "strsplit" method
for XStringSet objects.
H.
--------------------------------------
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia
_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpa...@fredhutch.org
Phone: (206) 667-5791
Fax: (206) 667-1319