Patrick,
thank you very much for your quick and helpful answer!
Yes, using:
> align <- pairwiseAlignment(samp1,ref1)
> indel(subject(align))
I am close to getting what I'm looking for. My question now is which
commands are (or will be) available for mining an IRangesList object.
Above all, I'm interested in accessors that would correspond to:
> indel(subject(align))@elements
> subject(align)@ra...@start
> subject(align)@ra...@width # in fact, so far I can do without this one
(unless you think the @elements and @range slots won't change in the future ...)
With these elements I can now extract the exact nucleotides that
were inserted or deleted.
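The extraction step described above can be sketched in base R. This is only an illustration under stated assumptions: `extract_ranges()` is a hypothetical helper, not a Biostrings function, and the start and width values are copied by hand from the insertion reported for the second sample later in this thread (start 10, width 1). With an IRanges object those values would normally be read with the public `start()` and `width()` accessors rather than from slots. The sketch also assumes the reported positions index directly into the sample sequence, which holds for this single-insertion example where the alignment begins at position 1.

```r
# Hypothetical helper (not part of Biostrings): return the bases covered by
# each range, given plain integer start and width vectors.
extract_ranges <- function(seq, starts, widths) {
  stopifnot(length(starts) == length(widths))
  # No ranges means nothing to extract.
  if (length(starts) == 0L) return(character(0))
  ends <- starts + widths - 1L
  substring(seq, starts, ends)
}

# Values copied from the thread: sample 2 and its single insertion.
samp2 <- "GGGATACTTACACCAGCTCCCTGGC"
ins_start <- 10L
ins_width <- 1L

inserted <- extract_ranges(samp2, ins_start, ins_width)
print(inserted)
```

For sample 2 this returns the single inserted base. Alignments with several indels, or that start past position 1, may need their coordinates shifted before the same lookup applies.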
Wolfgang
Patrick Aboyoun wrote:
Wolfgang,
Below is code that retrieves the indel locations you are looking for.
I like your attempts at using indel, insertion, and deletion for
PairwiseAlignment objects and I'll add the methods for
PairwiseAlignment objects to BioC 2.5 (devel) shortly using the
conventions that I specify below.
> suppressMessages(library(Biostrings))
> ref1 <- DNAString("GGGATACTTCACCAGCTCCCTGGC") # my pattern
> samp1 <- DNAStringSet(c("GGGATACTACACCAGCTCCCTGGC","GGGATACTTACACCAGCTCCCTGGC","ATACTTCACCAGCTCCCTG"))
> # 1st has a mutation, 2nd has an insertion, the 3rd is simply shorter ...
>
> align <- pairwiseAlignment(samp1,ref1)
>
> nindel(align)
An object of class “InDel”
Slot "insertion":
Length WidthSum
[1,] 0 0
[2,] 1 1
[3,] 0 0
Slot "deletion":
Length WidthSum
[1,] 0 0
[2,] 0 0
[3,] 0 0
> deletions <- indel(pattern(align))
> deletions
CompressedIRangesList: 3 elements
> insertions <- indel(subject(align))
> insertions
CompressedIRangesList: 3 elements
> insertions[[2]]
IRanges instance:
start end width
[1] 10 10 1
> sessionInfo()
R version 2.10.0 Under development (unstable) (2009-06-28 r48863)
i386-apple-darwin9.7.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] Biostrings_2.13.26 IRanges_1.3.41
loaded via a namespace (and not attached):
[1] Biobase_2.5.4
Wolfgang Raffelsberger wrote:
Dear list,
previously I extracted indel information from sequences aligned with
the Biostrings function pairwiseAlignment() using my own code. That was
probably not the best approach, since the class
'PairwiseAlignedFixedSubject' has evolved and changed and my old code
no longer works. I am now trying to use the functions provided by the
package to access the details of indels (i.e. their location on the
pattern and, if possible, the indel sequence). However, I can't find a
function to extract this information, which (to the best of my
knowledge) is part of the alignment object.
## here is an example:
library(Biostrings)
ref1 <- DNAString("GGGATACTTCACCAGCTCCCTGGC") # my pattern
samp1 <- DNAStringSet(c("GGGATACTACACCAGCTCCCTGGC","GGGATACTTACACCAGCTCCCTGGC","ATACTTCACCAGCTCCCTG"))
# 1st has a mutation, 2nd has an insertion, the 3rd is simply shorter ...
align <- pairwiseAlignment(samp1,ref1)
nindel(align) # the insertion is counted properly, but I can't see at which
# nt position the indel was found (nor whether it's an insertion or a deletion)
indel(align) # Error in function (classes, fdef, mtable) unable to
# find an inherited method for function...
insertion(align) # Error in function (classes, fdef, mtable) unable
# to find an inherited method for function ...
deletion(align) # neither ...
?AlignedXStringSet # says under 'Accessor methods' that indel() exists ..
## ideally I'd be looking for something like
mismatchTable(align) # but addressing indels ...
## for completeness :
> sessionInfo()
R version 2.9.1 (2009-06-26)
i386-pc-mingw32
locale:
LC_COLLATE=French_France.1252;LC_CTYPE=French_France.1252;LC_MONETARY=French_France.1252;LC_NUMERIC=C;LC_TIME=French_France.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] ShortRead_1.2.1 lattice_0.17-25 BSgenome_1.12.3 Biostrings_2.12.7
IRanges_1.2.3
loaded via a namespace (and not attached):
[1] Biobase_2.4.1 grid_2.9.1 hwriter_1.1
Thanks in advance,
Wolfgang Raffelsberger
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Wolfgang Raffelsberger, PhD
Laboratoire de BioInformatique et Génomique Intégratives
CNRS UMR7104, IGBMC, 1 rue Laurent Fries, 67404 Illkirch Strasbourg,
France
Tel (+33) 388 65 3300 Fax (+33) 388 65 3276
wolfgang.raffelsberger (at) igbmc.fr
_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing