Patrick,

Thank you very much for your quick and helpful answer!

Yes, using:
> align <- pairwiseAlignment(samp1,ref1)
> indel(subject(align))

I'm close to getting what I'm looking for. My question now is: which commands are (or will be) available for mining an IRangesList object?
Most of all, I'm interested in whatever would correspond to getting:

> indel(subject(align))@elements
> subject(align)@ra...@start
> subject(align)@ra...@width   # in fact, so far I can do without this one

(unless you think that @elements and @range won't change in the future ...)
With these elements I can now extract the very nucleotides that were inserted or deleted.
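
A minimal sketch of slot-free access, assuming only that the IRanges package is installed. The list is rebuilt by hand to mirror the `insertions` output quoted further down in this thread (one range with start 10 and width 1 in the second element); it is a stand-in, not the result of a real alignment, and the variable names are illustrative.

```r
library(IRanges)

# Hand-built stand-in for indel(subject(align)): three alignments,
# only the second one carries a range (start 10, width 1).
insertions <- IRangesList(
    IRanges(),
    IRanges(start = 10, width = 1),
    IRanges()
)

# Number of ranges recorded for each alignment
n_per_alignment <- lengths(insertions)

# Public accessors instead of the @start / @width slots
second_start <- start(insertions[[2]])
second_width <- width(insertions[[2]])

# Flatten all ranges into a single IRanges, or into a data.frame
# with one row per range
flat <- unlist(insertions)
tab <- as.data.frame(insertions)
```

Relying on accessors such as `start()`, `end()` and `width()` rather than on slot names should keep the code working if the internal representation of the class changes.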

Wolfgang


Patrick Aboyoun wrote:
Wolfgang,
Below is code that retrieves the indel locations you are looking for. I like your attempts at using indel, insertion, and deletion for PairwiseAlignment objects and I'll add the methods for PairwiseAlignment objects to BioC 2.5 (devel) shortly using the conventions that I specify below.

> suppressMessages(library(Biostrings))
> ref1 <- DNAString("GGGATACTTCACCAGCTCCCTGGC") # my pattern
> samp1 <- DNAStringSet(c("GGGATACTACACCAGCTCCCTGGC","GGGATACTTACACCAGCTCCCTGGC","ATACTTCACCAGCTCCCTG"))
> # 1st has a mutation, 2nd has an insertion, the 3rd is simply shorter ...
>
> align <- pairwiseAlignment(samp1,ref1)
>
> nindel(align)
An object of class “InDel”
Slot "insertion":
     Length WidthSum
[1,]      0        0
[2,]      1        1
[3,]      0        0

Slot "deletion":
     Length WidthSum
[1,]      0        0
[2,]      0        0
[3,]      0        0

> deletions <- indel(pattern(align))
> deletions
CompressedIRangesList: 3 elements
> insertions <- indel(subject(align))
> insertions
CompressedIRangesList: 3 elements
> insertions[[2]]
IRanges instance:
    start end width
[1]    10  10     1
> sessionInfo()
R version 2.10.0 Under development (unstable) (2009-06-28 r48863)
i386-apple-darwin9.7.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] Biostrings_2.13.26 IRanges_1.3.41

loaded via a namespace (and not attached):
[1] Biobase_2.5.4


Wolfgang Raffelsberger wrote:
Dear list,

Previously I've been extracting indel information from sequences aligned by the Biostrings function pairwiseAlignment() by accessing the object's internals directly. That is probably not the best way, since the class 'PairwiseAlignedFixedSubject' has evolved and changed, and my old code no longer works. I am now trying to use the functions provided by the library to access the details about indels (i.e. their location on the pattern and, if possible, the indel sequence). However, I can't find a function to extract this information, which is (to the best of my knowledge) part of the aligned object.

## here an example :
library(Biostrings)
ref1 <- DNAString("GGGATACTTCACCAGCTCCCTGGC") # my pattern
samp1 <- DNAStringSet(c("GGGATACTACACCAGCTCCCTGGC","GGGATACTTACACCAGCTCCCTGGC","ATACTTCACCAGCTCCCTG")) # 1st has a mutation, 2nd has an insertion, the 3rd is simply shorter ...

align <- pairwiseAlignment(samp1,ref1)

nindel(align) # insertion was found properly, but I can't see at which nt position the indel was found (nor whether it's an insertion or a deletion)
indel(align) # Error in function (classes, fdef, mtable) unable to find an inherited method for function...
insertion(align) # Error in function (classes, fdef, mtable) unable to find an inherited method for function ...
deletion(align) # neither ...
?AlignedXStringSet # says under 'Accessor methods' that indel() exists ..

## ideally I'd be looking for something like
mismatchTable(align) # but addressing indels ...
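
One possible shape for such a table, sketched with the IRanges package only. `indelTable()` is a hypothetical helper, not a Biostrings function, and its column names are made up for illustration; the two input lists are built by hand to match the example above (a single insertion of width 1 at position 10 in the second sequence, no deletions).

```r
library(IRanges)

# Hypothetical helper (not part of Biostrings): stack two IRangesList
# objects into one data.frame, tagging each row with its indel type.
indelTable <- function(insertions, deletions) {
    one <- function(x, type) {
        flat <- unlist(x, use.names = FALSE)
        data.frame(
            AlignmentId = rep(seq_along(x), unname(lengths(x))),
            Type = rep(type, length(flat)),
            Start = start(flat),
            End = end(flat),
            Width = width(flat),
            stringsAsFactors = FALSE
        )
    }
    rbind(one(insertions, "insertion"), one(deletions, "deletion"))
}

# Stand-in inputs: one range list per indel type, one element per alignment
ins <- IRangesList(IRanges(), IRanges(start = 10, width = 1), IRanges())
del <- IRangesList(IRanges(), IRanges(), IRanges())

indel_tab <- indelTable(ins, del)
```

With real data, the two arguments would be whatever range lists the alignment accessors return for insertions and deletions; which coordinate system those ranges use should be checked against the package documentation.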


## for completeness :
> sessionInfo()
R version 2.9.1 (2009-06-26)
i386-pc-mingw32

locale:
LC_COLLATE=French_France.1252;LC_CTYPE=French_France.1252;LC_MONETARY=French_France.1252;LC_NUMERIC=C;LC_TIME=French_France.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] ShortRead_1.2.1 lattice_0.17-25 BSgenome_1.12.3 Biostrings_2.12.7 IRanges_1.2.3
loaded via a namespace (and not attached):
[1] Biobase_2.4.1 grid_2.9.1 hwriter_1.1

Thanks in advance,
Wolfgang Raffelsberger

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Wolfgang Raffelsberger, PhD
Laboratoire de BioInformatique et Génomique Intégratives
CNRS UMR7104, IGBMC, 1 rue Laurent Fries, 67404 Illkirch Strasbourg, France
Tel (+33) 388 65 3300 Fax (+33) 388 65 3276
wolfgang.raffelsberger (at) igbmc.fr

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing





--
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Wolfgang Raffelsberger, PhD
Laboratoire de BioInformatique et Génomique Intégratives
CNRS UMR7104, IGBMC, 1 rue Laurent Fries, 67404 Illkirch Strasbourg, France
Tel (+33) 388 65 3300         Fax (+33) 388 65 3276
http://www-bio3d-igbmc.u-strasbg.fr/~wraff
[email protected]
