Dear list, 

Thanks for pushing this issue forward. I will try the options you suggested over the next few days.

Cei, microRNAs are a good point regarding adapter removal.


Best,

Dave

> Date: Wed, 14 Jan 2009 16:17:17 -0800
> From: [email protected]
> To: [email protected]; [email protected]
> Subject: Re: [Bioc-sig-seq] adapter removal
> 
> I just checked in a trimLRPatterns function to the Bioconductor svn 
> repository for BioC 2.4. Its signature is
> 
>   trimLRPatterns(Lpattern = NULL, Rpattern = NULL, subject,
>                  max.Lmismatch = 0, max.Rmismatch = 0,
>                  with.Lindels = FALSE, with.Rindels = FALSE,
>                  Lfixed = TRUE, Rfixed = TRUE, ranges = FALSE)
> 
> As you can infer from the arguments, this function allows the user to 
> set the # of mismatches (if with.*indels = FALSE) / edit distance (if 
> with.*indels = TRUE) for the left and right flanking "patterns". It also 
> allows for IUPAC ambiguity letters in these flanking regions if *fixed = 
> FALSE. When ranges = FALSE, trimLRPatterns returns the trimmed strings. 
> When ranges = TRUE, it returns the ranges that you can use to trim the 
> strings. Here are some examples:
> 
>  >   Lpattern <- "TTCTGCTTG"
>  >   Rpattern <- "GATCGGAAG"
>  >   subject <- DNAString("TTCTGCTTGACGTGATCGGA")
>  >   subjectSet <- DNAStringSet(c("TGCTTGACGGCAGATCGG",
> +                               "TTCTGCTTGGATCGGAAG"))
>  >   trimLRPatterns(Lpattern = Lpattern, subject = subject)
>   11-letter "DNAString" instance
> seq: ACGTGATCGGA
>  >   trimLRPatterns(Lpattern = Lpattern, Rpattern = Rpattern,
> +                  subject = subjectSet)
>   A DNAStringSet instance of length 2
>     width seq
> [1]    18 TGCTTGACGGCAGATCGG
> [2]     0
>  >   trimLRPatterns(Lpattern = Lpattern, Rpattern = Rpattern,
> +                  subject = subjectSet,
> +                  ranges = TRUE)
> IRanges object:
>   start end width
> 1     1  18    18
> 2    10   9     0
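
A minimal sketch of the mismatch and IUPAC options described above. It is not part of Patrick's message. It assumes a Biostrings build that includes trimLRPatterns with the argument names from the signature quoted here (max.Lmismatch, Lfixed). The adapter variants "TTCTGCTAG" and "TTCTGNTTG" are made up for illustration, and the exact behaviour may differ between Biostrings versions.

```r
library(Biostrings)

subject <- DNAString("TTCTGCTTGACGTGATCGGA")

# Hypothetical adapter with one substitution (T -> A at position 8)
# relative to the first nine bases of the subject.
Lvariant <- "TTCTGCTAG"

# Default settings (max.Lmismatch = 0): the variant adapter does not
# match exactly, so the left flank should not be trimmed.
exact <- trimLRPatterns(Lpattern = Lvariant, subject = subject)

# Tolerate one mismatch when aligning the full left adapter.
one_mm <- trimLRPatterns(Lpattern = Lvariant, subject = subject,
                         max.Lmismatch = 1)

# Hypothetical adapter containing the IUPAC letter N; Lfixed = FALSE
# asks for ambiguity letters to be interpreted rather than matched
# literally.
iupac <- trimLRPatterns(Lpattern = "TTCTGNTTG", subject = subject,
                        Lfixed = FALSE)

print(exact)
print(one_mm)
print(iupac)
```

Comparing the three printed results on your own installation is a quick way to check how strict the flank matching is before running it on real reads.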
> 
> This functionality will be available on bioconductor.org (and 
> downloadable via biocLite) in the next day or so, but you can also grab 
> Biostrings from svn directly if you need it sooner. It will also feed 
> its way into Biostrings documentation and training material before the 
> next release of Bioconductor in May.
> 
> 
> Patrick
> 
> 
> 
> Patrick Aboyoun wrote:
> > David,
> > Following up on Martin's comments, I am putting the finishing touches 
> > on a function called trimLRPatterns for the Biostrings package. Its 
> > purpose is to trim left and/or right flanking patterns from sequences, 
> > so it can strip 5' and/or 3' adapters from your reads. The signature 
> > for this function is
> >
> >  trimLRPatterns(Lpattern=NULL, Rpattern=NULL, subject,
> >                 max.Lnedit=0, max.Rnedit=0,
> >                 with.Lindels=FALSE, with.Rindels=FALSE,
> >                 Lfixed=TRUE, Rfixed=TRUE,
> >                 rangesOnly = FALSE)
> >
> > I will be checking this function into the BioC 2.4 code line, which 
> > requires using R-devel, sometime today or tomorrow. I will send out an 
> > e-mail to this group when I check it in and show a simple example of 
> > its usage. I talked with Martin and he will wrap this functionality in 
> > the ShortRead layer so you don't have to leave the ShortRead class 
> > system when removing adapters from your reads.
> >
> >
> > Cheers,
> > Patrick
> >
> 

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
