It sounded to me like he simply wanted the distribution of last bases,
which happen to be all T in your example.  We can extract and tabulate
them more directly like this:

# example of trimLRPatterns value:
> library(Biostrings)
> A = DNAStringSet(c("ACAT", "CAT", "AGGGCGT"))

> n = nchar(A)
> last = narrow(A, start=n, end=n)
> alphabetFrequency(last, baseOnly=TRUE, collapse=TRUE)
    A     C     G     T other
    0     0     0     3     0

On Aug 26, 2010, at 9:29 AM, Joern Toedling wrote:

Hi,

have a look at the "shift" argument of the function consensusMatrix from
Biostrings.

This code example should answer your question. Three nucleotide strings
are aligned at their last position and the per-position sequence
composition is computed; the last column of the result is the
composition of the right-most base:

library(Biostrings)
A <- DNAStringSet(c("ACAT", "CAT", "AGGGCGT"))
maxlen <- max(nchar(A))
# shift each read right so that all reads end at column maxlen
consensusMatrix(A, shift=maxlen-nchar(A), baseOnly=TRUE)

I tested this with Biostrings_2.17.29, but I guess that it works with the
current release version, too.
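
For anyone who wants to see what the right-alignment does without
Biostrings installed, here is a minimal base-R sketch of the same idea.
It is only an illustration, not how consensusMatrix is implemented: the
names reads, padded, chars and counts are made up for this example, and
padding with "-" is an assumption of the sketch.

```r
# Base-R illustration of right-aligning reads and counting bases per column.
# All object names here are illustrative; nothing below is Biostrings API.
reads  <- c("ACAT", "CAT", "AGGGCGT")
maxlen <- max(nchar(reads))

# Left-pad each read with "-" so that every read ends in column maxlen.
padded <- paste0(strrep("-", maxlen - nchar(reads)), reads)

# One row per read, one column per aligned position.
chars <- do.call(rbind, strsplit(padded, ""))

# Count A/C/G/T in each column; padding characters are not counted.
bases  <- c("A", "C", "G", "T")
counts <- vapply(
  seq_len(maxlen),
  function(j) as.vector(table(factor(chars[, j], levels = bases))),
  integer(4)
)
rownames(counts) <- bases

# Composition of the right-most base across all reads.
counts[, maxlen]
```

For the three example reads the last column contains only T (count 3),
which agrees with the alphabetFrequency result quoted above.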

Regards,
Joern


On Thu, 26 Aug 2010 07:35:01 -0500, joseph franklin wrote:
Hi,

I've been trimming adapters from reads using trimLRPatterns.  The
resulting trimmed set contains a heterogeneous mix of widths, from
~18-35 nt.  Can anyone guide me toward an elegant way to find the
nucleotide composition of the final (right-most) cycle for each of
the trimmed reads?

Many thanks,
Joe Franklin

---
Joern Toedling
Institut Curie -- U900
26 rue d'Ulm, 75005 Paris, FRANCE
Tel. +33 (0)156246927

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
