Hey Richard,

There is an ongoing discussion about which tag(s) to use for molecular
identifiers.  You can see the proposal here:
https://github.com/samtools/hts-specs/pull/119. I have CC'ed the original
submitter (yfarjoun).

In short, the recommendation is to use the *BC* tag for the sample barcode
and the *RX* tag for the molecular barcode.  You can then specify
"BARCODE_TAG=RX" in *MarkDuplicates* so that two reads are marked as
duplicates only if the values in their *RX* tags match, in addition to the
other criteria.  Furthermore, I would recommend concatenating the two
hexamer sequences into a single tag, as long as they were attached at the
same time.  I add that condition because a number of technologies use
multiple molecular barcodes that are integrated at different points in the
sample and library preparation process, for example a single-cell barcode
versus a unique-molecule barcode.  In that case there is no convention
defined yet, but you will likely want them stored in different tags so you
can treat them differently, if that is warranted.
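
To make that concrete, here is a rough Python sketch of the workflow as I
understand it from your description: trim six bases from each end of a
single-ended read, concatenate them into one RX value, and then group reads
on position plus RX.  The function names, the tuple layout, and the optional
delimiter are all made up for illustration.  This is not how Picard
implements duplicate marking; it only shows the criterion that
"BARCODE_TAG=RX" is meant to add.

```python
from collections import defaultdict


def extract_umi(seq, qual, umi_len=6, delimiter=""):
    """Trim umi_len bases from each end of a read and build an RX tag string.

    Hypothetical helper: umi_len=6 matches the hexamers described in the
    thread; the delimiter is an assumption and defaults to plain concatenation.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    if len(seq) <= 2 * umi_len:
        raise ValueError("read too short to carry two barcodes and an insert")
    five_prime = seq[:umi_len]
    three_prime = seq[-umi_len:]
    trimmed_seq = seq[umi_len:-umi_len]
    trimmed_qual = qual[umi_len:-umi_len]
    rx_tag = "RX:Z:" + five_prime + delimiter + three_prime
    return trimmed_seq, trimmed_qual, rx_tag


def group_by_position_and_rx(reads):
    """Group read names by (reference, position, strand, RX value).

    Each read is a (name, reference, position, strand, rx) tuple, a layout
    invented for this sketch. Groups with more than one name are candidate
    duplicate sets.
    """
    groups = defaultdict(list)
    for name, ref, pos, strand, rx in reads:
        groups[(ref, pos, strand, rx)].append(name)
    return dict(groups)


seq = "ACGTAC" + "TTTTGGGGCCCC" + "GATTAC"
qual = "I" * len(seq)
trimmed_seq, trimmed_qual, rx_tag = extract_umi(seq, qual)

reads = [
    ("read1", "chr1", 100, "+", "ACGTACGATTAC"),
    ("read2", "chr1", 100, "+", "ACGTACGATTAC"),  # same position, same RX
    ("read3", "chr1", 100, "+", "TTTTTTGGGGGG"),  # same position, different RX
]
groups = group_by_position_and_rx(reads)
```

Here read1 and read2 land in the same group, while read3 stays on its own
even though it aligns to the same position.  If you would rather keep the
two hexamers visually separate inside the one tag, pass a delimiter to
extract_umi.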

Sincerely,

Nils Homer

On Tue, Jun 28, 2016 at 5:38 PM, Richard Corbett <rcorb...@bcgsc.ca> wrote:

> Hi all,
>
> We are working on an application in our wet lab where we introduce random
> hexamers into our adapters to allow us to differentiate between PCR
> duplicates and fragments that were present in multiple copies of starting
> material.      The workflow for processing the sequence is that we trim the
> first and last six bases off of our single-ended reads and use those bases
> as our "FP" or "fragment provenance".   When we want to mark the duplicates
> we only want to mark reads as duplicates if they align to the same position
> AND they have the same FP.   Reads aligning to the same position with
> different FPs should not be marked as duplicates.
>
> If I'm not mistaken, I now see in the Picard docs that I can supply a
> BARCODE_TAG to MarkDuplicates to mark the duplicates as I describe above.
> My question to the group is if it is more appropriate to use the BC tag, or
> to use some custom tags to hold the hexamer sequences associated with each
> read.   I expect that the BC tag should be reserved for the classical
> sample multiplexing application (many libraries in a pool).
>
> If you are still reading I have another question for you:
> -Since our reads are single-ended and we have two separate hexamer
> sequences (one from each end of the read)...would you recommend
> concatenating them into one sequence (perhaps with a delimiter) and keeping
> them in one tag, or separating them into two tags?  Would there be any
> issues in Picard if we have single ended sequencing but supply two
> different barcode-like tags?
>
> thanks,
> Richard
>
>
> --
> The contents of this electronic mail transmission are intended to be 
> CONFIDENTIAL and for the sole use of the designated recipient. If this 
> message has been misdirected, please contact the sender as soon as possible.
>
>
>
> ------------------------------------------------------------------------------
> Attend Shape: An AT&T Tech Expo July 15-16. Meet us at AT&T Park in San
> Francisco, CA to explore cutting-edge tech and listen to tech luminaries
> present their vision of the future. This family event has something for
> everyone, including kids. Get more information and register today.
> http://sdm.link/attshape
> _______________________________________________
> Samtools-help mailing list
> Samtools-help@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/samtools-help
>
>