Hi all,

Our usual read name format in a BAM file is something like:

    HS13_248:4:2111:2846:54933
But we have just been asked to analyse some data where the read names in the BAM have this format:

    HWI-ST909_0086:3:1101:19761:56275#CGATGT/2/

I notice that while running MarkDuplicates (version 1.122), the progress messages suggest that Picard can't identify the pairs because of the terminal /1/ and /2/ in the read names:

    ...
    INFO 2014-10-17 11:55:18 MarkDuplicates Read 114,000,000 records. Elapsed time: 00:21:07s. Time for last 1,000,000: 12s. Last read position: GL000212.1:39,902
    INFO 2014-10-17 11:55:18 MarkDuplicates Tracking 112566758 as yet unmatched pairs. 2570 records in RAM.
    ...

However, at the end of the run, some reads are still properly marked as duplicates.

Is it OK to ignore the warnings about unmatched pairs, or should we go back and edit the read names in the FASTQ files to ensure our duplicate marking is correct?

Thanks for all your help,
Richard
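For concreteness, the kind of read-name edit asked about above could be sketched as follows. This is only an illustration under stated assumptions: it assumes the mate marker is always a trailing /1 or /2 (optionally followed by one more /, as in the example name), and `split_mate_suffix` is a hypothetical helper, not part of Picard or samtools. It does not show whether such an edit is actually required for correct duplicate marking.

```python
import re

# Assumed mate-suffix pattern: "/1" or "/2" at the end of the read name,
# optionally followed by one more "/" as in the example name above.
MATE_SUFFIX = re.compile(r"/([12])/?$")


def split_mate_suffix(read_name):
    """Return (base_name, mate), where mate is "1", "2", or None.

    Hypothetical helper for illustration only.
    """
    match = MATE_SUFFIX.search(read_name)
    if match is None:
        # No mate suffix present: the name is returned unchanged.
        return read_name, None
    return read_name[:match.start()], match.group(1)


if __name__ == "__main__":
    for name in (
        "HS13_248:4:2111:2846:54933",
        "HWI-ST909_0086:3:1101:19761:56275#CGATGT/1/",
        "HWI-ST909_0086:3:1101:19761:56275#CGATGT/2/",
    ):
        print(split_mate_suffix(name))
```

With the suffix removed, the two mates of a pair share the same base name, while names in the usual format pass through untouched.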