Hello Ola,

We've determined that your BAM file has quite a few errors in the
alignments on the negative strand. As part of our work as the ENCODE
Data Coordination Center, we've received thousands of BAM files, which
we validate with a program called validateFiles. This program checks
each alignment in every BAM to make sure that it has no more
mismatches than expected.
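
For illustration only, here is a rough Python sketch of this kind of
per-alignment mismatch check. It is not the validateFiles
implementation; the function name and the way the reference is passed
in are assumptions made for the example. It walks the CIGAR string and
compares only the bases in aligned (M/=/X) segments, skipping
soft-clipped bases. The sample values are the aligned bases and genomic
sequence shown in the quoted message below; the 63 clipped bases are
not shown there, so a placeholder stands in for them.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def count_mismatches(read_seq, cigar, ref_seq):
    """Count mismatched bases in the aligned segments of a read.

    ref_seq is assumed to be the reference sequence starting at the
    alignment's reported position (hypothetical calling convention).
    """
    q = 0  # offset into the read
    r = 0  # offset into the reference
    mismatches = 0
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op in "M=X":      # consumes read and reference
            for i in range(n):
                if read_seq[q + i].upper() != ref_seq[r + i].upper():
                    mismatches += 1
            q += n
            r += n
        elif op in "IS":     # insertion / soft clip: read only
            q += n
        elif op in "DN":     # deletion / skipped region: reference only
            r += n
        # H and P consume neither read nor reference
    return mismatches

# Aligned part of the read from the quoted message; the 63 soft-clipped
# bases are unknown here, so "N" is used purely as a placeholder.
read = "N" * 63 + "GGAGGCTGAGGCACGAGAATCAATTGAACCTGGGAGG"
ref = "atctctactaaaaatacaaaaaattagccaggcgtgg"
example_mismatches = count_mismatches(read, "63S37M", ref)
print(example_mismatches, example_mismatches > 12)
```

With a limit of 12, an alignment like this one would be reported as
exceeding the allowed number of mismatches.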

I ran this program on your BAM with flags allowing up to 12 mismatches.
It finds 515,226 alignments that exceed this limit, *all* of which are
on the negative strand, and almost all of which use the 'S' (soft clip)
operation in the CIGAR string.
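
If you want to make the same kind of breakdown yourself, a small
Python sketch along these lines could tally alignments by strand and by
whether the CIGAR contains 'S'. It assumes the alignments of interest
are available as SAM text lines (tab-separated, FLAG in column 2 and
CIGAR in column 6); the function name is made up for this example, and
apart from the first record, which reuses the flag, position and CIGAR
from your quoted read, the sample records are invented.

```python
from collections import Counter

def tally_by_strand_and_clip(sam_lines):
    """Count alignments keyed by (strand, has_soft_clip)."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        cigar = fields[5]
        strand = "-" if flag & 0x10 else "+"  # 0x10: reverse strand
        counts[(strand, "S" in cigar)] += 1
    return counts

# First record is based on the read in the quoted message (flag 0x93 = 147);
# the other two are invented purely for illustration.
sample = [
    "@SQ\tSN:chr1\tLN:247249719",
    "HWI-ST344_0091:6:1204:11502:145261#CTTGTA\t147\tchr1\t31021473\t60\t63S37M\t*\t0\t0\t*\t*",
    "made_up_read_1\t99\tchr1\t1000\t60\t100M\t*\t0\t0\t*\t*",
    "made_up_read_2\t83\tchr1\t2000\t60\t40S60M\t*\t0\t0\t*\t*",
]
tally = tally_by_strand_and_clip(sample)
for key, n in sorted(tally.items()):
    print(key, n)
```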

You can use this program yourself by downloading it from here:
http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/validateFiles

You'll also need the hg18.2bit file and chromosome sizes file which are here:
http://hgdownload.cse.ucsc.edu/goldenPath/hg18/bigZips/hg18.2bit
http://hgdownload.cse.ucsc.edu/goldenPath/hg18/database/chromInfo.txt.gz

Here is how I ran the program:
$ validateFiles -type=BAM -genome=hg18.2bit \
    -chromInfo=chromInfo.txt.gz *.bam -doReport -showBadAlign \
    -maxErrors=1000000 -mismatches=12 -nMatch

I hope this helps you track down where this error is being introduced.
Please respond to this list if you have further questions.

Brian

On Thu, Jun 30, 2011 at 1:07 AM, Ola Wallerman <[email protected]> wrote:
> Hi,
>
> I am using the genome browser to view BAM files which have been
> trimmed to remove overlapping read ends for Illumina PE reads. It
> appears the browser is not correctly placing the clipped reads, see
> the example below. The browser removes the clipped part of the read,
> but the remaining part is always positioned at the start of the
> alignment, meaning that reads on the reverse strand will be misplaced.
> Is this a bug or am I doing something wrong?
>
> Cheers,
>
> Ola
>
> Read name: HWI-ST344_0091:6:1204:11502:145261#CTTGTA
> Position: chr1:31021473-31021509
> Band: 1p35.2
> Genomic Size: 37
> Alignment Quality: 60
> CIGAR string: 63S37M (63 Skipped, 37 (mis)Match)
> Tags: AM:37 LB:cw19 MD:100 NM:0 RG:H3k27ac SM:37 XT:U X0:1 X1:0 XM:0 XO:0 XG:0
> Flags: 0x93:
>   (0x80) Read 2 of pair | (0x10) Read is on '-' strand | (0x03) Properly paired
> Note: although the read was mapped to the reverse strand of the
> genome, the sequence and CIGAR in BAM are relative to the forward
> strand.
>
> Alignment of HWI-ST344_0091:6:1204:11502:145261#CTTGTA to
> chr1:31021473-31021509:
>
> 00000064 GGAGGCTGAGGCACGAGAATCAATTGAACCTGGGAGG 00000100
> >>>>>>>>             |  | ||  ||||   |  | | || >>>>>>>>
> 31021473 atctctactaaaaatacaaaaaattagccaggcgtgg 31021509
>
>
>
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>
