Hi Asaf,

Sorry for the delayed reply. One of our engineers provided the following explanation of how inserts are calculated in the tracks/tables you're looking at:

--
This information is represented in the block structure of the PSLs (blockCount, blockSizes, qStarts, tStarts). The blocks themselves are ungapped, so one looks at what is between blocks. blockSizes, qStarts, and tStarts are all parallel arrays, with the coordinates being strand-specific. For native blat alignments, the target coordinates are always positive strand. There are implied end coordinates tEnd[i] = tStarts[i] + blockSizes[i], and similarly qEnd[i] = qStarts[i] + blockSizes[i].

When you have:
    tEnd[i-1] < tStarts[i]
    qEnd[i-1] == qStarts[i]
you have an insertion in the target (aka deletion in the query); possibly an intron.

When you have:
    tEnd[i-1] == tStarts[i]
    qEnd[i-1] < qStarts[i]
you have an insertion in the query (aka deletion in the target); possibly a polymorphism.

When you have:
    tEnd[i-1] < tStarts[i]
    qEnd[i-1] < qStarts[i]
you have an unaligned sequence in both, perhaps a polymorphism at an intron.
--

I hope that is helpful. If you have any additional questions, please feel free to contact us again at [email protected]

-
Greg Roe
UCSC Genome Bioinformatics Group

On 4/11/11 1:36 AM, Asaf Levy wrote:
> Hi,
> I am using the all_mrna and intronEst tables of several species. I know that
> these tables contain some info about the number of insertions and how many
> bp are involved in insertions. However, where can I find the location of an
> insertion in the alignment? (the data which is presented in the browser by
> double horizontal lines or vertical colored lines).
>
> Regards,
> Asaf
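
As a rough illustration of the block comparison described in the reply above, here is a minimal Python sketch that walks adjacent PSL blocks and reports where the gaps fall. The function names (`parse_psl_list`, `find_inserts`) and the result labels are hypothetical, chosen only for this example; they are not part of any UCSC tool. The sketch assumes a native nucleotide blat alignment and leaves all coordinates in the strand-specific form stored in the PSL, so query positions for minus-strand alignments are not converted.

```python
def parse_psl_list(field):
    """Turn a PSL list field such as '50,30,20,' into a list of ints.

    PSL list columns (blockSizes, qStarts, tStarts) are comma-separated
    and end with a trailing comma.
    """
    return [int(x) for x in field.rstrip(",").split(",")]


def find_inserts(block_sizes, q_starts, t_starts):
    """Classify the gap between each pair of adjacent ungapped blocks.

    Returns a list of tuples:
        (kind, q_gap_start, q_gap_end, t_gap_start, t_gap_end)
    where kind is one of the hypothetical labels 'target_insert',
    'query_insert', or 'both_unaligned'. Coordinates are left exactly
    as stored in the PSL (strand-specific, zero-based, half-open).
    """
    inserts = []
    for i in range(1, len(block_sizes)):
        # Implied end coordinates of the previous block.
        q_end_prev = q_starts[i - 1] + block_sizes[i - 1]
        t_end_prev = t_starts[i - 1] + block_sizes[i - 1]

        q_gap = q_starts[i] - q_end_prev
        t_gap = t_starts[i] - t_end_prev

        if t_gap > 0 and q_gap == 0:
            kind = "target_insert"    # deletion in query; possibly an intron
        elif t_gap == 0 and q_gap > 0:
            kind = "query_insert"     # deletion in target
        elif t_gap > 0 and q_gap > 0:
            kind = "both_unaligned"   # unaligned sequence on both sides
        else:
            continue                  # blocks abut on both sides

        inserts.append((kind, q_end_prev, q_starts[i], t_end_prev, t_starts[i]))
    return inserts


if __name__ == "__main__":
    # Made-up example values, not taken from a real table row.
    sizes = parse_psl_list("50,30,20,")
    q = parse_psl_list("0,50,85,")
    t = parse_psl_list("1000,1200,1230,")
    for record in find_inserts(sizes, q, t):
        print(record)
```

Each returned tuple gives the start and end of the gap in query and in target coordinates, so the length of an insertion is simply the difference between the matching pair of values.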
