Percent identity applies only to the aligned regions. Gaps, of course, are not counted as aligned regions.
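To make the difference concrete, here is a minimal sketch (not the code UCSC uses) that computes identity over the aligned bases only, and separately the fraction of the query that is covered, from one line of tab-separated PSL output without a header. The function names, the thresholds, and the example record are made up for illustration. The numbers in the example only mimic the case described (about 150 aligned bases out of a 1244 nt query).

```python
def psl_identity_and_coverage(psl_line):
    """Return (percent identity over aligned bases, percent query coverage).

    Assumes the standard 21-column PSL layout: matches, misMatches,
    repMatches, nCount, ..., qSize in column 11. Gaps (inserts) are not
    part of the aligned bases, so they do not lower the identity.
    """
    fields = psl_line.rstrip("\n").split("\t")
    matches = int(fields[0])
    mismatches = int(fields[1])
    rep_matches = int(fields[2])
    q_size = int(fields[10])

    aligned = matches + mismatches + rep_matches
    identity = 100.0 * (matches + rep_matches) / aligned if aligned else 0.0
    coverage = 100.0 * aligned / q_size if q_size else 0.0
    return identity, coverage


def keep_alignment(psl_line, min_identity=96.0, min_coverage=50.0):
    """Hypothetical post-filter: require both identity and query coverage."""
    identity, coverage = psl_identity_and_coverage(psl_line)
    return identity >= min_identity and coverage >= min_coverage


# Made-up PSL record: 147 matches + 3 mismatches on a 1244 nt query.
example = "\t".join([
    "147", "3", "0", "0",          # matches, misMatches, repMatches, nCount
    "0", "0", "0", "0",            # qNumInsert, qBaseInsert, tNumInsert, tBaseInsert
    "+", "BM451627", "1244", "0", "150",   # strand, qName, qSize, qStart, qEnd
    "chr1", "1000000", "5000", "5150",     # tName, tSize (invented), tStart, tEnd
    "1", "150,", "0,", "5000,",            # blockCount, blockSizes, qStarts, tStarts
])

identity, coverage = psl_identity_and_coverage(example)
print(f"identity={identity:.1f}%  coverage={coverage:.1f}%  keep={keep_alignment(example)}")
```

With these invented numbers the alignment has high identity but covers only about an eighth of the query, so a coverage threshold rejects it even though an identity threshold alone would not.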
What you are interested in is the level of coverage. You can use utilities like pslReps and pslCDnaFilter to post-filter your BLAT psl results. You can also just write your own script to filter them however you like. You might also try the BLAT options -maxGap=0 and -fastMap.

-Galt

On Mon, 10 Nov 2008, Micha Sammeth wrote:

> Hello helpdesk,
>
> I have again a curiosity concerning blat alignments. I consider the case
> BM451627. It was for some reasons not in the dataset of hg17, so I
> downloaded the sequence and ran a blat -stepSize=5 -minScore=0
> -minIdentity=0, which should correspond to the settings used at UCSC.
> Checking identity, I find the highest score of about 20% sequence
> identity -- match/query length -- it also does not change much when
> additionally taking into account repmatch.
>
> However, I found BM451627 in the UCSC hg16 database, where it reports a
> ~98% identity match. Looking closer at the alignment, in hg16 there is a
> ~150nt stretch from the 1244nt which aligns with 98% identity --- and
> probably a couple of bases that changed from hg16->hg17 are responsible
> that in this 150nt region sequence identity drops below the threshold of
> 96%.
>
> My question now is:
>
> Does this hold for all identities, say a transcript aligns with 1000 nt
> and 98% identity in one place and in another place with 100nt at 98%
> will be put in both places, regardless of the coverage of the transcript
> by the alignment? In other words, the identity criterion of 96% or 0.5%
> of the best alignment is applied to match/(Qend-Qstart)? And if so, what
> was the motivation to not take the "global identity" of the query, did
> you have bad experiences with transcripts that did not want to align
> that way?
>
> Thank you!
>
> micha.
_______________________________________________
Genome maillist - [email protected]
http://www.soe.ucsc.edu/mailman/listinfo/genome
