Yes, it is common for BLAT output to be filtered further by some method.
-Galt

On Mon, 10 Nov 2008, Micha Sammeth wrote:

> Hi Galt,
>
> thank you for the quick answer, I prefer programs to scripts. Did I
> understand correctly that the sketched scenario of 10% of a transcript
> aligning with the same identity as 100% of a transcript and both
> alignments are kept occurs at UCSC? Is it frequent?
>
> Thank you, micha.
>
>
> Galt Barber wrote:
>>
>> Percent identity only applies to aligned regions.
>> Gaps are not considered aligned regions of course.
>>
>> You are interested in the level of coverage.
>> You can use utilities like pslReps and pslCDnaFilter
>> to post-filter your BLAT psl results. You can also
>> just write your own script to filter it however you like.
>>
>> You might also try the BLAT options -maxGap=0 and -fastMap.
>>
>> -Galt
>>
>>
>> On Mon, 10 Nov 2008, Micha Sammeth wrote:
>>
>>> Hello helpdesk,
>>>
>>> I have again a curiosity concerning blat alignments. I consider the case
>>> BM451627. It was for some reasons not in the dataset of hg17, so I
>>> downloaded the sequence and ran a blat -stepSize=5 -minScore=0
>>> -minIdentity=0, which should correspond to the settings used at UCSC.
>>> Checking identity, I find the highest score of about 20% sequence
>>> identity -- match/query length -- it also does not change much when
>>> additionally taking into account repmatch.
>>>
>>> However, I found BM451627 in the UCSC hg16 database, where it reports a
>>> ~98% identity match. Looking closer at the alignment, in hg16 there is a
>>> ~150nt stretch from the 1244nt which aligns with 98% identity --- and
>>> probably a couple of bases that changed from hg16->hg17 are responsible
>>> that in this 150nt region sequence identity drops below the threshold of
>>> 96%.
>>>
>>> My question now is:
>>>
>>> Does this hold for all identities, say a transcript aligns with 1000 nt
>>> and 98% identity in one place and in another place with 100nt at 98%
>>> will be put in both places, regardless of the coverage of the transcript
>>> by the alignment? In other words, the identity criterion of 96% or 0.5%
>>> of the best alignment is applied to match/(Qend-Qstart)? And if so, what
>>> was the motivation to not take the "global identity" of the query, did
>>> you have bad experiences with transcripts that did not want to align
>>> that way?
>>>
>>> Thank you!
>>>
>>> micha.
>>>
>>> _______________________________________________
>>> Genome maillist - [email protected]
>>> http://www.soe.ucsc.edu/mailman/listinfo/genome
>>>
>>
>
> --
> O o O o O o Dr. Michael Sammeth
> | O o | | O o | | O o | http://www.sammeth.net
> | | O | | | | O | GRIB| O | Phone: +34-933-160-166
> | o O | | o O | | o O | Fax: +34 933-969-983
> o O o O o O Dr. Aiguader 88, 08003 Barcelona
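
For anyone who wants to go the "write your own script" route mentioned in the thread, here is a minimal sketch in Python of a coverage-aware post-filter for PSL output. It relies only on the standard 21-column PSL layout (matches, misMatches, repMatches in columns 1-3, qName and qSize in columns 10-11). Everything else is an assumption made for illustration: the names PslHit, parse_psl and filter_hits are hypothetical, the identity measure is a simplified matches-over-aligned-bases ratio rather than the formula UCSC uses internally, and the 0.5 coverage cutoff is an arbitrary example value (the 0.96 identity cutoff is the one quoted in the thread).

```python
from typing import Iterable, Iterator, List, NamedTuple


class PslHit(NamedTuple):
    """The handful of PSL fields needed for identity and coverage."""
    q_name: str
    q_size: int
    matches: int
    mismatches: int
    rep_matches: int

    @property
    def aligned(self) -> int:
        # Bases sitting in aligned blocks; gaps are not counted.
        return self.matches + self.mismatches + self.rep_matches

    @property
    def local_identity(self) -> float:
        # Simplified identity over aligned bases only (an assumption,
        # not UCSC's own scoring formula).
        return (self.matches + self.rep_matches) / self.aligned if self.aligned else 0.0

    @property
    def coverage(self) -> float:
        # Fraction of the whole query that is actually aligned.
        return self.aligned / self.q_size if self.q_size else 0.0


def parse_psl(lines: Iterable[str]) -> Iterator[PslHit]:
    """Yield hits from PSL text, skipping the optional header lines."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 21 or not fields[0].isdigit():
            continue  # header, separator, or malformed line
        yield PslHit(
            q_name=fields[9],
            q_size=int(fields[10]),
            matches=int(fields[0]),
            mismatches=int(fields[1]),
            rep_matches=int(fields[2]),
        )


def filter_hits(hits: Iterable[PslHit],
                min_identity: float = 0.96,
                min_coverage: float = 0.5) -> List[PslHit]:
    """Keep hits that pass both the identity and the coverage cutoff."""
    return [h for h in hits
            if h.local_identity >= min_identity and h.coverage >= min_coverage]


if __name__ == "__main__":
    import sys
    # Usage (hypothetical file name): python filter_psl.py < out.psl
    for hit in filter_hits(parse_psl(sys.stdin)):
        print(hit.q_name, round(hit.local_identity, 3), round(hit.coverage, 3))
```

With these definitions, a 150 nt stretch of a 1244 nt query aligned at 98% passes an identity-only cutoff but fails the coverage cutoff, which is the case discussed above. pslReps and pslCDnaFilter remain the more thorough options; this sketch only shows the idea.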
