Hi, Mbandi!

The minIdentity and minScore filtering have not yet been implemented
for sim4 output (-out=sim4). If you run

    blat -out=psl -minIdentity=0 -minScore=0

you should see the same alignments as sim4. The output order of the
sim4 alignments is, however, reversed compared to psl.

Jim Kent, the author of BLAT, has now been informed of these issues.
They have gone unnoticed for so long because almost nobody is using sim4.

-Galt

<[email protected]> wrote:
> Hello Luvina,
>
> Thank you very much. I followed the link provided herein but could not
> find blat binaries for 32bit. There is only a liftOver file in the
> linux.i386 directory.
>
> I still observe some strange differences in terms of what is reported in
> the sim4 and psl formats for the same set of parameters. In my previous
> email, I provided two files: target.fa and query.fa, both of which are
> test files. If you execute the following:
>
>     blat target.fa -t=dna query.fa -q=dna -dots=1 -out=psl test.psl
>     blat target.fa -t=dna query.fa -q=dna -dots=1 -out=sim4 test.sim4
>
> The outputs test.psl and test.sim4 do not seem to contain the same number
> of hits. Surprisingly, .sim4 has more hits than .psl although .psl is the
> generic format. I am unable to explain how this is possible. For now, I
> am using the sim4 format, albeit difficult to parse.
>
> Thank you in advance.
>
> Mbandi
> ------------------------------------
> Universiteit van wes kaapland
>
> > Hi Mbandi
> >
> > Thank you for contacting the mailing list, and yes, this is the correct
> > place to ask your question. One of our engineers suggests you use psl
> > since it is the native blat format, and not to use fastMap for ESTs
> > which may have introns in them. In addition, you may download our latest
> > version of BLAT which contains a few bugfixes and may be useful for your
> > purposes. The latest BLAT is available in compiled form in our
> > downloads:
> > http://hgdownload.cse.ucsc.edu/downloads.html#source_downloads. We also
> > suggest you use pslReps and pslCDnaFilter for filtering psl results.
> >
> > I hope this information is useful and answers your question. Please
> > contact us again at [email protected] if you have any further
> > questions.
> >
> > ---
> > Luvina Guruvadoo
> > UCSC Genome Bioinformatics Group
> >
> > On 6/14/2012 1:23 PM, Mbandi S.K wrote:
> >> Dear ALL;
> >>
> >> Firstly, I'm happy to join this mailing list. I do not know if this
> >> group is the right place for my question. Kindly bear with me if my
> >> question is trivial or has been dealt with already. I have recently
> >> settled on BLAT v. 34 for a portion of my project to screen for ESTs
> >> (cDNA) that align well to my reference sequence. However, I find it
> >> hard to understand the effects of -minIdentity and -fastMap on the
> >> output.
> >>
> >> I also noticed that just changing the output format affects what is
> >> reported in the output file. More ESTs are reported in sim4 format
> >> than in psl format. I want to write a parser to calculate coverage,
> >> identity etc. in order to build a filtering matrix. Attached here are
> >> two test files: query.fa and target.fa. I'm aware -fastMap is for
> >> DNA-DNA, but just for test purposes, I ran:
> >>
> >>     blat target.fa -t=dna query.fa -q=dna -out=psl -minIdentity=100 -fastMap -dots=1 test.psl
> >>
> >> and
> >>
> >>     blat target.fa -t=dna query.fa -q=dna -out=psl -fastMap -dots=1 test.psl
> >>
> >> However, in the first instance I do not find the hits I expected, even
> >> though the default -minIdentity is 90, which is less stringent than
> >> 100. When -out=sim4 is used, the hits are totally different. Query.fa
> >> contains mutated and unmodified versions of seq1 from the target.fa
> >> file.
> >>
> >> Has anyone experienced strange results like this? Which output is
> >> better from experience? I will appreciate clarity in this regard.
> >>
> >> Many thanks,
> >>
> >> Mbandi S.K
> >>
> >> _______________________________________________
> >> Genome maillist - [email protected]
> >> https://lists.soe.ucsc.edu/mailman/listinfo/genome
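
For the parser mentioned in the original question (coverage and identity per hit, computed from psl output), here is a minimal Python sketch. It assumes the standard 21-column tab-separated PSL layout and skips the header lines. The function names (parse_psl, filter_hits) and the dictionary keys are made up for illustration, and the identity here is a simple ratio of matching bases to aligned bases, which is not necessarily the same number the BLAT web interface or the -minIdentity option uses.

```python
from typing import Dict, Iterable, Iterator, List

PSL_COLUMNS = 21  # standard PSL has 21 tab-separated fields


def parse_psl(lines: Iterable[str]) -> Iterator[Dict[str, object]]:
    """Yield one dict per PSL alignment row, skipping header lines."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Header and separator lines do not start with an integer match count.
        if len(fields) < PSL_COLUMNS or not fields[0].isdigit():
            continue
        matches = int(fields[0])
        mismatches = int(fields[1])
        rep_matches = int(fields[2])
        q_size, q_start, q_end = int(fields[10]), int(fields[11]), int(fields[12])
        aligned = matches + mismatches + rep_matches
        yield {
            "query": fields[9],
            "target": fields[13],
            "strand": fields[8],
            # Simple identity: matching bases over aligned bases (assumption).
            "identity": 100.0 * (matches + rep_matches) / aligned if aligned else 0.0,
            # Query coverage: aligned query span over query length.
            "coverage": 100.0 * (q_end - q_start) / q_size if q_size else 0.0,
        }


def filter_hits(lines: Iterable[str],
                min_identity: float = 90.0,
                min_coverage: float = 0.0) -> List[Dict[str, object]]:
    """Keep hits that meet both thresholds (given in percent)."""
    return [
        hit for hit in parse_psl(lines)
        if hit["identity"] >= min_identity and hit["coverage"] >= min_coverage
    ]
```

Typical use would be something like `with open("test.psl") as fh: hits = filter_hits(fh, min_identity=95.0, min_coverage=80.0)`. Running blat with -minIdentity=0 -minScore=0 and filtering afterwards in a script like this keeps the thresholds explicit and independent of the output format.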
