Dear all,

Firstly, I'm happy to join this mailing list. I do not know if this group
is the right place for my question, so kindly bear with me if it is
trivial or has been dealt with already. I have recently settled on BLAT
v. 34 for a portion of my project, to screen for ESTs (cDNA) that align
well to my reference sequence. However, I find it hard to understand the
effects of -minIdentity and -fastMap on the output.

I also noticed that just changing the output format affects what is
reported in the output file: more ESTs are reported in sim4 format than in
psl format. I want to write a parser to calculate coverage, identity, etc.
in order to build a filtering matrix. Attached are two test files:
query.fa and target.fa. I'm aware that -fastMap is for DNA-DNA, but just
for test purposes, I ran:
blat target.fa -t=dna query.fa -q=dna -out=psl -minIdentity=100 -fastMap -dots=1 test.psl
and
blat target.fa -t=dna query.fa -q=dna -out=psl -fastMap -dots=1 test.psl

However, in the first instance I do not find the hits I expected, even
though the default -minIdentity is 90, which is less stringent than 100.
When -out=sim4 is used, the hits are totally different. query.fa contains
mutated and unmodified versions of seq1 from the target.fa file.
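
To make the parser idea concrete, here is a minimal Python sketch of how
coverage and identity could be computed from the psl output. It assumes
the standard 21-column, tab-separated PSL layout. The function names
(parse_psl, filter_hits) and the thresholds are hypothetical, and the
identity used here is a simple matches / aligned-bases ratio, which is not
necessarily the figure BLAT applies internally for -minIdentity.

```python
from typing import Dict, Iterable, Iterator, List

PSL_COLUMNS = 21  # standard PSL: matches ... tStarts


def parse_psl(lines: Iterable[str]) -> Iterator[Dict[str, object]]:
    """Yield one dict per PSL alignment line, skipping header lines."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Header lines (psLayout, column titles, dashes) do not have
        # 21 tab-separated fields starting with an integer.
        if len(fields) != PSL_COLUMNS or not fields[0].isdigit():
            continue
        matches = int(fields[0])
        mismatches = int(fields[1])
        rep_matches = int(fields[2])
        q_size = int(fields[10])
        q_start = int(fields[11])
        q_end = int(fields[12])
        aligned = matches + mismatches + rep_matches
        yield {
            "query": fields[9],
            "target": fields[13],
            "strand": fields[8],
            # Assumed definition: percent of aligned bases that match.
            "identity": 100.0 * (matches + rep_matches) / aligned if aligned else 0.0,
            # Assumed definition: percent of the query spanned by the hit.
            "coverage": 100.0 * (q_end - q_start) / q_size if q_size else 0.0,
        }


def filter_hits(hits: Iterable[Dict[str, object]],
                min_identity: float = 90.0,
                min_coverage: float = 80.0) -> List[Dict[str, object]]:
    """Keep hits that meet both (hypothetical) thresholds."""
    return [h for h in hits
            if h["identity"] >= min_identity and h["coverage"] >= min_coverage]
```

With a file produced by the commands above, this could be used as
filter_hits(parse_psl(open("test.psl"))). A sim4 report would need a
separate parser, since its layout is different.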

Has anyone experienced strange results like this? Which output format is
better, from experience? I would appreciate any clarification in this
regard.

Many thanks,

Mbandi S.K
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
