Hello Luvina,

Thank you very much. I followed the link you provided but could not
find BLAT binaries for 32-bit Linux. The linux.i386 directory contains
only a liftOver file.

I still observe some strange differences in terms of what is reported in
the sim4 and psl formats for the same set of parameters. In my previous
email, I provided two test files: target.fa and query.fa. I ran the
following two commands:
blat target.fa -t=dna query.fa -q=dna -dots=1 -out=psl test.psl
blat target.fa -t=dna query.fa -q=dna -dots=1 -out=sim4 test.sim4

The outputs test.psl and test.sim4 do not seem to contain the same number
of hits. Surprisingly, .sim4 has more hits than .psl, although .psl is the
native format. I am unable to explain how this is possible. For now, I am
using the sim4 format, although it is difficult to parse.
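
For reference, below is a minimal Python sketch of how the psl output could be parsed to count hits and to compute identity and coverage per hit. It assumes the standard 21-column, tab-separated PSL layout. The names PslHit and parse_psl are hypothetical, and the identity and coverage formulas are simplified definitions of my own, not the scoring used by the UCSC tools.

```python
from collections import namedtuple

# Hypothetical record type for this sketch; not part of BLAT.
PslHit = namedtuple("PslHit", "q_name t_name strand identity coverage")


def parse_psl(lines):
    """Parse PSL lines into PslHit records.

    Header lines are skipped: any line that does not have at least 21
    tab-separated fields with a numeric first field is ignored.
    """
    hits = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 21 or not fields[0].isdigit():
            continue
        matches = int(fields[0])
        mismatches = int(fields[1])
        rep_matches = int(fields[2])
        q_size = int(fields[10])
        aligned = matches + mismatches + rep_matches
        if aligned == 0 or q_size == 0:
            continue
        # Simplified definitions (assumptions of this sketch):
        # identity = matching bases / aligned bases
        # coverage = aligned bases / query length
        identity = 100.0 * (matches + rep_matches) / aligned
        coverage = 100.0 * aligned / q_size
        hits.append(PslHit(fields[9], fields[13], fields[8], identity, coverage))
    return hits


if __name__ == "__main__":
    import sys

    if len(sys.argv) == 2:
        with open(sys.argv[1]) as handle:
            found = parse_psl(handle)
        print(len(found), "hits from", len({h.q_name for h in found}), "queries")
        for hit in found:
            print(hit.q_name, hit.t_name, hit.strand,
                  "%.1f" % hit.identity, "%.1f" % hit.coverage, sep="\t")
```

Running it on test.psl would print the total number of alignments and the number of distinct query names, which could then be compared against the entries in test.sim4.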

Thank you in advance.

Mbandi
------------------------------------
Universiteit van wes kaapland

> Hi Mbandi
>
> Thank you for contacting the mailing list, and yes, this is the correct
> place to ask your question. One of our engineers suggests you use psl
> since it is the native BLAT format, and not to use -fastMap for ESTs,
> which may have introns in them. In addition, you may download our latest
> version of BLAT, which contains a few bug fixes and may be useful for your
> purposes. The latest BLAT is available in compiled form in our
> downloads:
> http://hgdownload.cse.ucsc.edu/downloads.html#source_downloads. We also
> suggest you use pslReps and pslCDnaFilter for filtering psl results.
>
> I hope this information is useful and answers your question. Please
> contact us again at [email protected] if you have any further questions.
>
> ---
> Luvina Guruvadoo
> UCSC Genome Bioinformatics Group
>
>
> On 6/14/2012 1:23 PM, Mbandi S.K wrote:
>> Dear ALL;
>>
>> Firstly, I'm happy to join this mailing list. I do not know if this group
>> is the right place for my question. Kindly bear with me if my question is
>> trivial or has been dealt with already. I have recently settled on BLAT
>> v. 34 for a portion of my project to screen for ESTs (cDNA) that align
>> well to my reference sequence. However, I find it hard to understand the
>> effects of -minIdentity and -fastMap on the output.
>>
>> I also noticed that just changing the output format affects what is
>> reported in the output file. More ESTs are reported in sim4 format than in
>> psl format. I want to write a parser to calculate coverage, identity, etc.
>> in order to build a filtering matrix. Attached here are two test files:
>> query.fa and target.fa. I'm aware -fastMap is for DNA-DNA, but just for
>> test purposes, I ran:
>> blat target.fa -t=dna query.fa -q=dna -out=psl -minIdentity=100 -fastMap
>> -dots=1 test.psl
>> and
>> blat target.fa -t=dna query.fa -q=dna -out=psl -fastMap -dots=1 test.psl
>>
>> However, in the first instance I do not find the hits I expected, even
>> though the default -minIdentity is 90, which is less stringent than 100.
>> When -out=sim4 is used, the hits are totally different. query.fa contains
>> mutated and unmodified versions of seq1 from the target.fa file.
>>
>> Has anyone experienced strange results like this? Which output is better
>> from experience? I would appreciate clarity in this regard.
>>
>> Many thanks,
>>
>> Mbandi S.K
>>
>>
>> _______________________________________________
>> Genome maillist  -  [email protected]
>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>
>

