Hello, Dario.

Please note that the web interface uses -minScore=20 and -tileSize=11.

If you reduce your -minScore to 52, you will get the result. Your query
aligns in two blocks separated by a gap of approximately 11k bases. The
informal blat score that -minScore is compared against penalizes gaps. In
this case it charges you 8 for a gap that big: 60 - 8 = 52.
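
As a rough illustration of that arithmetic, here is a minimal sketch in
Python. The function name informal_score is hypothetical, the gap penalty of
8 is simply taken from the figures above rather than derived from the gap
size, and the "score >= minScore" comparison is an assumption consistent
with a score of 52 passing -minScore=52.

```python
def informal_score(matches, mismatches, gap_penalty):
    """Matches minus mismatches minus a gap penalty (hypothetical helper)."""
    return matches - mismatches - gap_penalty


# Values from this thread: 60 matching bases, no mismatches, and a penalty
# of 8 for the ~11k gap. The penalty is supplied by hand; how blat derives
# it from the gap size is not modeled here.
score = informal_score(matches=60, mismatches=0, gap_penalty=8)

# Assumption: an alignment is kept when its score is at least -minScore.
for min_score in (54, 52):
    verdict = "kept" if score >= min_score else "dropped"
    print(f"-minScore={min_score}: score {score} -> {verdict}")
```

Under these assumptions the alignment is dropped at -minScore=54 and kept
at -minScore=52.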

If this sequence really is mRNA, or was derived from it, so that the introns
have been removed, you might consider making an RNA database to query
against. That way you won't be in danger of missing small exons separated by
huge introns.

If you look at the hgPcr display, for hg19 you can choose the target
"UCSC Genes" to search just intron-less genes for this heightened
sensitivity.

Please contact us again at [email protected] if you have any further
questions.

---
Steve Heitner
UCSC Genome Bioinformatics Group

-----Original Message-----
From: [email protected] [mailto:[email protected]] On
Behalf Of Dario Strbenac
Sent: Sunday, April 29, 2012 7:00 PM
To: Brooke Rhead
Cc: [email protected]
Subject: Re: [Genome] blat minScore Unexpected Behaviour

Thanks for the links. I updated to the latest version. What I noticed,
though, is that some queries that map with the web interface don't map with
the executable.

My parameters are :

-q=dna -t=dna -minScore=54 -stepSize=5 -tileSize=10 -repMatch=2253

I was considering the sequence  :

>ASHG19A3A007737
TCTCGATGCGCCGTCGCCGGGTCAGCCGTTTCCTCTCCCTCGCCGGCCTCGGCGGAGATT

It does not appear in my output PSL file, but if I paste it into the web
interface I get:

   ACTIONS      QUERY           SCORE START  END QSIZE IDENTITY CHRO STRAND  START    END      SPAN
---------------------------------------------------------------------------------------------------
browser details YourSeq           59     1    60    60 100.0%    17   +   46669702  46681323  11622

I am mapping against hg19 in both cases.

$ blat
blat - Standalone BLAT v. 34x12 fast sequence search command line tool

What could be causing these to get dropped from the results ?

---- Original message ----
>Date: Fri, 27 Apr 2012 12:15:02 -0700
>From: Brooke Rhead <[email protected]>
>Subject: Re: [Genome] blat minScore Unexpected Behaviour
>To: [email protected]
>Cc: [email protected]
>
>Hi Dario,
>
>Increasing the -minScore to a number that is over half the query size 
>has no further effect (at least in slightly older versions of blat).
>See the end of this FAQ:
>
>http://genome.ucsc.edu/FAQ/FAQblat.html#blat8
>
>There is some more background discussion in our mailing list archives,
>here:
>https://lists.soe.ucsc.edu/pipermail/genome/2010-April/022062.html
>
>We recommend using the pslReps or pslCdnaFilter programs on your blat 
>results rather than setting the -minScore option.  If you don't already 
>have those programs, there are executables here:
>http://hgdownload.cse.ucsc.edu/admin/exe/
>or you can get the source
>http://hgdownload.cse.ucsc.edu/downloads.html#source_downloads
>
>If you have further questions, please contact us again at 
>[email protected].
>
>--
>Brooke Rhead
>UCSC Genome Bioinformatics Group
>
>
>On 4/26/12 9:00 PM, Dario Strbenac wrote:
>> I'm using the command blat with
>>
>> -minScore=60
>>
>> and my file has all 60 base DNA sequences. They are actually microarray
>> probes.
>>
>> My version is
>>
>> $ blat
>> blat - Standalone BLAT v. 34 fast sequence search command line tool
>>
>> The documentation further down the screen says
>>
>> -minScore=N sets minimum score.  This is the matches minus the 
>> mismatches minus some sort of gap penalty. Default is 30
>>
>> How can I possibly be getting any matches less than 60 bases then ?
>>
>> $ head -n 15 result.psl | cut -f 1,2 -
>>
>> match   mis-
>>          match
>> ---------------
>> 60      0
>> 60      0
>> 58      1
>> 51      0
>> 59      0
>> 58      1
>> 58      1
>> 56      3
>> 55      3
>> 58      1
>>
>> --------------------------------------
>> Dario Strbenac
>> Research Assistant
>> Cancer Epigenetics
>> Garvan Institute of Medical Research
>> Darlinghurst NSW 2010
>> Australia
>> _______________________________________________
>> Genome maillist  -  [email protected] 
>> https://lists.soe.ucsc.edu/mailman/listinfo/genome


--------------------------------------
Dario Strbenac
Research Assistant
Cancer Epigenetics
Garvan Institute of Medical Research
Darlinghurst NSW 2010
Australia
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
