Hello Peng,

Jim Kent, the BLAT program author, has control over all aspects of the 
program design, but we thank you again for your input!

Take care,
Jennifer

---------------------------------
Jennifer Jackson
UCSC Genome Informatics Group
http://genome.ucsc.edu/

On 4/30/10 5:21 AM, Peng Yu wrote:
> On Tue, Apr 27, 2010 at 9:00 PM, Galt Barber<[email protected]>  wrote:
>>
>> Hi, Peng!
>>
>> As the FAQ points out
>>   http://genome.ucsc.edu/FAQ/FAQblat.html
>>
>> "A note on filtering output: increasing the -minScore parameter value beyond
>> one-half of the query size has no further effect. Therefore, use either the
>> pslReps or pslCDnaFilter  program available in the Genome Browser source
>> code to filter for the size, score, coverage, or quality desired. For
>> information on obtaining the source code, see our FAQ on source code
>> licensing and downloads. "
>>
>> This seems to have been an odd restriction,
>> which was removed at the urging of users.
>> However, the change came only in 2008:
>>
>> blat/version.doc
>> 1.72 (galt 09-Dec-08): (in blat version 34x3)
>> Fixed -minScore, filter was not working when over half query-size.
>>     v197_branch: 1.72.0.2
>>
>> revision 1.72
>> date: 2008/12/09 08:11:46;  author: galt;  state: Exp;  lines: +1 -0
>> fixing minScore
>> ----------------------------
>>
>> galt
>>   Tue Dec 9 08:11:46 2008 +0000
>> fixing minScore
>> diff --git src/jkOwnLib/gfBlatLib.c src/jkOwnLib/gfBlatLib.c
>> --- src/jkOwnLib/gfBlatLib.c
>> +++ src/jkOwnLib/gfBlatLib.c
>> @@ -18,7 +18,7 @@
>>
>>
>> static void saveAlignments(char *chromName, int chromSize, int chromOffset,
>>         struct ssBundle *bun, struct hash *t3Hash,
>>         boolean qIsRc, boolean tIsRc,
>>         enum ffStringency stringency, int minMatch, struct gfOutput *out)
>>   /* Save significant alignments to file in .psl format. */
>>   {
>>   struct dnaSeq *tSeq = bun->genoSeq, *qSeq = bun->qSeq;
>>   struct ssFfItem *ffi;
>> -if (minMatch > qSeq->size/2) minMatch = qSeq->size/2;
>> -if (minMatch < 1) minMatch = 1;
>>   for (ffi = bun->ffList; ffi != NULL; ffi = ffi->next)
>>      {
>>      struct ffAli *ff = ffi->ff;
>>      struct trans3 *t3List = NULL;
>>      int score;
>>      if (t3Hash != NULL)
>>         t3List = hashMustFindVal(t3Hash, tSeq->name);
>>      score = scoreAli(ff, bun->isProt, stringency, tSeq, t3List);
>>      if (score>= minMatch)
>>         {
>>         out->out(chromName, chromSize, chromOffset, ff, tSeq, t3Hash, qSeq,
>>             qIsRc, tIsRc, stringency, minMatch, out);
>>         }
>>      }
>>   }
>>
>> See the two lines that start with "-"?
>> They were deleted.  They seemed to be
>> unneeded and were causing unexpected
>> behavior for users.
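The effect of the two deleted lines can be sketched in a few lines of Python. This is an illustrative model only, not BLAT code: the function names are hypothetical, and the arithmetic simply mirrors the clamp shown in the diff (integer division, as in C).

```python
def effective_min_match_before_fix(min_match: int, query_size: int) -> int:
    """Mirror the two lines removed from saveAlignments() (hypothetical name)."""
    half = query_size // 2  # C integer division: qSeq->size/2
    if min_match > half:
        min_match = half
    if min_match < 1:
        min_match = 1
    return min_match


def effective_min_match_after_fix(min_match: int, query_size: int) -> int:
    """After the fix, the requested threshold is used unchanged."""
    return min_match


# 25 bp query, -minScore=25, alignment score 23 (the example from this thread)
score = 23
print(score >= effective_min_match_before_fix(25, 25))  # True: threshold clamped to 12
print(score >= effective_min_match_after_fix(25, 25))   # False: threshold stays 25
```

Under this model, the same clamp would also explain why the default of 30 still reported a hit for a 25 bp query in older builds: the threshold was silently lowered to 12.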
>>
>> Unfortunately, Jim Kent's official release
>> seems to date back to 2007, but you could
>> get the source and compile it.
>>
>> Any blat version after 34x3 should have the fix.
>>
>> With the newer version, the cutoff works more
>> as you would expect.  And for your example
>> of a 25bp stretch of DNA with one mismatch,
>> your score would be +24 for the matches and
>> -1 for the one mismatch, thus score = 24 - 1 = 23.
>>
>> And thus if you use minScore of 23 or lower
>> you can see the output psl record.
>>   -minScore=23
>>
>> As we mentioned before,
>> you can just set minScore to zero and
>> then filter the psl output
>> with other tools afterwards.
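A minimal sketch of such a post-filter, assuming headerless PSL (as produced with -noHead) and a simplified score of matches minus mismatches minus the number of inserts. The score formula and the function names are assumptions for illustration, not BLAT's internal scoring; pslReps and pslCDnaFilter are the tools the FAQ recommends for real use.

```python
def simple_psl_score(fields):
    """Approximate score: matches - mismatches - insert counts (an assumption)."""
    matches = int(fields[0])
    mismatches = int(fields[1])
    q_num_insert = int(fields[4])
    t_num_insert = int(fields[6])
    return matches - mismatches - q_num_insert - t_num_insert


def filter_psl_lines(lines, min_score):
    """Yield headerless PSL lines whose approximate score is >= min_score."""
    for line in lines:
        fields = line.split()
        if len(fields) < 21:  # skip blank or malformed lines
            continue
        if simple_psl_score(fields) >= min_score:
            yield line


# The PSL record quoted in this thread, joined onto one line.
record = ("24\t1\t0\t0\t0\t0\t0\t0\t+\ttest_sequence\t25\t0\t25\t"
          "database_chr1\t25\t0\t25\t1\t25,\t0,\t0,")
print(simple_psl_score(record.split()))           # 23
print(len(list(filter_psl_lines([record], 24))))  # 0
```

Running blat with -minScore=0 and then applying a filter like this keeps the threshold logic outside of BLAT, so it behaves the same regardless of the BLAT version.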
>
> Hi,
>
> Since setting minScore to zero is probably more common than other
> cases, I think it makes sense to change its default value to 0
> rather than the arbitrary value of 30 that it has right now. Do you agree?
>
>> -Galt
>>
>> On 4/27/2010 3:35 PM, Peng Yu wrote:
>>>
>>> Hi Galt,
>>>
>>> Here is the command that I use. You mentioned "Generally people don't
>>> much bother with using BLAT's own commandline options for minScore,
>>> etc." But I want to understand what minScore is and when it can be
>>> ignored. Would you please let me know?
>>>
>>>
>>> $ blat -t=dna -q=dna -stepSize=5 -minScore=25 -maxGap=0 -noHead \
>>>                 database.fasta \
>>>                 query.fasta \
>>>                 query.psl
>>> $ cat query.fasta
>>> >test_sequence
>>> cttgcaccggaaagtctgctccaga
>>> $ cat database.fasta
>>> >database_chr1
>>> ctagcaccggaaagtctgctccaga
>>> $ cat query.psl
>>> 24      1       0       0       0       0       0       0       +
>>> test_sequence   25      0       25      database_chr1   25      0       25
>>>     1       25,     0,      0,
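The first two columns of the record above (24 matches, 1 mismatch) can be checked by hand, since the two sequences have the same length and align without gaps. A minimal sketch, assuming a plain position-by-position comparison rather than BLAT's actual alignment; the function name is hypothetical.

```python
def count_matches(query: str, target: str):
    """Compare two equal-length sequences position by position."""
    if len(query) != len(target):
        raise ValueError("sequences must have the same length")
    matches = sum(1 for q, t in zip(query, target) if q == t)
    return matches, len(query) - matches


query = "cttgcaccggaaagtctgctccaga"
target = "ctagcaccggaaagtctgctccaga"
matches, mismatches = count_matches(query, target)
print(matches, mismatches, matches - mismatches)  # 24 1 23
```

The difference of 23 is below the requested -minScore=25, which is why the record was not expected in the output.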
>>>
>>>
>>>
>>> On Mon, Apr 26, 2010 at 4:30 PM, Jennifer Jackson<[email protected]>
>>>   wrote:
>>>>
>>>> Hello Peng,
>>>>
>>>> Very sorry, your reply went to the genome mailing list only, not to your
>>>> email address as well. Our apologies.
>>>>
>>>> Here is the posting:
>>>> https://lists.soe.ucsc.edu/pipermail/genome/2010-April/022012.html
>>>>
>>>> Jennifer
>>>>
>>>> ---------------------------------
>>>> Jennifer Jackson
>>>> UCSC Genome Informatics Group
>>>> http://genome.ucsc.edu/
>>>>
>>>> On 4/24/10 12:09 PM, Peng Yu wrote:
>>>>>
>>>>> Could somebody answer me the following question?
>>>>>
>>>>> On Wed, Apr 21, 2010 at 2:48 PM, Peng Yu<[email protected]>      wrote:
>>>>>>
>>>>>> I'm wondering what "some sort of gap penalty" refers to. Also, when I
>>>>>> query a 25bp sequence using the default, BLAT still gives a result. By
>>>>>> definition, a 25bp sequence should have a score of at most 25, which is
>>>>>> less than 30. Why does the query still return a result?
>>>>>>
>>>>>>    -minScore=N sets minimum score.  This is the matches minus the
>>>>>>                mismatches minus some sort of gap penalty.  Default is 30
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Regards,
>>>>>> Peng
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>
>>
>
>
>
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
