Hi Avi - blat really is not the best tool for primate/rodent  
alignments.  I'd suggest you switch to lastz from Penn State  
University.  See
http://www.bx.psu.edu/miller_lab/dist/README.lastz-1.01.50/README.lastz-1.01.50.html
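
On the tileSize/stepSize/repMatch questions quoted below, here is a rough Python sketch of the two ideas, in case it helps. It is not BLAT code. The function names are made up for illustration, the tile layout is a simplification of what the index does, and the repMatch figures are only the "typical" values I recall from the blat usage text (256 for tileSize 12, 1024 for 11, 4096 for 10) together with its note that halving stepSize doubles repMatch. I am assuming that scaling is simply proportional to tileSize/stepSize, so treat the numbers as estimates and check the usage message of your own build.

```python
# Illustrative only: hypothetical helpers, not part of BLAT.

# "Typical" repMatch per tileSize as recalled from the blat usage text (assumption).
TYPICAL_REP_MATCH = {10: 4096, 11: 1024, 12: 256}


def tile_starts(seq_len, tile_size=11, step_size=None):
    """Start offsets of the target tiles that would be sampled for the index."""
    if step_size is None:
        step_size = tile_size  # default from the thread: stepSize is just tileSize
    # Only tiles that fit entirely inside the sequence are kept.
    return list(range(0, seq_len - tile_size + 1, step_size))


def estimate_rep_match(tile_size=11, step_size=None):
    """Estimated overuse threshold, assuming it scales by tileSize/stepSize."""
    if step_size is None:
        step_size = tile_size
    return TYPICAL_REP_MATCH[tile_size] * tile_size // step_size


if __name__ == "__main__":
    print(tile_starts(30, 11))      # default stepSize: non-overlapping tiles
    print(tile_starts(30, 11, 5))   # stepSize=5: overlapping tiles, more of them
    print(estimate_rep_match(11, 5))
    print(estimate_rep_match(10, 5))
```

A smaller stepSize samples more, overlapping tiles from the same target, so each repeat family shows up more often in the index. If the overuse threshold did not rise with it, more tiles would be masked as overused, which would fit the drop in sensitivity described below.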



On Mar 9, 2010, at 7:58 AM, Fungazid wrote:

> Thank you Galt for your detailed information,
>
> I understand the optimal configuration depends on the needs. So... my  
> query sequences are cDNAs of 100-5000bp. One of the goals is to  
> detect variations like intron retention between related mammals like  
> primates vs. rodents (therefore I need genomes as targets).
> The basic configuration finds most but not all HSPs per hit  
> (accordingly sometimes small exons are not detected, or larger  
> intronic regions). But the optimization is problematic because I see  
> that often even -stepSize=5 is less sensitive than the default  
> stepSize. As far as I understand this can happen because of  
> repetitive sequences that are ignored if they occur too many times  
> when sensitivity rises. Should I increase -repMatch to prevent this?
> And what is the program's default repMatch for
> [-stepSize=5, -tileSize=10] and for [-stepSize=5, -tileSize=default]?
>
> thanks,
> Avi
>
>
> --- On Mon, 3/8/10, Galt Barber <[email protected]> wrote:
>
>> From: Galt Barber <[email protected]>
>> Subject: Re: [Genome] gfServer/gfClient and -tileSize
>> To: [email protected]
>> Date: Monday, March 8, 2010, 7:35 PM
>>
>> Higher tileSize increases memory,
>> increases speed, decreases sensitivity slightly.
>>
>> The default tileSize 11 is very good.
>> On rare occasions you see 10 or 12 used.
>> Smaller tileSizes tend to lead to
>> dramatically longer runtime.
>>
>> It's a little complex to state easily
>> in a formula because there are multiple
>> phases internally that each have different
>> characteristics.
>>
>> The default stepSize is just tileSize.
>> This means that you are sampling a
>> position of the genome every stepSize bases.
>>
>> For PCR primer searching, we leave tileSize at 11
>> and lower stepSize to 5 for increased
>> sensitivity.  Of course this will also
>> cause the runtime to grow.
>>
>> Increasing sensitivity means increasing
>> the number of hits, and each hit that
>> has to be explored can take a lot of
>> processing.
>>
>> And of course, whatever generalizations
>> one would make, the real power, speed,
>> and memory required will depend
>> on the characteristics of the genome
>> and the queries, not to mention several
>> command-line switches that are available.
>>
>> But luckily the defaults have good
>> performance and sensitivity
>> for a wide range of applications.
>>
>> If you are doing short-reads then
>> perhaps one of the many good freely
>> available short-read aligners like
>> would be useful.
>>
>> BLAT is free for non-commercial use.
>>
>> -Galt
>>
>> On 3/8/2010 7:03 AM, Fungazid wrote:
>>> Hello people,
>>>
>>>
>>> About gfServer/gfClient :
>>>
>>> I see that higher -tileSize leads to a higher memory
>>> requirement. Is higher -tileSize expected to decrease
>>> detection power?
>>> In addition, should higher -tileSize enhance the speed
>>> of gfServer/gfClient?
>>>
>>> And what is -stepSize, and how does it affect
>>> detection power, speed and memory requirement?
>>>
>>>
>>> Thanks,
>>> Avi
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Genome maillist  -  [email protected]
>>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
