Hi,

This might be a naive question, but we have some questions about the 
parameters of BLAT.

We want to get all identical (100% matched) blocks between the query and 
target sequences, meaning we don't want blocks with gaps or 
mismatches inside. How can we make BLAT output only hits without gaps or 
mismatches, i.e. hits with blockCount=1 and misMatches=0?

Did I explain that clearly? For example, the following alignment should 
be split into 3 blocks (or 2 if, say, minScore>10). 
How can I do this in BLAT without a sliding-window scan?

000100 caaattagaaatttggagagtcgtcaaatgataatgctct-agcagcattagctcaagtg 000159
>>>>>> ||||||||||||||||||||||||||||||  |||||||| ||||||||||||||||||| <<<<<<
474529 caaattagaaatttggagagtcgtcaaatgcaaatgctctcagcagcattagctcaagtg 474588

000160 gcccacctgcgataactactcaattaaagtatttaaaagctcgtcagcccaaatcctata 000219
>>>>>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| <<<<<<
474589 gcccacctgcgataactactcaattaaagtatttaaaagctcgtcagcccaaatcctata 474648


I've tried the following parameters:
blat assembly.2bit input.fa -stepSize=5 -minIdentity=0 -minScore=0 -maxGap=0
but there are still entries with several gap-separated blocks. It seems 
that -maxGap=0 does not work. I don't really understand why.

I also tried setting -maxIntron=0. This seems to improve things a bit: at 
least no gaps are allowed in the target (in the output, all 
T_gap_count=0), but there are still gaps/mismatches in the query. It seems 
that maxIntron applies only to the target sequence, not to the query.
blat assembly.2bit input.fa -stepSize=5 -minIdentity=0 -minScore=0 -maxIntron=0
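
To see how many hits are already perfect single blocks, I could filter the 
PSL output with something like the sketch below. It assumes the standard 
21-column tab-separated PSL layout (misMatches in column 2, Q gap count in 
column 5, T gap count in column 7, blockCount in column 18); the function 
name perfect_hits is just my own illustration. Note that this only keeps 
hits that are already gap-free and mismatch-free; it does not split the 
other hits into their identical sub-blocks.

```python
def perfect_hits(psl_lines):
    """Keep PSL rows that are one ungapped block with zero mismatches.

    Assumes the standard 21-column, tab-separated PSL layout.
    """
    kept = []
    for line in psl_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip the PSL header and anything that is not a data row.
        if len(fields) < 21 or not fields[0].isdigit():
            continue
        mismatches = int(fields[1])     # misMatches
        q_gap_count = int(fields[4])    # Q gap count (qNumInsert)
        t_gap_count = int(fields[6])    # T gap count (tNumInsert)
        block_count = int(fields[17])   # blockCount
        if (mismatches == 0 and q_gap_count == 0
                and t_gap_count == 0 and block_count == 1):
            kept.append(fields)
    return kept
```

Usage would be along the lines of perfect_hits(open("output.psl")), where 
output.psl is a placeholder name for the BLAT result file.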

Do you guys have any tips for this?

Of course, I can always write a script that scans the axt file and 
extracts all 100% identical blocks.
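
Such a fallback script might look roughly like the sketch below. It takes 
the two aligned strings of one alignment (with '-' for gaps, as in the axt 
sequence lines) and returns every maximal run of identical, gap-free 
columns. The function name, the min_len parameter and the 0-based offsets 
relative to the start of the alignment are my own assumptions, not 
anything BLAT provides.

```python
def identical_blocks(query_aln, target_aln, min_len=1):
    """Return (q_offset, t_offset, length) for each maximal run of
    identical, gap-free columns in a pairwise alignment.

    Offsets are 0-based and count ungapped bases from the alignment start.
    """
    blocks = []
    q_pos = t_pos = 0   # ungapped positions consumed so far
    run_len = 0         # length of the current identical run

    def flush():
        # Record the run that ends just before the current column.
        if run_len > 0 and run_len >= min_len:
            blocks.append((q_pos - run_len, t_pos - run_len, run_len))

    for q, t in zip(query_aln, target_aln):
        if q != "-" and t != "-" and q.lower() == t.lower():
            run_len += 1
        else:
            flush()
            run_len = 0
        # Advance each coordinate only on non-gap characters.
        if q != "-":
            q_pos += 1
        if t != "-":
            t_pos += 1
    flush()
    return blocks


if __name__ == "__main__":
    # First row of the example alignment above.
    q = "caaattagaaatttggagagtcgtcaaatgataatgctct-agcagcattagctcaagtg"
    t = "caaattagaaatttggagagtcgtcaaatgcaaatgctctcagcagcattagctcaagtg"
    for block in identical_blocks(q, t):
        print(block)
```

Calling it with min_len=11 would drop the short middle block, which is 
the "2 blocks if minScore>10" case I described.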

Thanks,

Xianjun

-- 
---------------------------
Sterding (Xianjun) Dong
PhD student, Boris Lenhard's group
Bergen Center of Computational Science
Bergen University, Norway
Mobile: 0047-47361688
Telephone: 0047-55276381
Skype: xianjun.dong

_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
