I have reproduced this behavior.

The sequence is not unique
despite its length of 27, and maps
to at least hundreds of places on the genome.

This location is in a LINE repeat at the end of chr1.

This is the known "over-used tile" behavior in blat:
blat stops adding locations for a tile to the tile index
once that tile has reached a maximum number of occurrences.
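
To illustrate the capping behavior described above, here is a toy sketch in Python. It is not blat's actual code: the function name, the tile_size and max_hits parameters, and the choice of non-overlapping tiles are all assumptions made for illustration (max_hits only plays a role loosely analogous to -repMatch).

```python
from collections import defaultdict

def build_tile_index(genome, tile_size, max_hits):
    """Toy tile index: map each tile to its start positions, but stop
    recording new positions for a tile once it has max_hits entries.
    Hypothetical helper for illustration only."""
    index = defaultdict(list)
    # Step through the genome in non-overlapping tiles (an assumption).
    for pos in range(0, len(genome) - tile_size + 1, tile_size):
        tile = genome[pos:pos + tile_size]
        if len(index[tile]) < max_hits:
            index[tile].append(pos)
        # Otherwise the tile is over-used and this location is dropped.
    return index

# Five copies of a repeated tile followed by one unique tile.
genome = "ACGT" * 5 + "GGCC"
index = build_tile_index(genome, tile_size=4, max_hits=3)

print(index["ACGT"])  # only the first three locations are kept
print(index["GGCC"])  # the unique tile is indexed normally
```

In this toy, the copies of the repeated tile at positions 12 and 16 never enter the index, so a query that truly belongs there cannot be seeded at those locations.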

I tried increasing -repMatch to a large number,
but it did not help.  Then I extracted a subset
of chr1, keeping only the bases after position 200 million,
which left roughly the last 47 million bases.
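
A minimal sketch of this kind of extraction, in Python, using a tiny in-memory FASTA record rather than the real chr1. The helper names and the toy sequence are hypothetical. The point it shows is that positions found in the extracted piece must have the cut-off added back to get chromosome coordinates.

```python
import io

def read_single_fasta(handle):
    """Read one FASTA record and return (name, sequence).
    Hypothetical helper; assumes a single record in the input."""
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            name = line[1:]
        elif line:
            chunks.append(line)
    return name, "".join(chunks)

def tail_from(seq, start):
    """Keep only the bases from 0-based offset `start` onward."""
    return seq[start:]

# Toy stand-in for a chromosome; the real cut-off was 200 million.
toy = io.StringIO(">chrToy\nACGTACGT\nTTGACCAA\n")
name, seq = read_single_fasta(toy)

offset = 10
sub = tail_from(seq, offset)

# A hit found in the subset is relative to the subset's first base,
# so add the offset to recover the chromosome position.
hit_in_sub = sub.find("GACC")
hit_in_chrom = hit_in_sub + offset
print(name, sub, hit_in_sub, hit_in_chrom)
```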

I ran blat against this portion of chr1.
This worked fine and turned up 32 hits including
the one that you were expecting.

It is difficult to index the entire genome
to its full depth when some tiles would be
hit millions of times.  Handling repeats exhaustively
is not one of BLAT's goals.

Jim Kent is working on a new short-sequence aligner program
that might help here, but it is still in development.

Repeat
RepeatMasker Information
Name: L1PA7
Family: L1
Class: LINE
SW Score: 16289
Divergence: 8.7%
Deletions: 0.8%
Insertions: 0.8%
Begin in repeat: 3132
End in repeat: 4244
Left in repeat: 1902
Position: chr1:247183181-247184298
Band: 1q44
Genomic Size: 1118
Strand: -
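
As a quick sanity check on the coordinates, the following Python sketch confirms that the reported query location (chr1:247183338-247183364, from the quoted message) lies inside the L1PA7 annotation above. The helper names are hypothetical, and the ranges are assumed to be 1-based with inclusive ends, as displayed.

```python
def parse_region(region):
    """Split 'chrom:start-end' into (chrom, start, end).
    Assumes 1-based coordinates with an inclusive end."""
    chrom, span = region.split(":")
    start, end = span.split("-")
    return chrom, int(start), int(end)

def region_length(region):
    """Number of bases covered by an inclusive region."""
    _, start, end = parse_region(region)
    return end - start + 1

def contains(outer, inner):
    """True if `inner` lies entirely within `outer` on the same chromosome."""
    o_chrom, o_start, o_end = parse_region(outer)
    i_chrom, i_start, i_end = parse_region(inner)
    return o_chrom == i_chrom and o_start <= i_start and i_end <= o_end

repeat = "chr1:247183181-247184298"
query = "chr1:247183338-247183364"

repeat_len = region_length(repeat)   # matches "Genomic Size: 1118"
query_len = region_length(query)     # matches the 27-base query
query_in_repeat = contains(repeat, query)
print(repeat_len, query_len, query_in_repeat)
```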

-Galt


On Tue, 4 Nov 2008, Yuan Jian wrote:

Hi there,
 
I have a sequence
TGAAAACTGGCACAAGACAAGGATGCC
located at chr1:247183338-247183364, strand -.

But when I blat it, I cannot find that location for the sequence.
I did find it at other loci on chr1, strand -:
browser details YourSeq           27     1    27    27 100.0%     1   -   219139486 219139512     27
browser details YourSeq           27     1    27    27 100.0%     1   -   192113590 192113616     27
browser details YourSeq           27     1    27    27 100.0%     1   -   189295101 189295127     27
browser details YourSeq           27     1    27    27 100.0%     1   -   156553131 156553157     27
browser details YourSeq           27     1    27    27 100.0%     1   -   156665481 156665507     27

 
Can you please tell me why?
 
thanks
 
Yu



_______________________________________________
Genome maillist  -  [email protected]
http://www.soe.ucsc.edu/mailman/listinfo/genome