Hi,

This might be a naive question, but we have some questions about the parameters of BLAT.

We want to get all identical (100% matched) blocks between the query and target sequences, i.e. we don't want blocks with gaps or mismatches inside. How can we make BLAT output only hits without gaps or mismatches, meaning a single block per hit (blockCount=1) and no mismatches (misMatches=0)? I hope that is clear.

For example, the following alignment should be split into 3 blocks (or 2 if minScore>10, for example). How can I do that in BLAT, without a sliding-window scan?

    000100 caaattagaaatttggagagtcgtcaaatgataatgctct-agcagcattagctcaagtg 000159
    >>>>>> ||||||||||||||||||||||||||||||  |||||||| ||||||||||||||||||| <<<<<<
    474529 caaattagaaatttggagagtcgtcaaatgcaaatgctctcagcagcattagctcaagtg 474588

    000160 gcccacctgcgataactactcaattaaagtatttaaaagctcgtcagcccaaatcctata 000219
    >>>>>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| <<<<<<
    474589 gcccacctgcgataactactcaattaaagtatttaaaagctcgtcagcccaaatcctata 474648

I've tried the following parameters:

    blat assembly.2bit input.fa -stepSize=5 -minIdentity=0 -minScore=0 -maxGap=0

but there are still entries with several gap-separated blocks, so maxGap=0 does not seem to have the effect I expected. I don't really understand why.

I also tried setting maxIntron=0:

    blat assembly.2bit input.fa -stepSize=5 -minIdentity=0 -minScore=0 -maxIntron=0

This seems to improve things a bit: at least no gaps are allowed in the target (in the output, all T_gap_count=0), but there are still gaps and mismatches in the query. It seems that maxIntron applies only to the target sequence, not to the query.

Do you have any tips for this? Of course, I can always write a script that scans the axt file and extracts all 100% identical blocks.
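
For reference, the fallback script could be sketched roughly as below. This is only a minimal illustration of the post-processing idea, not something BLAT provides: the function name `exact_blocks` and its parameters are made up for this example, reading the axt file itself is left out, and the start coordinates are simply taken as they appear in the alignment display.

```python
def exact_blocks(q_aln, t_aln, q_start=0, t_start=0, min_len=1):
    """Split one gapped pairwise alignment into maximal exact-match blocks.

    q_aln, t_aln : aligned query/target strings of equal length ('-' = gap)
    q_start, t_start : coordinate of the first base in each sequence
    min_len : drop blocks shorter than this (hypothetical length filter)
    Returns a list of (query_pos, target_pos, length) tuples.
    """
    if len(q_aln) != len(t_aln):
        raise ValueError("aligned strings must have the same length")

    blocks = []
    q_pos, t_pos = q_start, t_start   # current coordinates in each sequence
    run_q = run_t = 0                 # start coordinates of the current run
    run_len = 0                       # length of the current run of matches

    for q, t in zip(q_aln.lower(), t_aln.lower()):
        if q == t and q != "-":
            # Identical, ungapped column: start or extend a run.
            if run_len == 0:
                run_q, run_t = q_pos, t_pos
            run_len += 1
        else:
            # Mismatch or gap: close the current run, if any.
            if run_len and run_len >= min_len:
                blocks.append((run_q, run_t, run_len))
            run_len = 0
        # A gap character does not consume a base in that sequence.
        if q != "-":
            q_pos += 1
        if t != "-":
            t_pos += 1

    # Close a run that extends to the end of the alignment.
    if run_len and run_len >= min_len:
        blocks.append((run_q, run_t, run_len))
    return blocks


if __name__ == "__main__":
    query = "caaattagaaatttggagagtcgtcaaatgataatgctct-agcagcattagctcaagtg"
    target = "caaattagaaatttggagagtcgtcaaatgcaaatgctctcagcagcattagctcaagtg"
    for block in exact_blocks(query, target, q_start=100, t_start=474529):
        print(block)
```

Applied to the first row of the example, this yields three blocks of 30, 8 and 19 bases; with `min_len=11` the 8-base block is dropped. Blocks that continue across row breaks in the display would only be joined if the whole alignment is passed in as one pair of strings.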

Thanks,
Xianjun

--
---------------------------
Sterding (Xianjun) Dong
PhD student, Boris Lenhard's group
Bergen Center of Computational Science
Bergen University, Norway
Mobile: 0047-47361688
Telephone: 0047-55276381
Skype: xianjun.dong
