I am working with Xianjun offline. If anything useful presents itself, I'll add a note to this thread. -Galt
Xianjun Dong wrote:
> Hi, Galt
>
> I guess I am asking for help at the wrong time :)
>
> Actually, my question can be quite simple: How can I set BLAT to get
> gap-free output?
>
> I am running BLAT to find all identical (100%) parts with minLength>30bp
> in the query DNA sequence against the target DNA sequences. Going by the
> explanation of the BLAT options, the following setting should work, but
> it does not.
>
> blat assembly.2bit input.fa -stepSize=5 -minIdentity=100 -minScore=30
> -maxIntron=0
>
> Regards,
>
> Xianjun
>
>
> Galt Barber wrote:
>> Please read the FAQ again, paying particular attention to pslCdnaFilter
>> and pslReps.
>>
>> Can you say more about what you are doing, and how many seqs of what
>> type and size you might have in your qry and target?
>>
>> I will be back in the office on Mon and can look more closely at your
>> question then.
>>
>> Sent from my iPhone
>>
>> On Oct 30, 2009, at 7:03 AM, Xianjun Dong <[email protected]>
>> wrote:
>>
>>> Hi,
>>>
>>> This might be a naive question, but we have some questions about the
>>> parameters of BLAT.
>>>
>>> We want to get all identical (100% matched) blocks between the query and
>>> target sequences, which means we don't want blocks with gaps or
>>> mismatches inside. How can we make BLAT output hits without gaps or
>>> mismatches, which means the blockSize=1, and mismatch=0?
>>>
>>> Did I explain this clearly? For example, the following block is expected
>>> to be separated into 3 blocks (or 2 if the minScore>10, for example).
>>> How can I do that in BLAT, without doing a sliding window scan?
>>>
>>> 000100 caaattagaaatttggagagtcgtcaaatgataatgctct-agcagcattagctcaagtg 000159
>>> >>>>>> ||||||||||||||||||||||||||||||  |||||||| ||||||||||||||||||| <<<<<<
>>> 474529 caaattagaaatttggagagtcgtcaaatgcaaatgctctcagcagcattagctcaagtg 474588
>>>
>>> 000160 gcccacctgcgataactactcaattaaagtatttaaaagctcgtcagcccaaatcctata 000219
>>> >>>>>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| <<<<<<
>>> 474589 gcccacctgcgataactactcaattaaagtatttaaaagctcgtcagcccaaatcctata 474648
>>>
>>>
>>> I've tried the following parameters:
>>> blat assembly.2bit input.fa -stepSize=5 -minIdentity=0 -minScore=0
>>> -maxGap=0
>>> but there are still entries with several gap-separated blocks. It seems
>>> that maxGap=0 does not work either. I don't really understand why.
>>>
>>> I also tried setting maxIntron=0. This seems to improve things a bit: at
>>> least no gap is allowed in the target (in the output, all
>>> T_gap_count=0), but there are still gaps/mismatches in the query. It
>>> seems maxIntron is only designed for the target sequence, not for the
>>> query.
>>> blat assembly.2bit input.fa -stepSize=5 -minIdentity=0 -minScore=0
>>> -maxIntron=0
>>>
>>> Do you guys have any tips for this?
>>>
>>> Of course, I can always write a script to scan the axt file and parse
>>> out all 100% identical blocks.
>>>
>>> Thanks,
>>>
>>> Xianjun
>>>
>>> --
>>> ---------------------------
>>> Sterding (Xianjun) Dong
>>> PhD student, Boris Lenhard's group
>>> Bergen Center of Computational Science
>>> Bergen University, Norway
>>> Mobile: 0047-47361688
>>> Telephone: 0047-55276381
>>> Skype: xianjun.dong
>>>
>>> _______________________________________________
>>> Genome maillist - [email protected]
>>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
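The scripting fallback mentioned at the end of the original question (scanning the alignment for 100% identical stretches) can be sketched roughly as below. This is only an illustration, not something BLAT provides and not a method confirmed anywhere in this thread. The function name `exact_blocks`, its `min_len` parameter, and the returned 0-based offsets (counted from the start of the aligned region, not genome coordinates) are all assumptions made for the example. The input is the pair of aligned strings from the example alignment above, with the two rows joined.

```python
def exact_blocks(query_aln, target_aln, min_len=30, gap="-"):
    """Return (q_start, t_start, length) for each maximal run of columns
    that are identical and gap-free in both aligned strings.

    Offsets are 0-based and count only non-gap bases, relative to the
    start of the aligned region (hypothetical convention for this sketch).
    """
    if len(query_aln) != len(target_aln):
        raise ValueError("aligned strings must have equal length")

    blocks = []
    q_pos = t_pos = 0          # ungapped offsets into query / target
    run_len = 0                # length of the current exact-match run
    run_q = run_t = 0          # offsets where the current run started

    for q, t in zip(query_aln, target_aln):
        is_match = q != gap and t != gap and q.lower() == t.lower()
        if is_match:
            if run_len == 0:
                run_q, run_t = q_pos, t_pos
            run_len += 1
        else:
            # A mismatch or a gap closes the current run.
            if run_len >= min_len:
                blocks.append((run_q, run_t, run_len))
            run_len = 0
        # Gap characters do not consume a base on their own strand.
        if q != gap:
            q_pos += 1
        if t != gap:
            t_pos += 1

    if run_len >= min_len:
        blocks.append((run_q, run_t, run_len))
    return blocks


# The two alignment rows from the example in this thread, joined.
query_aln = (
    "caaattagaaatttggagagtcgtcaaatgataatgctct-agcagcattagctcaagtg"
    "gcccacctgcgataactactcaattaaagtatttaaaagctcgtcagcccaaatcctata"
)
target_aln = (
    "caaattagaaatttggagagtcgtcaaatgcaaatgctctcagcagcattagctcaagtg"
    "gcccacctgcgataactactcaattaaagtatttaaaagctcgtcagcccaaatcctata"
)

for q_start, t_start, length in exact_blocks(query_aln, target_aln, min_len=30):
    print(q_start, t_start, length)
```

With `min_len=1` the example splits into three blocks (30, 8 and 79 columns); with `min_len=30` the 8-column block is dropped, which matches the "3 blocks, or 2 with a minimum score" expectation in the question. Reading the aligned strings out of an actual axt file is left out here.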
