Hi Mohsen,

I see, my solution was not correct. I actually don't understand why -minMatch=1 doesn't give you the right answer, so I am forwarding your question to the mailing list again.

cheers
Max

---------- Forwarded message ----------
From: Mohsen Sabouri <[email protected]>
Date: Wed, Jun 1, 2011 at 6:18 PM
Subject: RE: [Genome] finding conserved positions with 100% seq. identity on a different species genome
To: Maximilian Haussler <[email protected]>

Hi Max,

Thanks for your help. I downloaded pslCDnaFilter. It takes BLAT output in PSL format. However, I am using liftOver for genome conversion between human and mouse, where the output is in BED format. Is there any way that I can pipe the output of liftOver (batch coordinate conversion) to pslCDnaFilter or any other filter that can do the job?

Many thanks!

Mohsen Sabouri, PhD
The Scripps Research Institute
10550 N. Torrey Pines Road
La Jolla, CA 92037
________________________________________
From: Maximilian Haussler [[email protected]]
Sent: Tuesday, May 31, 2011 8:35 PM
To: Mohsen Sabouri; genome
Subject: Re: [Genome] finding conserved positions with 100% seq. identity on a different species genome

Hi Mohsen,

the pslCDnaFilter program has an option -minId which you could set to 1.00 to remove the non-identical alignments. Does this solve your problem?

cheers
Max

On Tue, May 31, 2011 at 10:11 PM, Mohsen Sabouri <[email protected]> wrote:
> Hi
>
> For a short sequence (7nt) in the human genome assembly, hg19, I want to find the
> corresponding 7nt sequence in the mouse assembly, mm9, with 100% sequence identity
> to my human 7nt sequence (assuming it's 100% conserved).
> I am using liftOver. For some of my human sequences liftOver points to
> locations on mouse that are not 100% identical to my original human sequence.
> Is there a parameter in liftOver that can be adjusted for this?
>
> The following example shows the problem.
>
> Content of the input file containing one human sequence location on hg19:
>
> chr11 120899760 120899766 pos1 1 +
>
> The sequence is: CACTTTA
>
> liftOver command line used:
>
> ./liftOver -minMatch=1 -multiple -minSizeT=7 -minSizeQ=7 -bedPlus=6
> inputfile hg19ToMm9.over.chain outputfile unmapped
>
> Content of the output file produced by liftOver, containing the corresponding
> conserved mouse coordinates in the mm9 assembly:
>
> chr9 42275345 42275351 pos1 1 -
>
> When I plug the new coordinates into the genome browser, mm9, it shows the content
> of this sequence as:
>
> TAAGGAG
>
> which is not a 100% match with my original human sequence.
>
> Ideally, if there is no 100% identity, then the output file should be empty.
>
> Is there any way to fix this?
>
> Many thanks!
>
>
> Mohsen Sabouri, PhD
> The Scripps Research Institute
> 10550 N. Torrey Pines Road
> La Jolla, CA 92037
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
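
A possible workaround for the question above, sketched in Python: liftOver's -minMatch is documented as the minimum ratio of bases that must remap, which is a statement about coordinates and not about base-level identity, so a separate check on the sequences themselves may be needed. The snippet below is only an illustration under stated assumptions: the helper names (reverse_complement, identity, is_identical_after_lift) are made up for this example, the sequences are the ones quoted in the thread, and it assumes the browser shows the mouse sequence on the + strand, so a hit reported on the - strand is reverse-complemented before comparing. Fetching the sequences for each BED line is left out.

```python
# Minimal sketch: post-filter lifted intervals by exact sequence identity.
# All function names here are hypothetical helpers, not part of liftOver.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def identity(a, b):
    """Fraction of positions at which two equal-length sequences agree.

    Returns 0.0 for empty input or sequences of different length.
    """
    if not a or len(a) != len(b):
        return 0.0
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return matches / len(a)


def is_identical_after_lift(source_seq, target_plus_seq, target_strand):
    """Check whether the lifted region is a 100% match to the source.

    target_plus_seq is assumed to be the target sequence as read on the
    + strand; for a '-' strand hit it is reverse-complemented first.
    """
    if target_strand == "-":
        oriented = reverse_complement(target_plus_seq)
    else:
        oriented = target_plus_seq
    return identity(source_seq, oriented) == 1.0


if __name__ == "__main__":
    human = "CACTTTA"       # hg19 sequence from the thread
    mouse_plus = "TAAGGAG"  # mm9 sequence as shown in the browser
    # Orient the mouse sequence to the reported '-' strand, then compare.
    print(reverse_complement(mouse_plus))
    print(is_identical_after_lift(human, mouse_plus, "-"))
```

With a check like this, lines of the liftOver output file whose sequence is not identical to the source could be dropped, which would leave the output empty for the example in the thread.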
