You have run into the most common source of confusion:
files and tables use 0-based, half-open coordinates,
whereas the browser displays 1-based, closed positions to users:

see
  http://genome.ucsc.edu/FAQ/FAQformat.html#format1

 >> For a short sequence (7nt) [...]
 >> The following example shows the problem.
 >>
 >>   content of the inputfile containing one human seq location on hg19:
 >>
 >>   chr11          120899760       120899766       pos1     1       +

This actually describes a 6 nt sequence:
chromStart is 0-based (the first base in the chromosome is 0),
and the base at your chromEnd (120899766) is not included.

What you probably wanted to put in your bed file is:
      chr11          120899759       120899766       pos1     1       +
which would be 7 bases long.
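
To make the arithmetic concrete, here is a small Python sketch of the conversion between the two conventions. The function names are my own, made up for illustration; they are not part of liftOver or any UCSC tool.

```python
# Hypothetical helpers (not part of any UCSC tool) illustrating
# 0-based half-open BED coordinates vs. 1-based closed browser positions.

def bed_length(chrom_start, chrom_end):
    """Number of bases covered by a BED interval [chrom_start, chrom_end)."""
    return chrom_end - chrom_start

def bed_to_browser(chrom, chrom_start, chrom_end):
    """BED interval -> position string as typed into the browser."""
    return f"{chrom}:{chrom_start + 1}-{chrom_end}"

def browser_to_bed(chrom, first, last):
    """1-based closed range (first..last) -> BED (chrom, chromStart, chromEnd)."""
    return chrom, first - 1, last

# The line from the original input file covers only 6 bases:
print(bed_length(120899760, 120899766))

# The corrected line covers 7 bases and matches the browser range:
print(bed_length(120899759, 120899766))
print(bed_to_browser("chr11", 120899759, 120899766))
```

Only the start changes between the two conventions; the end is the same number in both.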

So the minMatch=1 liftOver option does work.

-Galt

6/1/2011 10:20 PM, Maximilian Haussler:
> Hi Mohsen,
>
> I see, my solution was not correct. I actually don't understand why
> minMatch=1 doesn't give you the right answer and am forwarding your
> question to the mailing list again...
>
> cheers
> Max
>
> ---------- Forwarded message ----------
> From: Mohsen Sabouri<[email protected]>
> Date: Wed, Jun 1, 2011 at 6:18 PM
> Subject: RE: [Genome] finding conserved positions with 100% seq.
> identity on a different species genome
> To: Maximilian Haussler<[email protected]>
>
>
> Hi Max,
>
> Thanks for your help. I downloaded pslCDnaFilter. It takes BLAT
> outputs that are in psl format. However, I am using liftOver for
> genome conversion between human and mouse, where the outputs are in BED
> format. Is there any way that I can pipe the output of liftOver (batch
> coordinate conversion) to pslCDnaFilter or any other filter that can
> do the job?
>
> Many thanks!
>
> Mohsen Sabouri, PhD
> The Scripps Research Institute
> 10550 N. Torrey Pines Road
> La Jolla, CA 92037
> ________________________________________
> From: Maximilian Haussler [[email protected]]
> Sent: Tuesday, May 31, 2011 8:35 PM
> To: Mohsen Sabouri; genome
> Subject: Re: [Genome] finding conserved positions with 100% seq.
> identity on a different species genome
>
> Hi Mohsen,
>
> the pslCDnaFilter program has an option -minId which you could set to
> 1.00 to remove the non-identical alignments. Does this solve your
> problem?
>
> cheers
> Max
>
>
>
>
>
>
> On Tue, May 31, 2011 at 10:11 PM, Mohsen Sabouri<[email protected]>  wrote:
>> Hi
>>
>> For a short sequence (7nt) in human genome assembly, hg19, I want to find
>> the corresponding 7nt sequence in Mouse assembly mm9, with 100% sequence
>> identity to my human 7nt sequence (assuming it's 100% conserved).
>>   I am using liftover. For some of my human sequences liftover points to
>> locations on Mouse that are not 100% identical to my original human
>> sequence. Is there a parameter in liftover that can be adjusted for this?
>>
>> The following example shows the problem.
>>
>>   content of the inputfile containing one human seq location on hg19:
>>
>>   chr11          120899760       120899766       pos1     1       +
>>
>>   the sequence is : CACTTTA
>>
>>   liftover command line used:
>>
>>   ./liftOver -minMatch=1 -multiple -minSizeT=7 -minSizeQ=7 -bedPlus=6 
>> inputfile hg19ToMm9.over.chain outputfile unmapped
>>
>>   content of the outputfile produced by liftover containing corresponding 
>> conserved mouse coordinates in mm9 assembly is:
>>
>>   chr9           42275345          42275351           pos1     1       -
>>
>>   when I plug the new coordinates in the genome browser, mm9, it shows 
>> content of this sequence as:
>>
>>   TAAGGAG
>>
>>   which is not 100% match with my original human seq.
>>
>>   Ideally if there is no 100% identity, then the outputfile should be empty.
>>
>>   Is there any way to fix this?
>>
>>   Many thanks!
>>
>>
>> Mohsen Sabouri, PhD
>> The Scripps Research Institute
>> 10550 N. Torrey Pines Road
>> La Jolla, CA 92037
>> _______________________________________________
>> Genome maillist  -  [email protected]
>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>>
