Hi,

After using the web version of the liftOver tool to convert several thousand 60mer sequences from Zebrafish assembly Zv7 to Zv8, I spot-checked the results with BLAT and found perfect matches that are missing from the liftOver output, as detailed below. I can imagine reasons why this might be expected at some frequency (e.g. a minimum length requirement for a genomic region in one assembly to correspond to a region in another), but I have not been able to find documentation on the liftOver utility that is detailed enough to make a qualified assessment in this regard.

I would be grateful for more information on the liftOver tool that can shed light on this issue.

Thanks much and kind regards,
Sebastian

_____________________________________________________
Sebastian Hoersch
Koch Institute for Integrative Cancer Research at MIT
Bioinformatics and Computing Core
77 Massachusetts Avenue (E18-366)
Cambridge, MA 02139
phone: 1-617-324-1728
email: [email protected]

Examples:

(1) Sequence with one perfect BLAT match in Zv7 and two perfect BLAT matches in Zv8 – liftOver returns only one match (despite allowing multiple output regions):

>P07475181
AAAAAATGTAAATAACGTGGGAAAAATCCTTGTTAAATTGTTAACGTGATCCTTGCTGAA

Zv7: BLAT Search Results

ACTIONS          QUERY      SCORE  START  END  QSIZE  IDENTITY  CHRO  STRAND  START     END       SPAN
------------------------------------------------------------------------------------------------------
browser details  P07475181  60     1      60   60     100.0%    25    -       11982433  11982492  60

Zv8: BLAT Search Results

ACTIONS          QUERY      SCORE  START  END  QSIZE  IDENTITY  CHRO  STRAND  START     END       SPAN
------------------------------------------------------------------------------------------------------
browser details  P07475181  60     1      60   60     100.0%    25    -       21776926  21776985  60
browser details  P07475181  60     1      60   60     100.0%    25    -       21878781  21878840  60

liftOver results with checkbox “Allow multiple output regions:” checked (BED format):

chr25 21776926 21776985 - 1

Note: Based on SelfChain data, it appears that genomic sequence surrounding the query sequence is duplicated in Zv8, but not in Zv7.

(2) Sequence with one perfect BLAT match in Zv7 and in Zv8 – liftOver fails with “#Deleted in new”:

>P01179900
AAGAAACTGAGAGAACAAATCAAGGAAAAAAATGACAATCTGCAGAGAGAGAACTTCCAT

Zv7: BLAT Search Results

ACTIONS          QUERY      SCORE  START  END  QSIZE  IDENTITY  CHRO              STRAND  START     END       SPAN
------------------------------------------------------------------------------------------------------------------
browser details  P01179900  60     1      60   60     100.0%    1                 +       28694772  28694831  60
browser details  P01179900  27     1      31   60     93.6%     Zv7_scaffold2625  -       179464    179494    31

Zv8: BLAT Search Results

ACTIONS          QUERY      SCORE  START  END  QSIZE  IDENTITY  CHRO              STRAND  START     END       SPAN
------------------------------------------------------------------------------------------------------------------
browser details  P01179900  60     1      60   60     100.0%    1                 -       35253773  35253832  60
browser details  P01179900  27     1      31   60     93.6%     14                -       7435368   7435398   31
browser details  P01179900  27     1      31   60     93.6%     Zv8_scaffold1706  -       179464    179494    31
browser details  P01179900  22     9      40   60     84.4%     2                 +       43019472  43019503  32
browser details  P01179900  20     29     50   60     95.5%     10                +       42695956  42695977  22

liftOver results in failure:

#Deleted in new
chr1:28694772-28694831
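
For anyone wanting to repeat this kind of spot check on a larger set, the comparison could be scripted roughly as below. This is only a minimal sketch: the names Region, normalize_chrom, parse_liftover_line and missing_hits are hypothetical helpers made up for illustration, not part of liftOver or BLAT. It assumes the liftOver output lines have the layout shown in example (1) (chrom, start, end, strand, then extra fields) and that the BLAT hits and liftOver regions use the same coordinate convention, as the numbers in example (1) do.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    """One genomic interval (hypothetical helper type for this sketch)."""
    chrom: str
    start: int
    end: int
    strand: str


def normalize_chrom(name: str) -> str:
    """Strip a leading 'chr' so that '25' and 'chr25' compare equal."""
    return name[3:] if name.startswith("chr") else name


def parse_liftover_line(line: str) -> Region:
    """Parse a line assumed to be laid out as: chrom start end strand [extra...]."""
    fields = line.split()
    return Region(normalize_chrom(fields[0]), int(fields[1]), int(fields[2]), fields[3])


def missing_hits(blat_hits: List[Region], lifted: List[Region]) -> List[Region]:
    """Return the BLAT hits that have no identical region in the liftOver output."""
    def key(region: Region) -> Region:
        return region._replace(chrom=normalize_chrom(region.chrom))

    lifted_keys = {key(region) for region in lifted}
    return [hit for hit in blat_hits if key(hit) not in lifted_keys]


# Data taken from example (1): two perfect Zv8 BLAT hits, one liftOver region.
zv8_blat_hits = [
    Region("25", 21776926, 21776985, "-"),
    Region("25", 21878781, 21878840, "-"),
]
lifted = [parse_liftover_line("chr25 21776926 21776985 - 1")]

missing = missing_hits(zv8_blat_hits, lifted)
for hit in missing:
    print(hit.chrom, hit.start, hit.end, hit.strand)
# prints: 25 21878781 21878840 -
```

For a case like example (2), where liftOver reports "#Deleted in new", the lifted list would simply be empty and every perfect BLAT hit would be reported as missing.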
