Hi Harry,

On each line of your pasted example, the start coordinate equals the end coordinate, and in BED format that specifies a 0-length region, which liftOver is unable to map. BED start coordinates are 0-based; end coordinates are also 0-based but exclusive (one past the last base of the region), so they look like 1-based coordinates. If you subtract 1 from each start coordinate to get 1-bp regions like this, and resubmit to hgLiftOver:

chr20 25319 25320 0
chr20 38033 38034 1
chr20 46944 46945 2
chr20 59999 60000 3
chr20 67501 67502 4
chr20 67847 67848 5
chr20 73107 73108 6
chr20 76762 76763 7
chr20 76767 76768 8

-- well, two of the regions map successfully:

chrUn.004.93 32682 32683 0 1
chr20 2974877 2974878 3 1

and the rest are "deleted in new", i.e. not successfully mapped to bosTau4.

I extracted the UMD3 sequence at chr20:46901-47000 and aligned it with blat to bosTau4, and it looks like UMD3 chr20 base 46945 falls in a 6bp gap in the alignment of chr20:46901-47000 to bosTau4 chrUn.004.1977:25404-25496. The assemblies are different enough to make mapping tougher than it usually is between successive assemblies from the same source.

If the 0 lengths were intentional, perhaps you could consider expanding them to 2 bases so liftOver can get a toehold?

Hope that helps,
Angie

> At 1/19/2010 8:57 AM, Harry Noyes wrote:
> > Dear UCSC
> > I am struggling with liftover on a local installation to convert from
> > cow UMD3 to cow Bta4.
> > I have downloaded the program and the over chain files.
> > I have an input file with 11 million lines in tab separated format
> > (chr, start, end id) called boranMD3.bed:
> > 20 25320 25320 0
> > 20 38034 38034 1
> > 20 46945 46945 2
> > 20 60000 60000 3
> > 20 67502 67502 4
> > 20 67848 67848 5
> > 20 73108 73108 6
> > 20 76763 76763 7
> > 20 76768 76768 8
> >
> > I run the command
> > ./liftOver boranMD3.bed bosTauMd3ToBosTau4.over.chain boranBt4.bed
> > boranUnmapped.bed
> >
> > On completion boranBt4.bed is empty and boranUnmapped.bed has all 11
> > million lines with the comment "Deleted in New".
> > I have tried on the website (original assembly: Aug 2009 (UMD3); new
> > assembly: Oct 2007) with the first 12 lines of data and that gives the
> > same result. Also on the website I have tried adding a "chr" prefix to
> > the chromosome name ie chr20 but that did not work and I have tried
> > reducing the ratio of bases that must remap to 0.1.
> > All these fail with the same error for each line: "Deleted in New".
> > I also tried using a single space as a delimiter but it complained
> > that it was not in BED format.
> >
> > Any idea what I might be doing wrong?
> >
> > Thanks
> >
> > Harry
> >
> >
> > Harry Noyes
> > Room 231 BioSciences Building
> > University of Liverpool
> > Crown Street
> > Liverpool
> > L69 7ZB
> > 0151 795 4512
> > www.genomics.liv.ac.uk/tryps
> >
> >
> > _______________________________________________
> > Genome maillist - [email protected]
> > https://lists.soe.ucsc.edu/mailman/listinfo/genome
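
For an input as large as the 11-million-line file described in the quoted message, the start-coordinate adjustment suggested in the reply could be scripted rather than done by hand. Below is a minimal sketch in Python, not a UCSC tool. The function name `to_one_base_regions` and its `add_chr_prefix` option are hypothetical. It assumes that every record with start equal to end is meant as a single 1-based position, that fields are whitespace-separated, and that a "chr" prefix is wanted on chromosome names (as in the example regions in the reply).

```python
def to_one_base_regions(lines, add_chr_prefix=True):
    """Yield tab-separated BED lines, widening zero-length records to 1 bp.

    Assumption: a record with start == end marks a single 1-based position,
    so subtracting 1 from the start gives the equivalent BED interval
    (0-based start, exclusive end).
    """
    for line in lines:
        fields = line.split()  # split on any whitespace
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if add_chr_prefix and not chrom.startswith("chr"):
            chrom = "chr" + chrom
        # Only zero-length records are changed; a start of 0 cannot be
        # decremented, so such a record is passed through as-is.
        if start == end and start > 0:
            start -= 1
        yield "\t".join([chrom, str(start), str(end)] + fields[3:])


# Example with the first two records from the quoted input file.
sample = ["20 25320 25320 0", "20 38034 38034 1"]
converted = list(to_one_base_regions(sample))
```

To process a whole file, the same generator can be fed an open file handle and its output written line by line to a new BED file, which keeps memory use low for very large inputs. Records that already span one or more bases are left unchanged.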
