Hi Lysander,

I asked one of our engineers about this, and this is what she had to say:

"Our .over.chain files are extracted from nets + chains by the netChainSubset program. Basically, they are the chain segments that end up at the top of the nets. Top-level chains can become fragmented if there are other chains that fill in for their gaps: netChainSubset will first write out the portion of the top chain up to the gap, then write out the (portion of the) chain that fills the gap, then write the next portion of the top chain after the gap.
For a long time, we have been post-processing netChainSubset's output with the chainStitchId program which is supposed to link the chain fragments back together. Hopefully, the user is seeing the fragments in a really old .over.chain (or one that wasn't created by our automated method). If so, running chainStitchId should help (or running [our] liftOver)."

So, I think the file you downloaded might be an older one, created before we started running chainStitchId on these files. GATK's liftover might therefore require that step to be done first. I would suggest either using our liftOver (which won't throw an error on this file) or running our chainStitchId utility on the file before using the GATK liftover. You can get chainStitchId by downloading and building our source tree. Some info on that process can be found here:
http://hgdownload.cse.ucsc.edu/downloads.html#source_downloads

If you need more assistance, feel free to mail the browser list again.

- Greg Roe
UCSC Genome Browser

On 1/23/11 9:55 AM, [email protected] wrote:
> Dear UCSC team,
>
> I tried to do a liftover from hg18 to panTro2 using GATK and the
> hg18ToPanTro2.over.chain file downloaded from
> http://hgdownload.cse.ucsc.edu/goldenPath/hg18/liftOver/
> In the above mentioned file, chain ID '18' is defined multiple times
> which results in an error when running GATK's liftover. Thus, I was
> wondering whether it is intentional that ID '18' is used multiple times?
>
> Any advice is greatly appreciated.
> Thank you for your help in advance.
> _______________________________________________
> Genome maillist [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
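
As a quick way to check whether a given .over.chain file still has the repeated chain IDs discussed in this thread, here is a minimal Python sketch. It is not a UCSC tool: the function name `duplicate_chain_ids` and the sample lines are made up for illustration. It assumes the standard chain format, in which each header line starts with the word `chain` and carries the chain ID as its 13th whitespace-separated field.

```python
from collections import Counter


def duplicate_chain_ids(lines):
    """Return {chain_id: count} for IDs found on more than one header line.

    `lines` is any iterable of text lines, e.g. a list or an open file.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Header layout (13 fields):
        # chain score tName tSize tStrand tStart tEnd
        #       qName qSize qStrand qStart qEnd id
        if len(fields) == 13 and fields[0] == "chain":
            counts[fields[12]] += 1
    return {chain_id: n for chain_id, n in counts.items() if n > 1}


# Hypothetical sample: chain 18 is split around a gap filled by chain 19,
# so its header appears twice. The coordinates are invented.
sample_lines = [
    "chain 1000 chr1 1000 + 0 100 chr1 900 + 0 100 18",
    "100",
    "",
    "chain 500 chr1 1000 + 200 250 chr2 800 + 10 60 19",
    "50",
    "",
    "chain 1000 chr1 1000 + 300 400 chr1 900 + 300 400 18",
    "100",
]

for chain_id, n in sorted(duplicate_chain_ids(sample_lines).items()):
    print(f"chain ID {chain_id} appears {n} times")
```

To scan a real file, pass an open handle to the (decompressed) chain file instead of `sample_lines`. An empty result would suggest that the fragments have already been joined, though this check only looks at header IDs and does not validate the alignment blocks.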
