Hi Lysander,

I asked one of our engineers about this, and this is what she had to say:
"Our .over.chain files are extracted from nets + chains by the 
netChainSubset program. Basically, they are the chain segments that end 
up at the top of the nets. Top-level chains can become fragmented if 
there are other chains that fill in for their gaps: netChainSubset will 
first write out the portion of the top chain up to the gap, then write 
out the (portion of the) chain that fills the gap, then write the next 
portion of the top chain after the gap.

For a long time, we have been post-processing netChainSubset's output 
with the chainStitchId program which is supposed to link the chain 
fragments back together. Hopefully, the user is seeing the fragments in 
a really old .over.chain (or one that wasn't created by our automated 
method). If so, running chainStitchId should help (or running [our] 
liftOver)."

So I think the file you downloaded may be an older one, created before 
we started running the chainStitchId program on our .over.chain files. 
GATK's liftover appears to reject repeated chain IDs, so it probably 
needs this stitching step done first. I would suggest either using our 
liftOver (which won't throw an error on this file) or running our 
chainStitchId utility on the file before using the GATK liftOver.
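
To check whether a given chain file is affected, before or after 
stitching, something like the following Python sketch could be used. It 
counts how often each chain ID appears on the header lines. It assumes 
the usual chain format, where a header line starts with "chain" and 
carries the ID as its 13th and last field; the function name and the 
sample lines are made up for illustration and are not taken from the 
real hg18ToPanTro2.over.chain file.

```python
from collections import Counter


def duplicate_chain_ids(lines):
    """Return {chain_id: count} for IDs found on more than one header line.

    Assumes header lines of the form:
    chain score tName tSize tStrand tStart tEnd qName qSize qStrand qStart qEnd id
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Only header lines have 13 fields and start with the word "chain";
        # alignment data lines have 1 or 3 numeric fields and are skipped.
        if len(fields) == 13 and fields[0] == "chain":
            counts[fields[12]] += 1
    return {chain_id: n for chain_id, n in counts.items() if n > 1}


# Hypothetical sample: ID 18 appears on two separate header lines,
# as it would for a top-level chain split around a gap-filling chain.
sample = [
    "chain 1000 chr1 247249719 + 100 200 chr1 229974691 + 150 250 18",
    "100",
    "",
    "chain 900 chr1 247249719 + 300 350 chr1 229974691 + 400 450 7",
    "50",
    "",
    "chain 1000 chr1 247249719 + 400 500 chr1 229974691 + 450 550 18",
    "100",
]

print(duplicate_chain_ids(sample))  # {'18': 2}
```

For a real file, pass the open file object instead of the sample list, 
for example duplicate_chain_ids(open("hg18ToPanTro2.over.chain")). An 
empty result would suggest that every chain ID is defined only once.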

You can get chainStitchId by downloading and building our source tree. 
Some info on that process can be found here: 
http://hgdownload.cse.ucsc.edu/downloads.html#source_downloads

If you need more assistance, feel free to mail the browser list again.

-
Greg Roe
UCSC Genome Browser

On 1/23/11 9:55 AM, [email protected] wrote:
> Dear UCSC team,
>
> I tried to do a liftover from hg18 to panTro2 using GATK and the
> hg18ToPanTro2.over.chain file downloaded from
> http://hgdownload.cse.ucsc.edu/goldenPath/hg18/liftOver/
> In the above mentioned file, chain ID '18' is defined multiple times
> which results in an error when running GATK's liftover. Thus, I was
> wondering whether it is intentional that ID '18' is used multiple times?
>
> Any advice is greatly appreciated.
> Thank you for your help in advance.
> _______________________________________________
> Genome maillist  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome