Good Morning Peter:

Since you are trying to find multiple mappings, you may find it more
convenient to map your features to the other assembly with blat
rather than with the liftOver chains.  The liftOver chains may have
already eliminated matches that would be better for your specific
features.

One work-around for the situation you mention below would be to change
your input file to combine the dataValue with the name string:
<chr> <start> <end> <name>_<dataValue>

You can then reverse that combination after the lift to separate
the name from the dataValue.
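
As a rough illustration of that combine-and-split step, here is a minimal
Python sketch. The function names are made up for this example, and it
assumes whitespace-separated BED lines and a dataValue that does not itself
contain an underscore (the name may). It is not part of liftOver.

```python
def pack_bed5(line, sep="_"):
    """Fold column 5 (dataValue) into column 4 (name) of a BED5 line."""
    chrom, start, end, name, value = line.split()[:5]
    return "\t".join([chrom, start, end, f"{name}{sep}{value}"])


def unpack_lifted(line, sep="_"):
    """Split the combined name of a lifted line back into name and dataValue.

    Any columns after the name (for example the multiplicity that
    liftOver -multiple writes in column 5) are kept after the dataValue.
    """
    fields = line.split()
    # Split on the LAST separator so names that contain "_" survive intact.
    name, value = fields[3].rsplit(sep, 1)
    return "\t".join(fields[:3] + [name, value] + fields[4:])


if __name__ == "__main__":
    # Before the lift: BED5 -> BED4 with the value carried in the name.
    print(pack_bed5("chr1\t100\t200\tpeak_7\t0.25"))
    # After the lift: a hypothetical output line with new coordinates
    # and the multiplicity in column 5.
    print(unpack_lifted("chr1\t1100\t1200\tpeak_7_0.25\t1"))
```

Run on the lifted file, the second function gives lines of the form
<chr> <start> <end> <name> <dataValue> <multiplicity>.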

--Hiram

----- Original Message -----
From: "Peter Ebert" <[email protected]>
To: [email protected]
Sent: Tuesday, January 3, 2012 2:11:15 AM
Subject: Re: [Genome] liftOver -multiple BED4/5/6

Hi Luvina,

thank you very much for the detailed information on liftOver. Regarding the
-multiple option, I noticed the following:
I am trying to lift BED files of the following format between assemblies:
<chr>   <start> <end>   <name>  <dataValue>
The output of liftOver is a BED file as follows:
<chr>   <start> <end>   <name>  <multiplicity>
So, liftOver discards the data value in column 5 and replaces it with the 
value of the multiplicity for the mapped region (i.e. 1, 2, 3, 4 and so on for 
multiple output regions). Is it possible to set a liftOver parameter in order 
to get a file like
<chr>   <start> <end>   <name>  <dataValue>     <multiplicity>
Since this would still be a 6-column BED file, it should not conflict with
the restrictions of the -multiple option, should it?
Thanks for your help.

Best,
Peter

On Tuesday 29 November 2011 20:49:49 Luvina Guruvadoo wrote:
> Hi Peter,
> 
> One of our engineers mentioned the BED 4-6 option was added to the
> multiple option because this was requested by ENCODE users. However, for
> BED12, he says "[looking at the code] if the range find on the chain
> returns two or more top-scoring chains of the same score, it calls it
> 'duplicated-in-new' and returns without outputting anything else. In
> theory this could be relaxed so that multiple qualifying chains would be
> output. Obviously, it might be harder for the user to analyze the
> results if there are many chains. I suppose a different way would be to
> let the user choose the topN high-scoring chains as a way to deal with
> regions with many hundreds or thousands of hits."
> 
> Please contact us again at [email protected] if you have any further
> questions.
> 
> ---
> Luvina Guruvadoo
> UCSC Genome Bioinformatics Group
> 
> On 11/17/2011 6:34 AM, Peter Ebert wrote:
> > Hi,
> > liftOver informed me that, when using the -multiple option, it can only
> > use files in BED4/5/6 format as input. Is that a technical (=
> > programming related) limitation or is there a more complicated
> > explanation? It would be nice if someone could elaborate on that issue.
> > Thanks in advance.
> > Cheers,
> > Peter
> > _______________________________________________
> > Genome maillist  -  [email protected]
> > https://lists.soe.ucsc.edu/mailman/listinfo/genome
