Hello Alexander,

Please see this previously answered mailing list question, "How does LiftOver Work?":
https://lists.soe.ucsc.edu/pipermail/genome/2008-March/015810.html

Hopefully this should answer some of your questions. If you still require assistance, please write back to us.

Kayla Smith
UCSC Genome Bioinformatics Group

----- "Alexander Stark" <[email protected]> wrote:

> Hi all,
>
> we're using liftOver quite extensively to translate coordinates
> between different species. In general, it seems to work quite well
> for us and the results typically make sense when we inspect them
> visually.
>
> However, sometimes we run into problems, especially for
> coordinate-conversions between more distantly related species.
> Unfortunately, we could not find a more detailed description of how
> liftOver works (apart from the short help it prints) and what the
> command line parameters do - we hope someone can help.
>
> It is our understanding that liftOver essentially uses the UCSC
> alignments (or the underlying data) for the conversions. This should
> mean that any input region can map to 0, 1, or several contiguous
> regions in the target genome, that the region length can change, and
> that only a certain fraction of the input nucleotides correspond to
> (i.e. map to) target nucleotides.
>
> We assume that the behavior of liftOver with respect to these can be
> controlled using the following parameters:
>
> -minMatch=0.N  Minimum ratio of bases that must remap. Default 0.95
> -minBlocks=0.N Minimum ratio of alignment blocks/exons that must map
>                (default 1.00)
> -fudgeThick    If thickStart/thickEnd is not mapped, use the closest
>                mapped base. Recommended if using -minBlocks.
> -multiple      Allow multiple output regions
> -minChainT, -minChainQ Minimum chain size in target/query, when
>                mapping to multiple output regions (default 0, 0)
>
> Could you please give some details on what exactly the parameters do?
> This is very important for us to know in order to use the tool
> appropriately. For example:
>
> 1. What does "remap" mean for the minMatch parameter?
> Is it the fraction of input bases that have a target counterpart,
> i.e. that would appear aligned in a sequence alignment (or is it the
> fraction of target-bases that have an input counterpart)?
>
> When relaxing this parameter, we typically get more lifted regions.
> Are these however still orthologous/unique or will we run into a
> specificity problem? I understand that liftOver only uses a
> pre-computed alignment (or coordinate lookup-table) that - in
> principle - only contains alignments between orthologous regions. In
> other words, I do NOT expect liftOver to simply find more and more
> "matches" that make less and less sense as e.g. blast would do when
> lowering its specificity.
>
> 2. How does the minMatch parameter influence the growing & shrinking
> of region-length?
> Does a more relaxed minMatch parameter allow for more variable
> region-length between input and target regions? In other words: if it
> only assesses the fraction of input nucleotides that have a
> counterpart, the region can grow freely but not shrink and vice versa.
>
> 3. Will we "lose" regions?
> When lowering minMatch, will regions that are uniquely mapped with a
> stringent minMatch parameter map to multiple regions/blocks and thus
> become unmapped?
>
> 4. Does "multiple" allow that an input region spans multiple output
> blocks or does it allow non-unique mapping (of the same region)?
>
> 5. What do minChainT and minChainQ mean (i.e. what is a chain size,
> etc.)?
>
> 6. What does minBlocks do? Does it apply to regions that span multiple
> alignment blocks and require that the same number of alignment blocks
> must be in the input and target region?
>
> Very many thanks for your help in advance and sorry for all the
> questions.
>
> Best,
>
> Alex
>
> **********
> Alexander Stark, PhD
> Group Leader
> Institute of Molecular Pathology (IMP)
> Dr. Bohr-Gasse 7; 1030 Vienna
> Austria
>
> Tel. +43 (1) 79730-3380
> [email protected]
> http://www.imp.ac.at/research/alexander-stark/
>
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
