Hi David,

One of our engineers has provided this information in response to your 
questions:

"It is OK to give lastz a target sequence file that contains multiple 
sequences. Run lastz with the argument --help=files to see more 
information about what the target and query files can be.

A GTF file of 454 read locations does not contain enough information to 
determine what the alignment was, unless it also records the alignment in 
CIGAR format."
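
To illustrate why the alignment detail matters: a chain file describes each alignment as a series of ungapped blocks separated by gaps, and start/end coordinates alone cannot supply those. The following is a minimal Python sketch, not part of the engineer's reply, of how a CIGAR string could be turned into the "size dt dq" block lines of a chain record. It assumes a SAM-style CIGAR string (for example "10M2D5M"), with the assembled genome as the chain target and the read as the query. The function name cigar_to_chain_blocks is hypothetical, and soft/hard clipping is simply ignored.

```python
import re

def cigar_to_chain_blocks(cigar):
    """Convert a SAM-style CIGAR string into chain-style block tuples.

    Each tuple is (size, dt, dq): an ungapped block of `size` bases,
    followed by a gap of `dt` bases in the target (reference) and
    `dq` bases in the query (read). The last block is (size,) only.
    Assumption: the target is the reference genome, the query is the read.
    """
    blocks = []
    size = dt = dq = 0
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        n = int(length)
        if op in "M=X":            # aligned bases: extend or start a block
            if dt or dq:           # a gap has just ended
                if size:           # close the block that preceded the gap
                    blocks.append((size, dt, dq))
                    size = 0
                dt = dq = 0        # a leading gap (no block yet) is dropped
            size += n
        elif op in "DN":           # bases present only in the reference
            dt += n
        elif op == "I":            # bases present only in the read
            dq += n
        # S, H and P do not contribute to aligned blocks and are skipped
    blocks.append((size,))         # final block carries no trailing gap
    return blocks

if __name__ == "__main__":
    for block in cigar_to_chain_blocks("10M2D5M3I8M"):
        print(" ".join(str(v) for v in block))
```

A complete chain record would also need a header line (score, sequence names, sizes, strands, and start/end positions), which would have to come from the read-location file and the sequence lengths; this sketch covers only the block lines.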

Please don't hesitate to contact the mail list again if you have any 
further questions.

Katrina Learned
UCSC Genome Bioinformatics Group

David Garfield wrote, On 12/10/2010 10:00 AM:
> Hi folks,
>
> First off, many thanks for your earlier help in figuring out soft masking 
> (https://lists.soe.ucsc.edu/pipermail/genome/2010-November/024129.html). It 
> all worked out just fine. Now I have two (related) follow-up questions.
>
> PROJECT: I'm conducting some scans for selection on a mess of sea urchins. 
> One sea urchin (the reference) has an assembled genome. The others are 454 
> sequences. I'd like to generate .chain files so that I can use liftOver to 
> collect specified orthologous regions from the whole set of species.
>
> QUESTION 1: The first question is purely technical. The how-to page on 
> whole-genome alignments 
> (http://genomewiki.ucsc.edu/index.php/Whole_genome_alignment_howto) tells me 
> that lastz has replaced blastz. This is fine; however, it seems that there is 
> one significant difference between the two programs. While blastz (or the 
> wrapper Blastz) can produce .lav files when there are multiple sequences in 
> the target file, lastz cannot (or does not seem to be able to).
>
> How do y'all get around this limitation? Should I simply break apart the 
> genome such that each (reasonably sized) scaffold from the target genome has 
> its own file? Or do you use a different output format that can handle 
> multiple sequences in the target?
>
> QUESTION 2: For two of my sea urchin species, I've been given .gtf formatted 
> files that match each 454 read to a location within the target genome. Is 
> there a way to use this existing map to generate chain files? I haven't found 
> anything nearly as convenient for pulling out orthologous sequences for 
> multiple species as having chain files.
>
> Many thanks,
>
> David
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome