Hi Jennifer,

I would be grateful if you could give me more details about how to
remove unwanted species from the hg18 Conservation track; I have tried
many things without success. I can only get multiz17way from the Table
Browser, and when I tried to send the output to Galaxy, it failed. This
may be because the file is too large to load in Galaxy. In any case, I
would rather not use Galaxy, because I am a bit concerned about the
method it uses to remove species: it may simply remove lines from the
MAF without redoing the multiple alignment.
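
To make the concern concrete, here is a minimal sketch of what "simply
removing lines" from a MAF would amount to. It is only an illustration,
not Galaxy's or UCSC's actual method: the function name filter_maf is
hypothetical, and it assumes the standard MAF layout in which each "s"
line reads "s src start size strand srcSize text" with src of the form
species.chrom. No realignment happens here; rows are dropped and
columns left with only gaps are removed.

```python
def filter_maf(maf_text, keep):
    """Keep only rows whose species (the part of src before '.') is in `keep`.

    Hypothetical helper: it drops rows and all-gap columns, nothing more.
    The score on each "a" line is copied unchanged, so it no longer
    reflects the reduced block.
    """
    out = []
    a_line = None   # "a" line of the block currently being read
    rows = []       # split fields of that block's "s" lines

    def flush():
        kept = [f for f in rows if f[1].split(".")[0] in keep]
        if a_line is not None and kept:
            width = len(kept[0][6])
            # Drop columns that are gaps in every remaining row.
            cols = [i for i in range(width)
                    if any(f[6][i] != "-" for f in kept)]
            out.append(a_line)
            for f in kept:
                text = "".join(f[6][i] for i in cols)
                out.append(" ".join(f[:6] + [text]))
            out.append("")
        rows.clear()

    for line in maf_text.splitlines():
        if line.startswith("#"):
            out.append(line)          # pass header/comment lines through
        elif line.startswith("a"):
            flush()
            a_line = line
        elif line.startswith("s "):
            rows.append(line.split())
        elif not line.strip():
            flush()                   # a blank line ends a block
            a_line = None
        # "i", "e" and "q" lines are ignored in this sketch.
    flush()
    return "\n".join(out)
```

A call such as filter_maf(text, {"hg18", "mm9", "rn4", "canFam2"}) would
keep the four species of interest, but the remaining rows stay exactly
as they were aligned in the original 17-way alignment.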

Kind regards,
Yuan

On 23 Jun 2009, at 19:06, Jennifer Jackson wrote:

> Hi Yuan,
> The Conservation track in hg18 has control options that would allow  
> you to remove any species not in your set. This is a compound track  
> - meaning that conserved regions are a part of the sub-track set.
> Download/data access info & options:
> http://genome-test.cse.ucsc.edu/FAQ/FAQdownloads#download1 
> http://genome-test.cse.ucsc.edu/FAQ/FAQdownloads#download29
> http://genome-test.cse.ucsc.edu/FAQ/FAQtracks#tracks21
>
> If you still want to do this on your own, the Conservation track is  
> still a good reference. For each species, the methods we used are  
> outlined. Different alignment methods were used for different  
> species based on biological reasoning (evolutionary distance,
> genome quality, etc.). For details about each pairwise alignment,
> see the individual tracks in that species' genome browser and review the
> creation methods. Some of these may not be on the public server, but  
> on the test server at http://genome-test.cse.ucsc.edu/. Please note  
> that all tracks on the test server /that are not/ on the regular  
> public server have not undergone formal QA and may have sparse  
> methods - although you should be able to identify similar tracks on  
> the public server (in another species) with complete methods listed.  
> However - with your list of genomes - this should not be a problem.
>
> Some notes from a UCSC scientist who creates this type of data:
>> http://genomewiki.ucsc.edu/index.php/Whole_genome_alignment_howto
>>
>> That is a great write-up by a power-user who managed to sort of  
>> duplicate our process locally, for a small genome.
>>
>> And we should also stress upfront that the process requires big  
>> compute resources.  If they're working with vertebrate genomes,  
>> they should have access to a cluster with at least ~50 CPUs (more  
>> is better) and if mammalian genomes, at least a few hundred CPUs.   
>> Otherwise the compute time is prohibitive.
>> And we should probably tell them that we now use lastz, a greatly  
>> improved replacement for blastz.
>>
>> http://www.bx.psu.edu/miller_lab/dist/README.lastz-1.01.50/README.lastz-1.01.50.html
>>
>> Angie
>>
> Thanks,
> Jennifer Jackson
> UCSC Genome Bioinformatics Group
>
> Yuan Hao wrote:
>> Dear List,
>>
>> I would like to create my own multiple alignment file for hg18,
>> mm9, rn4 and canFam2 from the UCSC pairwise alignments using the
>> Multiz/TBA aligner. After reading widely, I still have several
>> questions, and I would much appreciate any light you could shed
>> on them:
>>
>> - Which aligner, Multiz or TBA, would be better if my purpose is to
>> study motif conservation in the final MAF?
>>
>> - Multiz/TBA takes pairwise alignments in .maf format, but from UCSC
>> I can only find pairwise alignments in .chain, .net or axtNet
>> format. The Kent source has programs for format conversion:
>> chainToAxt, netToAxt and axtToMaf. Which pairwise format should I
>> download to create the multiple alignment?
>>
>> - Is there anything else I have missed in this process?
>>
>> Thank you very much in advance!
>>
>> Kind regards,
>> Yuan
>> _______________________________________________
>> Genome maillist  -  [email protected]
>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>>
