Re: [Genome] Building your own multiple sequence alignment

Mary Goldman Wed, 22 Jun 2011 14:05:39 -0700

Hi Nimrod,

The "CDS FASTA alignment from multiple alignment" output option in the 
Table Browser outputs only the coding sequence. If you are looking 
for promoter, intron and UTR sequences as well, I would recommend using 
Galaxy or building your own alignment using the methods you described.
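
For the build-your-own route, the first step is turning each gene record
into coordinate sets for the regions of interest. The following is only a
minimal sketch, not an official UCSC tool. It assumes the usual refGene.txt
column order (a leading bin column, then name, chrom, strand, txStart, txEnd,
cdsStart, cdsEnd, exonCount, exonStarts, exonEnds) and 0-based, half-open
coordinates. The function names and the 1000-base upstream default are
hypothetical choices made for illustration.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[int, int]  # 0-based, half-open [start, end)


def parse_refgene_line(line: str) -> dict:
    """Parse one tab-separated refGene.txt row (assumed column order)."""
    f = line.rstrip("\n").split("\t")
    # exonStarts / exonEnds are comma-separated lists with a trailing comma
    starts = [int(x) for x in f[9].split(",") if x]
    ends = [int(x) for x in f[10].split(",") if x]
    return {
        "name": f[1], "chrom": f[2], "strand": f[3],
        "txStart": int(f[4]), "txEnd": int(f[5]),
        "cdsStart": int(f[6]), "cdsEnd": int(f[7]),
        "exons": list(zip(starts, ends)),
    }


def clip(iv: Interval, lo: int, hi: int) -> Optional[Interval]:
    """Return the part of iv inside [lo, hi), or None if they do not overlap."""
    s, e = max(iv[0], lo), min(iv[1], hi)
    return (s, e) if s < e else None


def clip_all(exons: List[Interval], lo: int, hi: int) -> List[Interval]:
    return [c for c in (clip(x, lo, hi) for x in exons) if c is not None]


def gene_regions(g: dict, upstream: int = 1000) -> Dict[str, List[Interval]]:
    """Split one transcript into CDS, UTR, intron and upstream intervals.

    All intervals are on the forward strand of the chromosome; only the
    5'/3' labels and the side of the upstream window depend on the strand.
    Entries with cdsStart == cdsEnd (non-coding) yield an empty CDS list
    and should be handled separately.
    """
    exons = g["exons"]
    cds = clip_all(exons, g["cdsStart"], g["cdsEnd"])
    left_utr = clip_all(exons, g["txStart"], g["cdsStart"])
    right_utr = clip_all(exons, g["cdsEnd"], g["txEnd"])
    # Introns are the gaps between consecutive exons
    introns = [(a[1], b[0]) for a, b in zip(exons, exons[1:]) if a[1] < b[0]]
    if g["strand"] == "+":
        utr5, utr3 = left_utr, right_utr
        promoter = (max(0, g["txStart"] - upstream), g["txStart"])
    else:
        utr5, utr3 = right_utr, left_utr
        # Not checked against the chromosome length here
        promoter = (g["txEnd"], g["txEnd"] + upstream)
    return {"cds": cds, "utr5": utr5, "utr3": utr3,
            "introns": introns, "promoter": [promoter]}


if __name__ == "__main__":
    # Hypothetical row, made up purely to show the call
    row = "\t".join(["0", "NM_EXAMPLE", "chr1", "+", "100", "1000", "150",
                     "900", "3", "100,400,800,", "200,500,1000,", "0",
                     "EXAMPLE", "cmpl", "cmpl", "0,1,2,"])
    for region, intervals in gene_regions(parse_refgene_line(row)).items():
        print(region, intervals)
```

The resulting intervals can be written out as BED lines for interval-based
tools, or used to cut the pairwise files directly. Before cutting, check the
coordinate convention of the alignment format: axt positions are documented
as 1-based, while the intervals above are 0-based, half-open.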


I hope this information is helpful.  Please feel free to contact the 
mail list again if you require further assistance.

Best,
Mary
------------------
Mary Goldman
UCSC Bioinformatics Group

On 6/20/11 5:31 PM, nimrod rubinstein wrote:
> On Mon, Jun 20, 2011 at 6:51 PM, nimrod rubinstein<[email protected]>wrote:
>
>> Hi Brooke,
>>
>> Thanks a lot for the answer, which as usual was clear and thorough.
>>
>> I hope you don't mind me asking one further question.
>> In addition to the protein coding sequences I'm also interested in creating
>> alignments of upstream (i.e., promoter), intron, and UTR sequences. Aside
>> from using Galaxy, should the way I suggested in my previous email (using
>> the gene coordinates to cut sequences from the pairwise alignments) work
>> for that purpose?
>>
>> Sorry I didn't fully explain this in my previous email.
>>
>> Thanks a lot,
>> Nimrod
>>
>>
>> On Tue, May 31, 2011 at 6:40 PM, Brooke Rhead<[email protected]>  wrote:
>>
>>> Hi Nimrod,
>>>
>>> You could create your own multiple sequence alignments, or you could just
>>> use the existing alignments and pull out only the species (and regions) you
>>> are interested in.
>>>
>>> If you want to create your own alignments, this page should be helpful:
>>> http://genomewiki.ucsc.edu/index.php/Whole_genome_alignment_howto
>>>
>>> There are a couple of tools that could help you extract what you want from
>>> existing alignments.  The first is the "CDS FASTA alignment from multiple
>>> alignment" output option in the Table Browser (
>>> http://genome.ucsc.edu/cgi-bin/hgTables).
>>> Select the RefSeq Genes track in hg19, and the CDS FASTA output option will
>>> become visible. After hitting "get output" you will see a page where you can
>>> select the organisms you want to include in your output.  See the user's
>>> guide for more info on this option:
>>> http://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html#FASTA
>>> One caveat to be aware of is that, if you do not select all species for
>>> output, there will be some columns in which every selected sequence shows
>>> only a "-".
>>>
>>> Another option is to use Galaxy (http://main.g2.bx.psu.edu/), which is
>>> run by our collaborators at Penn State and works in conjunction with the
>>> Genome Browser.  I have not personally used the tools there, but there are
>>> several that look like they might be useful to you -- see "Filter MAF blocks
>>> by Species," "Extract MAF blocks given a set of genomic intervals," and
>>> "Stitch Gene blocks given a set of coding exon intervals" on the left-hand
>>> side of the page under the "Fetch Alignments" header.  If you have questions
>>> about using Galaxy, their helpdesk address is [email protected].
>>>
>>> --
>>> Brooke Rhead
>>> UCSC Genome Bioinformatics Group
>>>
>>>
>>>
>>> On 05/27/11 09:49, nimrod rubinstein wrote:
>>>
>>>> Hi,
>>>>
>>>> I think my question is pretty trivial and has probably been raised many
>>>> times before, nevertheless I couldn't find a direct answer for it in the
>>>> archives.
>>>>
>>>> Anyway, I'm interested in building
>>>> Human-Chimp-Orangutan-Rhesus multiple sequence alignments for every human
>>>> refseq gene.
>>>> The way I thought of accomplishing this is to:
>>>> 1. Derive the coding sequence coordinates from the hg19 refGene file for
>>>> every human refseq gene.
>>>> 2. Get the sequences of human and each of the other organisms that map to
>>>> these coordinates from the syntenicNet pairwise alignment files
>>>> (e.g., chr1.hg19.panTro2.synNet.axt.gz).
>>>> 3. Combine these pairwise sequence files into multiple sequence files and
>>>> run my own multiple sequence alignment program.
>>>>
>>>> Does this make sense or is there any other better established way to do
>>>> that?
>>>>
>>>> Thanks a lot,
>>>> Nimrod Rubinstein
>>>> NESCent fellow
>>>> _______________________________________________
>>>> Genome maillist  -  [email protected]
>>>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>>>>