On Mon, Jun 20, 2011 at 6:51 PM, nimrod rubinstein <[email protected]> wrote:

> Hi Brooke,
>
> Thanks a lot for the answer, which as usual was clear and thorough.
>
> I hope you don't mind me asking one further question.
> In addition to the protein coding sequences I'm also interested in creating
> alignments of upstream (i.e., promoter), intron, and UTR sequences. Aside
> from using Galaxy, should the approach I suggested in my previous email (using
> the gene coordinates to cut sequences from the pairwise alignments) work
> for that purpose?
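As a rough illustration of the question above, here is a minimal Python sketch of how upstream, intron, and UTR intervals might be derived from one gene's coordinates before cutting them out of an alignment. The function name, the argument layout, and the 1000 bp upstream window are assumptions made for this example only; they do not come from any UCSC tool. Coordinates are taken to be 0-based, half-open, as in the refGene table.

```python
def noncoding_regions(strand, tx_start, tx_end, cds_start, cds_end, exons,
                      upstream=1000):
    """Derive upstream, intron, and UTR intervals for one coding transcript.

    exons is a list of (start, end) pairs, 0-based half-open.
    upstream is an arbitrary window size chosen for this sketch.
    """
    exons = sorted(exons)

    # Introns are the gaps between consecutive exons.
    introns = [(a_end, b_start)
               for (_, a_end), (b_start, _) in zip(exons, exons[1:])
               if a_end < b_start]

    # Exonic sequence outside the CDS, on the low- and high-coordinate side.
    left_utr = [(s, min(e, cds_start)) for s, e in exons if s < cds_start]
    right_utr = [(max(s, cds_end), e) for s, e in exons if e > cds_end]

    if strand == "+":
        utr5, utr3 = left_utr, right_utr
        promoter = (max(0, tx_start - upstream), tx_start)
    else:
        # On the minus strand the 5' end is at the high coordinate.
        # The window is not clipped to the chromosome length here.
        utr5, utr3 = right_utr, left_utr
        promoter = (tx_end, tx_end + upstream)

    return {"upstream": promoter, "introns": introns,
            "utr5": utr5, "utr3": utr3}


# Made-up coordinates, not a real gene.
regions = noncoding_regions("-", 100, 500, 150, 450,
                            [(100, 120), (200, 300), (400, 500)])
print(regions)
```

The sketch assumes a coding transcript; for a non-coding entry, where cdsStart equals cdsEnd, the UTR lists are not meaningful.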
>
> Sorry I didn't fully explain this in my previous email.
>
> Thanks a lot,
> Nimrod
>
>
> On Tue, May 31, 2011 at 6:40 PM, Brooke Rhead <[email protected]> wrote:
>
>> Hi Nimrod,
>>
>> You could create your own multiple sequence alignments, or you could just
>> use the existing alignments and pull out only the species (and regions) you
>> are interested in.
>>
>> If you want to create your own alignments, this page should be helpful:
>> http://genomewiki.ucsc.edu/index.php/Whole_genome_alignment_howto
>>
>> There are a couple of tools that could help you extract what you want from
>> existing alignments.  The first is the "CDS FASTA alignment from multiple
>> alignment" output option in the Table Browser (
>> http://genome.ucsc.edu/cgi-bin/hgTables).
>>  Select the RefSeq Genes track in hg19, and the CDS FASTA output option will
>> become visible. After hitting "get output" you will see a page where you can
>> select the organisms you want to include in your output.  See the user's
>> guide for more info on this option:
>> http://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html#FASTA
>> One caveat to be aware of is that, since not all species will be selected
>> for output, there will be some columns in which all of the alignments will
>> show only a "-".
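The all-gap columns mentioned in the caveat above can be removed after download. The following is a minimal Python sketch, assuming the aligned sequences for the selected species have already been read into a dictionary of equal-length strings; the function name and the example species labels are illustrative and are not part of the Table Browser output.

```python
def drop_all_gap_columns(aligned, gap="-"):
    """Remove alignment columns in which every sequence has a gap.

    aligned maps a sequence name to its aligned string; all strings
    must have the same length.
    """
    names = list(aligned)
    seqs = [aligned[n] for n in names]
    if not seqs:
        return {}
    if len({len(s) for s in seqs}) > 1:
        raise ValueError("aligned sequences must all have the same length")

    # Keep a column if at least one sequence has a non-gap character there.
    keep = [i for i in range(len(seqs[0]))
            if any(s[i] != gap for s in seqs)]

    return {n: "".join(s[i] for i in keep) for n, s in zip(names, seqs)}


# Toy alignment, not real sequence data.
toy = {"hg19": "ATG--A", "panTro2": "ATG--A", "rheMac2": "A-G--C"}
print(drop_all_gap_columns(toy))
```

Columns where only some species have a gap are kept, so the remaining sequences stay aligned to each other.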
>>
>> Another option is to use Galaxy (http://main.g2.bx.psu.edu/), which is
>> run by our collaborators at Penn State and works in conjunction with the
>> Genome Browser.  I have not personally used the tools there, but there are
>> several that look like they might be useful to you -- see "Filter MAF blocks
>> by Species," "Extract MAF blocks given a set of genomic intervals," and
>> "Stitch Gene blocks given a set of coding exon intervals" on the left-hand
>> side of the page under the "Fetch Alignments" header.  If you have questions
>> about using Galaxy, their helpdesk address is [email protected].
>>
>> --
>> Brooke Rhead
>> UCSC Genome Bioinformatics Group
>>
>>
>>
>> On 05/27/11 09:49, nimrod rubinstein wrote:
>>
>>> Hi,
>>>
>>> I think my question is pretty trivial and has probably been raised many
>>> times before; nevertheless, I couldn't find a direct answer to it in the
>>> archives.
>>>
>>> Anyway, I'm interested in building
>>> Human-Chimp-Orangutan-Rhesus multiple sequence alignments for every human
>>> refseq gene.
>>> The way I thought of accomplishing this is to:
>>> 1. Derive the coding sequence coordinates from the hg19 refGene file for
>>> every human refseq gene.
>>> 2. Get the sequences of human and each of the other organisms that map to
>>> these coordinates from the syntenicNet pairwise alignment files
>>> (e.g., chr1.hg19.panTro2.synNet.axt.gz).
>>> 3. Combine these pairwise sequence files into multi-sequence files and run
>>> my own multiple sequence alignment program.
>>>
>>> Does this make sense or is there any other better established way to do
>>> that?
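Step 1 of the plan above can be sketched in Python roughly as follows. This assumes the usual tab-separated refGene.txt layout (bin, name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds, ...) with 0-based, half-open coordinates; the function name and the example row are made up for illustration.

```python
def cds_intervals(refgene_line):
    """Return (chrom, strand, intervals) for the coding part of each exon.

    Assumes one tab-separated refGene.txt row with a leading bin column.
    """
    f = refgene_line.rstrip("\n").split("\t")
    chrom, strand = f[2], f[3]
    cds_start, cds_end = int(f[6]), int(f[7])

    # exonStarts and exonEnds are comma-separated lists with a trailing comma.
    exon_starts = [int(x) for x in f[9].rstrip(",").split(",")]
    exon_ends = [int(x) for x in f[10].rstrip(",").split(",")]

    intervals = []
    for s, e in zip(exon_starts, exon_ends):
        # Clip each exon to the CDS; exons entirely in a UTR drop out.
        s, e = max(s, cds_start), min(e, cds_end)
        if s < e:
            intervals.append((s, e))
    return chrom, strand, intervals


# Made-up row, not a real RefSeq entry.
example = "\t".join(["585", "NM_000000", "chr1", "+", "100", "500",
                     "150", "450", "3", "100,200,400,", "120,300,500,",
                     "0", "GENE1", "cmpl", "cmpl", "-1,0,1,"])
print(cds_intervals(example))
```

For the example row this gives the intervals (200, 300) and (400, 450) on chr1. As far as I know, axt files number positions from 1, so the intervals would need converting before cutting from the pairwise alignments; this is worth checking against the axt format description.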
>>>
>>> Thanks a lot,
>>> Nimrod Rubinstein
>>> NESCent fellow
>>> _______________________________________________
>>> Genome maillist  -  [email protected]
>>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>>>
>>
>
