Re: [Jalview-discuss] Phylogenetic trees calculated for specified alignment zones

Jim Procter Thu, 06 Nov 2014 03:30:17 -0800

Hi James.

Romain and Andreas have provided some great suggestions. I thought it worth adding that when using Jalview with web sites and other tools, the most convenient way to prepare selections for input (e.g. to a tree calculation) is to create a new view (Ctrl or Command + T), select all the columns you want to employ for the calculation, and use 'View->Hide->All but selected region' (Shift + Ctrl/Command + H) to hide everything that you *do not* want to employ for the tree calculation. Creating a view minimises the number of additional windows you create containing the same alignment data, and the input data view can be given its own name and archived in a Jalview project. Once you have some results from the other tools, you can then add them to the view and explore the alignment further via Jalview's subfamily shading methods (or even the Sequence Harmony service, since it's designed to predict functional site variation based on a set of defined subgroups on the alignment).

If you have any problems exporting alignment data from Jalview or importing the trees back into Jalview from RAxML or FastTree, send an email! We also now have PHYLIP format input/export support in the development versions of Jalview, which is useful when working with RAxML.
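Until a Jalview build with PHYLIP export is to hand, an alignment can be converted with a few lines of script. The following is a minimal sketch in plain Python, assuming the sequences are already aligned (all the same length) and that relaxed sequential PHYLIP is acceptable to the downstream tool. The function name and the example sequences are hypothetical and are not part of Jalview or RAxML.

```python
def to_relaxed_phylip(records):
    """Format (name, sequence) pairs as relaxed sequential PHYLIP text."""
    if not records:
        raise ValueError("no sequences given")
    length = len(records[0][1])
    for name, seq in records:
        if len(seq) != length:
            raise ValueError(f"{name} has length {len(seq)}, expected {length}")
    # Names must not contain whitespace in this format, so join the parts with "_".
    safe = [("_".join(name.split()), seq) for name, seq in records]
    width = max(len(name) for name, _ in safe)
    # Header line: number of sequences, then number of alignment columns.
    lines = [f"{len(safe)} {length}"]
    lines += [f"{name.ljust(width)}  {seq}" for name, seq in safe]
    return "\n".join(lines) + "\n"


# Hypothetical two-sequence alignment, for illustration only.
records = [("GPCR_A human", "MKT-LL"), ("GPCR_B", "MRTALL")]
phylip_text = to_relaxed_phylip(records)
print(phylip_text, end="")
```

The length check is there because a tree program will usually reject an alignment whose rows differ in length, and it is easier to catch that before submitting a long job.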


Jim.

PS. Just to expand on Romain's comment about accuracy: Jalview's tree algorithms are rigorous but relatively primitive, and not considered appropriate for general phylogenetic analysis tasks. Also, Jalview doesn't provide support for model selection (picking the right model to calculate the intersequence distances from the alignment) or bootstrapping (identifying the statistically significant branches in the tree). FastTree and RAxML both employ heuristic maximum likelihood searches to produce an accurate tree more quickly, and include approximate support calculations.
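To make the bootstrapping idea concrete, here is a minimal sketch in plain Python of the resampling step only: alignment columns are drawn with replacement to build pseudo-replicate alignments of the same size. In a full analysis a tree would be rebuilt from each replicate and the support for a branch would be the fraction of replicate trees that contain it. This is an illustration under those assumptions, not how FastTree or RAxML compute their support values; the function name and example sequences are hypothetical.

```python
import random


def bootstrap_replicate(sequences, rng):
    """Resample alignment columns with replacement, keeping rows in step."""
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must all have the same aligned length")
    # Draw as many column indices as the alignment has columns.
    picks = [rng.randrange(length) for _ in range(length)]
    # Apply the same picks to every row so each column stays intact.
    return ["".join(seq[i] for i in picks) for seq in sequences]


# Hypothetical three-sequence alignment, for illustration only.
alignment = ["MKTLLV", "MRTALV", "MKSALI"]
rng = random.Random(42)  # fixed seed so the replicates are reproducible
replicates = [bootstrap_replicate(alignment, rng) for _ in range(3)]
for replicate in replicates:
    print(replicate)
```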


On 06/11/2014 09:35, rstuder wrote:

Yes, trees in Jalview are calculated only based on marked positions.

For accuracy, I would not use phylogenetic tools from Jalview.

I would rather do the following:
1) Select the positions in Jalview.
2) Copy them.
3) Paste them as new alignment.
4) Save the alignment in a new file.

And then I use FastTree or RAxML (or even PhyML if you have access to a good computer cluster).
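Steps 1 to 4 above can also be scripted when the same selection has to be applied repeatedly. The following is a minimal sketch in plain Python, assuming the alignment is in FASTA format and the wanted columns are given as 1-based inclusive ranges. The function names, the ranges and the example sequences are hypothetical and are not part of Jalview.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def extract_columns(records, ranges):
    """Keep only the alignment columns in the given 1-based inclusive ranges."""
    return [
        (name, "".join(seq[start - 1:end] for start, end in ranges))
        for name, seq in records
    ]


def to_fasta(records):
    """Write (name, sequence) pairs back out as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)


# Hypothetical alignment; keep columns 2-4 and 8-10 only.
example = """>seq1
MKT-LLVAGG
>seq2
MRTALL-AGC
"""
subset = extract_columns(parse_fasta(example), [(2, 4), (8, 10)])
print(to_fasta(subset), end="")
```

The resulting FASTA text can be saved to a new file and passed to the tree program in place of the manually pasted alignment.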


Romain


On 05/11/2014 20:18, James Starlight wrote:

Hi Romain,

thank you very much for the explanation!

I've already used TrimAl as part of the Phylemon2 server and found it very useful :-)

Regarding calculating trees in Jalview using a subset of the residues: as I noticed, the tree is calculated on the subset only when the positions of the ligand-contacting residues are marked in red at the top, isn't it? Is it also possible to check, from the output, exactly which subset the tree was calculated on?

Finally, regarding the accuracy of the tree calculation: which method should produce the best results for an alignment consisting of several hundred sequences?


James

2014-11-05 16:59 GMT+01:00 rstuder <[email protected]>:


    Dear James,

    For aligning the sequences, I would recommend MAFFT (L-INS-i) or
    Clustal-Omega:
    http://mafft.cbrc.jp/alignment/software/
    http://www.clustal.org/omega/

    Then, you can use Jalview to select manually the ligand-binding
    domain in your alignment.

    You can also use TrimAl to select only the well-aligned positions that
    correspond to phylogenetic signal.
    http://trimal.cgenomics.org/

    For producing the tree on very big alignments, I would recommend
    FastTree. It produces quite good results and is very easy to use:
    http://www.microbesonline.org/fasttree/

    There is also RAxML which is developed for big alignments:
    http://sco.h-its.org/exelixis/web/software/raxml/index.html

    Best regards,
    Romain


    On 05/11/2014 15:06, James Starlight wrote:

    Dear JalView users!

    I need to perform a large-scale phylogenetic analysis of a big
    dataset of GPCR sequences, selecting as input only the residues
    involved in the ligand-binding site of those receptors (taken
    from the structural data) from the input multiple-sequence
    alignment. I wonder which method of phylogenetic tree
    calculation will be best for my task (neighbour joining or
    pairwise-distance calculations), as well as how to select
    the residues properly (previously I've
    done it by Ctrl + left-click at the bottom of the alignment,
    marking the corresponding zone in red). What additional
    details should I pay attention to during such calculations
    when I'm dealing with a very large number of sequences?


    Thank you for the help,

    James


    _______________________________________________
    Jalview-discuss mailing list
    [email protected]  <mailto:[email protected]>
    http://www.compbio.dundee.ac.uk/mailman/listinfo/jalview-discuss

    --
    Romain Studer

    EMBL-EBI
    Wellcome Trust Genome Campus
    Hinxton
    Cambridgeshire
    CB10 1SD, UK
    Tel: +44 (0)1223 492 547
    Twitter: @RomainStuder



