Hello,

Other types of statistics can be generated with the "Join, Subtract and Group -> Group" tool, by adding one extra file manipulation step first.

First run "Compute sequence length". Next run "Add column to an existing dataset" and set this to be the same value for all rows, "1" or something else simple. Next, run "Group", with a group on the column containing the "Add column" value, and then click on "Add new Operation". Set "Type" to be "Mode" and "On column" to be the column containing the sequence length.
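
For anyone who wants to check the result outside Galaxy, the same calculation can be sketched in a few lines of Python. This is only an illustration of the idea (length per sequence, then the most frequent length), not what the Galaxy tools run internally. The function names sequence_lengths and length_modes and the sample reads are made up for the example.

```python
from collections import Counter
from io import StringIO


def sequence_lengths(handle):
    """Yield the length of each sequence in a FASTA stream.

    Hypothetical helper: plays the role of "Compute sequence length".
    """
    length = None
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            # Sequence lines may be wrapped, so add up every line.
            length += len(line)
    if length is not None:
        yield length


def length_modes(lengths):
    """Return the most frequent length(s), sorted; ties are all kept.

    Hypothetical helper: plays the role of the "Group" step with "Mode".
    """
    counts = Counter(lengths)
    if not counts:
        return []
    top = max(counts.values())
    return sorted(value for value, n in counts.items() if n == top)


# Made-up sample data standing in for a real 454 FASTA file.
example = StringIO(">r1\nACGT\n>r2\nACGTAC\nGT\n>r3\nACGT\n")
lengths = list(sequence_lengths(example))
modes = length_modes(lengths)
print(lengths, modes)
```

To use this on a real file, pass an open file handle instead of the StringIO object. The Counter built inside length_modes is also the read length distribution itself, if the full table is wanted rather than only the mode.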

Very large files, usually when run with multiple operations, are occasionally too large to run with this tool on the public main Galaxy instance. If this occurs, the first option is to simplify the query. If that doesn't work, then moving to a cloud instance would be the recommendation.

Good luck with your project,

Jen
Galaxy team

On 3/15/12 3:17 PM, Elad Firnberg wrote:
Hi Jen,

Thank you, this was very helpful. Is there a way to get some more
statistical information such as the mode of read lengths? The summary
statistics tool only seems to provide the mean.

Thank you,
Elad


On Thu, Mar 15, 2012 at 3:02 PM, Jennifer Jackson <j...@bx.psu.edu
<mailto:j...@bx.psu.edu>> wrote:

    Hi Elad,

    Start with the tool "FASTA manipulation -> Compute sequence length"
    to generate a length value for each sequence.

    Next, use "Statistics -> Summary Statistics for any numerical
    column" on the result to generate specific R function statistics
    (see the tool form for which and how to enter the expression).

    To visualize the distribution, use "Graph/Display Data -> Histogram
    of a numeric column".

    Hopefully this helps.

    Best,

    Jen
    Galaxy team


    On 3/15/12 9:24 AM, Elad Firnberg wrote:

        Hi,

        Is there a tool or easy way to obtain a read length distribution on a
        set of 454 reads in fasta format? I can't seem to find such a tool in
        Galaxy.

        Thank you,
        Elad





        _____________________________________________________________
        The Galaxy User list should be used for the discussion of
        Galaxy analysis and other features on the public server
        at usegalaxy.org.  Please keep all replies on the list by
        using "reply all" in your mail client.  For discussion of
        local Galaxy instances and the Galaxy source code, please
        use the Galaxy Development list:

        http://lists.bx.psu.edu/listinfo/galaxy-dev

        To manage your subscriptions to this and other Galaxy lists,
        please use the interface at:

        http://lists.bx.psu.edu/




--
Chemical and Biomolecular Engineering
Johns Hopkins University
3400 N. Charles St. Baltimore, MD 21218
Tel# (410) 516-3937