Hello,
Other types of statistics can be generated with the "Join, Subtract and
Group -> Group" tool, after one additional file manipulation step.
First, run "Compute sequence length". Next, run "Add column to an existing
dataset" and set it to the same value for all rows, "1" or
something else simple, so that every row falls into a single group.
Next, run "Group", grouping on the column containing the "Add column"
value, and then click on "Add new Operation". Set "Type" to "Mode" and
"On column" to the sequence length column.
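
For anyone who wants to check the result outside Galaxy, the grouping trick described above can be sketched in a few lines of Python. This is only an illustration of the idea (constant column, one group, mode of the lengths), not how the Group tool is implemented; the rows and variable names are made up, and statistics.multimode needs Python 3.8 or later.

```python
from collections import defaultdict
from statistics import multimode

# Hypothetical output of "Compute sequence length": (sequence id, length)
rows = [("read1", 8), ("read2", 4), ("read3", 8), ("read4", 5)]

# "Add column": append the same constant value to every row
with_constant = [(seq_id, length, 1) for seq_id, length in rows]

# "Group" on the constant column, collecting the length column
groups = defaultdict(list)
for seq_id, length, key in with_constant:
    groups[key].append(length)

# "Mode" per group; multimode returns every tied value as a list
modes = {key: multimode(lengths) for key, lengths in groups.items()}
print(modes)
```

Because every row carries the same key, there is exactly one group, so the mode is taken over the whole file.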
Very large files, usually when run with multiple operations, are
occasionally too large to run with this tool on the public main
Galaxy instance. If this occurs, the first option is to simplify the
query. If that doesn't work, the recommendation is to move to a cloud
instance.
Good luck with your project,
Jen
Galaxy team
On 3/15/12 3:17 PM, Elad Firnberg wrote:
Hi Jen,
Thank you, this was very helpful. Is there a way to get some more
statistical information such as the mode of read lengths? The summary
statistics tool only seems to provide the mean.
Thank you,
Elad
On Thu, Mar 15, 2012 at 3:02 PM, Jennifer Jackson <j...@bx.psu.edu> wrote:
Hi Elad,
Start with the tool "FASTA manipulation -> Compute sequence length"
to generate a length value for each sequence.
Next, use "Statistics -> Summary Statistics for any numerical
column" on the result to generate statistics from specific R functions
(see the tool form for which functions are available and how to enter
the expression).
To visualize the distribution, use "Graph/Display Data -> Histogram
of a numeric column".
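
As a rough cross-check outside Galaxy, the same two steps (a length per sequence, then a distribution of those lengths) can be sketched in plain Python. This is a minimal sketch, not the code behind the Galaxy tools; the function names and the example reads are hypothetical, and it assumes a plain FASTA layout where header lines start with ">".

```python
from collections import Counter

def fasta_lengths(lines):
    """Yield (sequence_id, length) for each record in FASTA-formatted lines."""
    seq_id, length = None, 0
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, length
            header = line[1:].split()
            seq_id, length = (header[0] if header else ""), 0
        elif seq_id is not None:
            # Sequence data may span several lines; sum them up
            length += len(line)
    if seq_id is not None:
        yield seq_id, length

def length_distribution(lines):
    """Count how many sequences have each length."""
    return Counter(length for _, length in fasta_lengths(lines))

# Made-up example records standing in for a real FASTA file
example = """>read1
ACGTAC
GT
>read2
ACGT
>read3
ACGTACGT
""".splitlines()

dist = length_distribution(example)
for length in sorted(dist):
    # Crude text histogram: one '#' per sequence of that length
    print(length, "#" * dist[length])
```

For a real file, passing an open file handle instead of the example list should work the same way, since the function only iterates over lines.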
Hopefully this helps.
Best,
Jen
Galaxy team
On 3/15/12 9:24 AM, Elad Firnberg wrote:
Hi,
Is there a tool or easy way to obtain a read length distribution
on a set of 454 reads in fasta format? I can't seem to find such a
tool in Galaxy.
Thank you,
Elad
_____________________________________________________________
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org. Please keep all replies on the list by
using "reply all" in your mail client. For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists,
please use the interface at:
http://lists.bx.psu.edu/
--
Chemical and Biomolecular Engineering
Johns Hopkins University
3400 N. Charles St. Baltimore, MD 21218
Tel# (410) 516-3937