Hello Sam,

When running Megablast, filtering by identity or e-value can help reduce the number of hits. The default values are all fairly permissive if you are aligning the query against a target genome from the same species and the query has already been filtered for base-calling quality. As a guess, given the number of hits generated from your initial data, filtering out low-complexity sequence would also be a big help.
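
If you download the tabular hits, the same kind of identity/e-value filtering can also be done outside Galaxy. The following is only a minimal sketch: it assumes the standard 12-column BLAST tabular layout (percent identity in the 3rd column, e-value in the 11th), which may not match the columns of your Galaxy Megablast dataset, and the function name and thresholds are made up for illustration.

```python
import csv

# ASSUMPTION: 0-based column positions for the standard 12-column BLAST
# tabular layout. Adjust these to match your own dataset.
PIDENT_COL = 2   # percent identity
EVALUE_COL = 10  # e-value


def filter_hits(lines, min_identity=95.0, max_evalue=1e-10):
    """Yield hits (as lists of strings) that pass both thresholds.

    `lines` is any iterable of tab-separated text lines, for example an
    open file handle. The default thresholds are illustrative only.
    """
    for row in csv.reader(lines, delimiter="\t"):
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        identity = float(row[PIDENT_COL])
        evalue = float(row[EVALUE_COL])
        if identity >= min_identity and evalue <= max_evalue:
            yield row
```

Usage would be along the lines of `with open("hits.tabular") as fh: kept = list(filter_hits(fh))`, where the file name is a placeholder.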

There is also the "Parse blast XML output" tool. Converting the data into interval format would allow the use of "Operate on Genomic Intervals -> Cluster the intervals of a dataset". This approach is based on coverage, which helps if coverage is one of your criteria (it could be, if your identity threshold defines a range of hits that you would all consider candidates for "best"). Identity and coverage are commonly combined to identify the "best" hit, but this is just a suggestion. The same logic could be applied to the top-scoring e-value matches combined with coverage (the result would likely be similar to using e-value alone if the identity threshold is set high).
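
To make the "identity plus coverage" idea concrete, here is a rough sketch of picking one hit per query from already-parsed rows. It is not a Galaxy tool or the team's method, just one possible ranking. It assumes the standard 12-column BLAST tabular layout, uses alignment length as a stand-in for coverage (the query length is not in that table), and ranks by identity first, then alignment length, then lowest e-value. The function name and column defaults are hypothetical.

```python
def best_hit_per_query(rows, query_col=0, pident_col=2, length_col=3, evalue_col=10):
    """Return a dict mapping each query id to its single best row.

    ASSUMPTIONS: rows are lists of strings in the standard 12-column BLAST
    tabular layout, and "best" means highest percent identity, then longest
    alignment, then lowest e-value. Ties keep the first hit seen.
    """
    best = {}
    for row in rows:
        query = row[query_col]
        # Larger key is better; the e-value is negated so smaller wins.
        key = (
            float(row[pident_col]),
            int(row[length_col]),
            -float(row[evalue_col]),
        )
        if query not in best or key > best[query][0]:
            best[query] = (key, row)
    return {query: row for query, (key, row) in best.items()}
```

Swapping the order of the values in `key` changes the definition of "best", for example putting the e-value first if that matters most for your data.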

The idea of adding a filter for "single best" is a good one, but it has some complexity associated with it. I will pass it along to the team as an enhancement request to consider.

Hopefully this helps!

Jen
Galaxy team

On 4/11/11 1:43 PM, Hsin-l (Sam) Chiang wrote:
Hi,

I used the Megablast function (in the NGS: Mapping\ROCHE-454\) to
analyze my FASTA sequences against nt database and it worked fine for
me. However, it generated 56,804 hits although my query has only 1000
sequences. I am wondering whether there is any way to limit the number of
reported alignments to just one best hit per sequence. (In the local
BLAST there are parameters such as -K1 -v 1 -b 1 to do so, but I can't
find similar options in Galaxy).

Many thanks!

Sam
___________________________________________________________
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using "reply all" in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

   http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

   http://lists.bx.psu.edu/

--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org