On Mar 21, 2011, at 1:45 PM, JASON G. BANKERT wrote:
We're trying to only get hits of certain lengths. Is there a
setting to use that sets the minimum length for each hit?
The short answer is no, but I expect there are other tools in Galaxy
that could do that filtering.
There are two reasons lastz doesn't provide filtering based on
length. First, there are three possible interpretations of what
"length" is, all equally valid. Should it be the length of the hit in
the reference, or in the read? Or should it be the number of
positions in the alignment? Second, even if there is no difference in
the three lengths, length is a poorer discriminator than the number of
matches. For example, a strict length cutoff of 100 would reject an
exact match of length 99 but keep a 90-match-10-mismatch hit of length 100.
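
To make the three interpretations concrete, here is a minimal Python sketch.
It is only an illustration, not anything lastz does internally. It assumes
the alignment is given as two gapped rows of equal length with "-" as the
gap character; the function name alignment_lengths is hypothetical.

```python
def alignment_lengths(ref_row, read_row, gap="-"):
    """Return (reference length, read length, alignment columns, matches).

    Both rows are gapped alignment text of equal length. Characters are
    compared exactly as given (no case folding).
    """
    if len(ref_row) != len(read_row):
        raise ValueError("alignment rows must have equal length")
    # Length of the hit in the reference: non-gap characters in its row.
    ref_len = sum(1 for c in ref_row if c != gap)
    # Length of the hit in the read: non-gap characters in its row.
    read_len = sum(1 for c in read_row if c != gap)
    # Number of positions (columns) in the alignment, gaps included.
    columns = len(ref_row)
    # Number of columns where both rows carry the same non-gap character.
    matches = sum(1 for a, b in zip(ref_row, read_row) if a == b and a != gap)
    return ref_len, read_len, columns, matches


# One gapped alignment, three different "lengths".
gapped = alignment_lengths("ACG-TACGT", "AC-GTA--T")

# The example from the text: an exact match of length 99 versus
# a 90-match-10-mismatch hit of length 100.
exact_99 = alignment_lengths("A" * 99, "A" * 99)
noisy_100 = alignment_lengths("A" * 100, "A" * 90 + "C" * 10)
```

With a length cutoff of 100, exact_99 is rejected and noisy_100 is kept.
With a cutoff on matches instead (say 95), the outcome is reversed.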
I'm not familiar enough with Galaxy to give you specific details of
how to filter by length. But if you choose tabular output from lastz
you should be able to use Galaxy's "text manipulation" tools to
compute the length, then one of the "filter and sort" tools to discard
short alignments. Or, if you are using SAM output, it looks like you
could use "convert SAM to interval" in the "NGS: SAM Tools" group,
then compute the length and filter as above.
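
Outside Galaxy, the same compute-then-filter idea could be sketched in
Python roughly as follows. This is a sketch under assumptions, not a
description of any Galaxy tool: it assumes tab-delimited text whose first
line is a header (optionally starting with "#"), that the columns are named
zstart1, end1 and nmatch, and that zstart1/end1 are zero-based, half-open
coordinates so that end minus start gives the length. The function names
are hypothetical; adjust the column names to whatever your output contains.

```python
import csv


def read_rows(text):
    """Parse tab-delimited text with a header line into a list of dicts."""
    lines = text.splitlines()
    # Assumed: the header may be prefixed with '#'; strip it if present.
    header = lines[0].lstrip("#").split("\t")
    return list(csv.DictReader(lines[1:], fieldnames=header, delimiter="\t"))


def reference_length(row):
    """Length of the hit in the reference (assumes half-open coordinates)."""
    return int(row["end1"]) - int(row["zstart1"])


def keep_if(rows, measure, minimum):
    """Keep rows for which measure(row) is at least `minimum`."""
    return [row for row in rows if measure(row) >= minimum]


# Made-up sample data, for illustration only.
sample = (
    "#name1\tzstart1\tend1\tname2\tnmatch\n"
    "chr1\t100\t199\tread1\t99\n"
    "chr1\t500\t600\tread2\t90\n"
)
rows = read_rows(sample)

# Filter on length in the reference ...
by_length = keep_if(rows, reference_length, 100)
# ... or on the number of matches.
by_matches = keep_if(rows, lambda row: int(row["nmatch"]), 95)
```

On this sample the length filter keeps only read2 (length 100, 90 matches),
while the match filter keeps only read1 (length 99, 99 matches).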
Hope that is helpful,