Re: [galaxy-user] Lastz very slow

2011-12-14 Thread Bob Harris

Howdy, Andrew,

One possibility is that there are more jobs on the cluster.  I'm not  
familiar enough with the galaxy interface to know if there's an easy  
way to tell how much time the job sits on the cluster's queue waiting  
and how much time the job is actually running.


Assuming there's no cluster difference, are the data conditions  
similar between what you are mapping now vs two weeks ago?  I.e.  
number and length of reads, size of reference sequence, divergence,  
repeat content?
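
One way to compare the data conditions between two runs is to tabulate the number and length of sequences in each FASTA file. The following is a minimal sketch in Python, not part of Galaxy or lastz; the function names fasta_lengths and summarize are hypothetical, and it only covers read count and length, not divergence or repeat content.

```python
from statistics import mean


def fasta_lengths(lines):
    """Return the length of every sequence found in FASTA-formatted lines."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def summarize(lines):
    """Summarize sequence count and length so two inputs can be compared."""
    lengths = fasta_lengths(lines)
    return {
        "sequences": len(lengths),
        "total_bases": sum(lengths),
        "mean_length": mean(lengths) if lengths else 0,
        "max_length": max(lengths, default=0),
    }
```

Running summarize on the reads from each run (for example, summarize(open(path)) with your own file path) gives figures that can be compared side by side.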


Bob H


On Dec 14, 2011, at 10:10 AM, Andrew South wrote:

Hi - is there a reason Lastz is very slow right now? I am mapping  
2,000-3,000 reads against a single 11 kb sequence and find that jobs  
are either returning an error or taking 16-24 hours to finish. Two  
weeks ago the same kind of job took about 30 minutes. Is there a way  
to speed this up? Thanks in advance for any help. Andy




Please consider the environment. Do you really need to print this  
email?


The University of Dundee is a registered Scottish charity, No:  
SC015096


___
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using "reply all" in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

 http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

 http://lists.bx.psu.edu/



[galaxy-user] Lastz very slow

2011-12-14 Thread Andrew South
Hi - is there a reason Lastz is very slow right now? I am mapping 2,000-3,000 
reads against a single 11 kb sequence and find that jobs are either returning 
an error or taking 16-24 hours to finish. Two weeks ago the same kind of job 
took about 30 minutes. Is there a way to speed this up? Thanks in advance for 
any help. Andy
 
 
 

Re: [galaxy-user] Lastz

2011-11-14 Thread Nate Coraor
On Nov 12, 2011, at 6:30 AM, Andrew South wrote:

> Dear Nate -
>  
>  It seems I may have upset the cluster again: I have 3 Lastz jobs running and 
> cannot upload any more data. I had been uploading fine; started 3 lastz 
> mapping jobs (Fasta (1 sequence) vs Fasta (~3000 sequences) output as SNP, 
> when I realised I made a mistake, cancelled these jobs, uploaded the correct 
> files and then ran the 3 comparisons (similar files) again and now these jobs 
> have been running >1hr and I cannot upload any more data.

Hi Andy,

What are the details related to being unable to upload?  Are the datasets 
sitting in a certain state (uploading, queued, etc.)?  Did you have any other 
jobs running at the time?  There are some concurrent job limits that could have 
been responsible.

--nate




Re: [galaxy-user] Lastz

2011-11-12 Thread Andrew South
Dear Nate - 
 
 It seems I may have upset the cluster again: I have 3 Lastz jobs running and 
cannot upload any more data. I had been uploading fine and started 3 Lastz 
mapping jobs (FASTA with 1 sequence vs FASTA with ~3000 sequences, output as 
SNP). When I realised I had made a mistake, I cancelled these jobs, uploaded 
the correct files and ran the 3 comparisons (similar files) again. These jobs 
have now been running for more than an hour and I cannot upload any more data.
 
Thanks in advance,

Andy
 
 



Re: [galaxy-user] Lastz

2011-11-11 Thread Nate Coraor
On Nov 11, 2011, at 10:01 AM, Andrew South wrote:

> Thanks Nate, hope it's a quick fix. Best wishes, Andy

Hi Andy,

All backlogged NGS jobs should now be running. New ones will likely queue until 
free slots are available.  Please let us know if there's further trouble, and 
thanks for using Galaxy,

--nate




Re: [galaxy-user] Lastz

2011-11-11 Thread Nate Coraor
On Nov 11, 2011, at 9:21 AM, Andrew South wrote:

> Hello folks
>  
>  Anyone else having trouble with running Lastz to map?
>  
>  Jobs are being sent but not running.
>  
>  It stopped working for me two days ago after working perfectly, I've tried 
> fiddling with the formats but no joy.

Hi Andy,

It looks like there's a problem with the cluster that runs our NGS jobs.  I'm 
currently looking into it.  Sorry for the inconvenience.

--nate




[galaxy-user] Lastz

2011-11-11 Thread Andrew South
Hello folks
 
 Anyone else having trouble with running Lastz to map? 
 
 Jobs are being sent but not running. 
 
 It stopped working for me two days ago after working perfectly. I've tried 
fiddling with the formats but no joy.
 
Thanks,

Andy
 
 
Dr A P South 
Centre for Oncology and Molecular Medicine
University of Dundee
Ninewells Hospital & Medical School
Dundee
DD1 9SY
Tel 01382 496432
Fax 01382 633952
 
a.p.so...@dundee.ac.uk 
 
 
 

Re: [galaxy-user] LASTZ arguments in "Full parameter list"-mode

2011-06-07 Thread Appelt, Uwe
Thanks, Bob, for your reply and also for pointing to the correct
chapters in the lastz-readme! Having said that, it's great to hear
the wrapper is scheduled to be extended/improved.

> It's not clear whether the wrapper would be better if it allowed
> each scoring option to be set as a separate field, or if they were
> incorporated in a file, or if (somehow) they were one of your
> galaxy history items. Do you have any thoughts on that?

I'm a bit torn. On the one hand, it'd be truly convenient to
have all arguments specified just by submitting a "lastz-config"
history item to the wrapper. On the other hand, I assume most
galaxy users are sticking to YASRA modes and only those with special
needs would go for custom configs. I, for example, will have to
include lastz mapping in a workflow and will save my desired
lastz config within the corresponding workflow item. In this regard, it
wouldn't be too nice if I had to have a config file handy (anyone
else willing to execute this workflow would need the config file as
well, right?!).

Thanks again!
Uwe


Re: [galaxy-user] LASTZ arguments in "Full parameter list"-mode

2011-06-06 Thread Bob Harris

Howdy, Uwe,

(I don't speak for the galaxy group, but I watch the list for lastz- 
related messages.)


> Is it possible to specify more arguments in "Full parameter list" mode,
> particularly those that are implicitly described in YASRA mode
> (match/mismatch rewards and step size (Z/step))?


As you indicate, the galaxy wrapper for lastz only provides access to  
certain combinations of arguments.  At present, the only ways I know  
that you would be able to access other lastz parameters would be  
either to (1) write your own custom galaxy wrapper or (2) install  
lastz and run it from the command line.


Kelly Vincent is planning to update the galaxy wrapper some time this  
summer, so this is something she's likely to address with that update.


> However, heading to "Full parameter list" mode leads to excessive
> CPU use, because (presumably) the step size (default=1?) is left
> unset.


You are correct that the default step is 1.  (Defaults are shown at  
the bottom of each command section in the lastz readme file, which can  
be found at www.bx.psu.edu/~rsharris/lastz or at the Miller Lab  
website.)  Depending on your data, a step of 1 may be overly sensitive,  
causing lastz to spend more time than necessary.
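
To give a rough feel for why the step matters, the sketch below counts how many word start positions in the target are considered when only every step-th position is sampled. This is a simplified illustration in Python, not lastz's actual seeding code; the function name seed_positions and the word length of 19 used in the example are assumptions for illustration.

```python
def seed_positions(target_length, word_length, step=1):
    """Count the target word start positions sampled with a given step.

    Simplified model: words start at 0, step, 2*step, ... as long as the
    whole word still fits inside the target.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    if target_length < word_length:
        return 0
    last_start = target_length - word_length
    return last_start // step + 1


# Hypothetical example: a 1,000-base target and a 19-base word.
every_position = seed_positions(1000, 19, step=1)   # 982 positions
every_twentieth = seed_positions(1000, 19, step=20)  # 50 positions
```

Under this model a step of 20 examines roughly one twentieth of the positions that a step of 1 does, at the cost of possibly missing seeds that only a finer step would catch.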


> Also, it'd be great to be able to alter not only gap penalties, but
> match/mismatch rewards as well. (Obviously a mismatch would cause
> the same problem).



Ideally I think the wrapper interface should allow you to point at a  
lastz scoring file, which can contain all the scoring parameters.   
This is an oversight-- the thinking when the wrapper was written was  
that the yasra settings would be sufficient (and work well) and would  
simplify the user's choices, making it more likely that the user would  
choose settings appropriate for their data.


I've discussed this with Kelly some, this morning.  It's not clear  
whether the wrapper would be better if it allowed each scoring option  
to be set as a separate field, or if they were incorporated in a file,  
or if (somehow) they were one of your galaxy history items.  Do you  
have any thoughts on that?


> I'm dealing with 454 reads and it's crucial to the scenario to have
> the 5' ends aligned properly (in terms of sensitivity), so all the
> YASRA templates comprising heavy gap penalties perform fairly poorly
> as soon as there's a gap near the 5' end.


The idea behind having such severe gap penalties is that 454 often  
incorrectly calls the length of homopolymer runs, introducing what  
will look like short gaps.  As used within yasra (an assembler from  
the Miller Lab, not in galaxy) these settings are probably  
appropriate.  But this is not an ideal general solution because it can  
keep us from discovering true gaps.  A better solution would probably  
be to have less-severe gap penalties, with an additional context- 
related gap penalty (or reward) for gaps at homopolymer runs.   
However, it would be costly to add this to the alignment core inside  
lastz.
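
The context-related gap penalty idea can be sketched as follows. This is purely a hypothetical illustration in Python and is not implemented in lastz; the function names, the penalty values (100 and 25) and the minimum run length of 3 are all assumptions chosen for the example.

```python
def homopolymer_run_length(seq, pos):
    """Length of the run of identical bases in seq that contains index pos."""
    base = seq[pos]
    start = pos
    while start > 0 and seq[start - 1] == base:
        start -= 1
    end = pos
    while end + 1 < len(seq) and seq[end + 1] == base:
        end += 1
    return end - start + 1


def gap_penalty(seq, pos, base_penalty=100, homopolymer_penalty=25, min_run=3):
    """Hypothetical penalty for a one-base gap opposite seq[pos].

    A gap inside a homopolymer run of at least min_run bases is charged the
    smaller penalty, since 454 often miscalls the length of such runs.
    """
    if homopolymer_run_length(seq, pos) >= min_run:
        return homopolymer_penalty
    return base_penalty
```

For the sequence ACGTTTTAC, a gap opposite one of the T bases would be charged 25 under these assumed values, while a gap opposite the C at index 1 would be charged 100.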


The problem with gaps or mismatches close to the end of a read is  
discussed in the lastz readme file, in the section "Y-drop Mismatch  
Shadow".  The situation can be improved by using the --noytrim  
option.  This option was added to lastz after the wrapper was written,  
and so is not currently available from galaxy.  --noytrim tells lastz  
to accept a lower-than-maximal-scoring alignment if it can reach the  
end of the read.  I intend to add that into the yasra settings, but  
changing those raises an issue of backward compatibility that I need  
to resolve.
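
The effect described here can be illustrated with a toy scoring example. The sketch below is not lastz's extension algorithm; it only shows, under assumed scores (match +1, mismatch -5) and hypothetical function names, how trimming to the best-scoring prefix stops short of a mismatch near the end of a read, while accepting the end of the read yields a longer but lower-scoring alignment.

```python
def prefix_scores(columns, match=1, mismatch=-5):
    """Running score after each alignment column ('=' match, 'x' mismatch)."""
    total = 0
    scores = []
    for col in columns:
        total += match if col == "=" else mismatch
        scores.append(total)
    return scores


def best_prefix_end(columns, match=1, mismatch=-5):
    """Return (number of columns kept, score) for the best-scoring prefix."""
    scores = prefix_scores(columns, match, mismatch)
    if not scores:
        return 0, 0
    best = max(scores)
    return scores.index(best) + 1, best


# 20 matches, one mismatch, then 2 matches up to the end of the read.
columns = "=" * 20 + "x" + "=" * 2
trimmed = best_prefix_end(columns)                    # stops before the mismatch
untrimmed = (len(columns), prefix_scores(columns)[-1])  # reaches the read end
```

With these assumed scores the trimmed alignment covers 20 columns with score 20, while the alignment that reaches the end of the read covers all 23 columns with score 17.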


I hope that helps.  Please post a reply if I've left something  
unanswered or if you have other thoughts on this.


Bob H



[galaxy-user] LASTZ arguments in "Full parameter list"-mode

2011-06-06 Thread Appelt, Uwe
Hi @All,

short version:
Is it possible to specify more arguments in "Full parameter list" mode, 
particularly those that are implicitly described in YASRA mode (match/mismatch 
rewards and step size (Z/step))?

long version:
I'm dealing with 454 reads and it's crucial to the scenario to have the 5' ends 
aligned properly (in terms of sensitivity), so all the YASRA templates 
comprising heavy gap penalties perform fairly poorly as soon as there's a gap 
near the 5' end. However, heading to "Full parameter list" mode leads to 
excessive CPU use, because (presumably) the step size (default=1?) is left 
unset. Also, it'd be great to be able to alter not only gap penalties, but 
match/mismatch rewards as well. (Obviously a mismatch would cause the same 
problem).

Examples:
AA (target)
AT (query)

=> I need the alignment boundary to be found at position 1, rather than behind 
the mismatched T, and this clearly doesn't work as long as the mismatch penalty 
is too large (the same applies to gap-open/extend penalties).

Thanks in advance and Cheers,
Uwe


Re: [galaxy-user] LASTZ: Controlling Length of Hits

2011-03-22 Thread Bossers, Alex
Hi,
there is an example in the windshield splatter analysis using galaxy where, for 
metagenomics, they filter their data on hit length relative to the original 
length of each individual subject sequence in megablast.
In simple steps (from memory, so don't shoot me :) ), all in galaxy and assuming 
fasta input:
1) upload fasta
2) compute sequence lengths on 1
3) on set 1 perform megablast (or whatever) that gives a hit length
4) combine 2 and 3 on the basis of unique sequence name
5) use the filter tool to filter on the hit-length column divided by the 
original-length column (in the example, > 50% hit length)
6) strip the additional length columns to return a valid megablast or lastz 
file
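
Outside galaxy, the join-and-filter logic of steps 2 to 5 can be sketched in a few lines of Python. This is only an illustration of the idea, not the galaxy tools themselves; the function name filter_by_coverage and the data layout (a dict of sequence lengths and a list of name/hit-length pairs) are assumptions.

```python
def filter_by_coverage(seq_lengths, hits, min_fraction=0.5):
    """Keep hits whose length exceeds min_fraction of the original sequence.

    seq_lengths: dict mapping sequence name to its full length (step 2).
    hits: list of (sequence name, hit length) tuples (step 3).
    """
    kept = []
    for name, hit_length in hits:
        full_length = seq_lengths.get(name)  # step 4: join on sequence name
        if not full_length:
            continue  # no length known for this name, so skip it
        if hit_length / full_length > min_fraction:  # step 5: ratio filter
            kept.append((name, hit_length))
    return kept
```

For example, with lengths {"r1": 100, "r2": 200} and hits [("r1", 60), ("r2", 80)], only the r1 hit is kept, because 60/100 exceeds 50% and 80/200 does not.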

You can save the history as a workflow for repetitive use.

Is something like this what you were looking for?

The video is in the screencasts section, using 454 data and megablast, but it 
looks similar to your question.

Alex





Re: [galaxy-user] LASTZ: Controlling Length of Hits

2011-03-22 Thread Bob Harris

On Mar 21, 2011, at 1:45 PM, JASON G. BANKERT wrote:
> We're trying to only get hits of certain lengths.  Is there a
> setting to use that sets the minimum length for each hit?


The short answer is no, but I expect there are other tools in galaxy  
that could do that filtering.


There are two reasons lastz doesn't provide filtering based on  
length.  First, there are three possible interpretations of what  
"length" is, all equally valid.  Should it be the length of the hit in  
the reference, or in the read?  Or should it be the number of  
positions in the alignment?  Second, even if there is no difference in  
the three lengths, length is a poorer discriminator than the number of  
matches.  For example, a strict length cutoff of 100 would reject an  
exact match of length 99 but keep a 90-match-10-mismatch hit.
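
The three notions of length, and the difference between filtering on length and on matches, can be made concrete with a small Python sketch. It assumes an alignment is given as two equal-length gapped strings with "-" for gaps; the function name alignment_lengths and the cutoff of 95 matches are hypothetical.

```python
def alignment_lengths(ref_row, read_row):
    """Return the three possible 'lengths' of an alignment plus its matches."""
    if len(ref_row) != len(read_row):
        raise ValueError("alignment rows must have the same number of columns")
    return {
        "reference_length": sum(1 for c in ref_row if c != "-"),
        "read_length": sum(1 for c in read_row if c != "-"),
        "columns": len(ref_row),
        "matches": sum(1 for a, b in zip(ref_row, read_row)
                       if a == b and a != "-"),
    }


# The two hits from the example above: an exact match of length 99 and a
# hit with 90 matches and 10 mismatches.
hits = [{"columns": 99, "matches": 99}, {"columns": 100, "matches": 90}]
by_length = [h for h in hits if h["columns"] >= 100]
by_matches = [h for h in hits if h["matches"] >= 95]
```

Here by_length keeps only the 90-match hit, while by_matches keeps only the exact match, which is the behaviour the paragraph above argues is preferable.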


I'm not familiar enough with galaxy to give you specific details of  
how to filter by length.  But if you choose tabular output from lastz  
you should be able to use galaxy's "text manipulation" tools to  
compute the length, then one of the "filter and sort" tools to discard  
short alignments.  Or, if you are using SAM output, it looks like you  
could use "convert SAM to interval" in the "NGS: SAM Tools" group,  
then compute the length and filter as above.
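
If the tabular output is downloaded, the same compute-then-filter step can be sketched in Python. This is a minimal sketch, not a galaxy tool; the function name, the column positions passed as start_col and end_col, and the assumption that length is simply end minus start are all illustrative and depend on which fields were chosen for the lastz output.

```python
import csv
import io


def filter_rows_by_length(tsv_text, start_col, end_col, min_length):
    """Keep tab-separated rows whose interval (end - start) is long enough."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    kept = []
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and header/comment lines
        length = int(row[end_col]) - int(row[start_col])
        if length >= min_length:
            kept.append(row)
    return kept
```

For instance, with rows "r1, 0, 120" and "r2, 10, 60" (start in column 1, end in column 2) and a minimum length of 100, only the r1 row is kept.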


Hope that is helpful,
Bob H



Re: [galaxy-user] LASTZ: Controlling Length of Hits

2011-03-21 Thread Bob Harris

On Mar 21, 2011, at 1:45 PM, JASON G. BANKERT wrote:
> We're trying to only get hits of certain lengths.  Is there a
> setting to use that sets the minimum length for each hit?


Howdy, Jason,

Lastz (the underlying program) has some options that are geared toward  
filtering by length, though none uses length exactly.  In the lastz  
wrapper for galaxy, the only length-relevant filtering option is "Do  
not report matches that cover less than this percentage of each  
read".  If your reads are all the same length, or close to the same  
length, this could meet your needs.  If the length distribution of  
your reads is pretty wide (as can occur with 454), then probably not.


I'm not familiar with all the rest of the galaxy toolset, but it seems  
like there's bound to be a tool that can compute interval length from  
the interval's start and end, and then filter on that.


Bob H




[galaxy-user] LASTZ: Controlling Length of Hits

2011-03-21 Thread JASON G. BANKERT
We're trying to only get hits of certain lengths.  Is there a setting to use
that sets the minimum length for each hit?





