Re: [galaxy-dev] naming of history steps

2012-02-02 Thread Jeremy Goecks

> Usually the most important information is the first and last step
> E.g.
> The TopHat run should be called 
> TopHat on SOLiD 24A
> 
> The alignment stats should be 
> SAM/BAM Summary Metrics of SOLiD 24A
> With the rest of the tools in the chain identified in the "more information" 
> box.
> 
> This would also give graph generating tools a fighting chance to present 
> something useful in any graphs generated.
> E.g.
> GC Bias Plot of SOLiD 24A could have a title of SOLiD 24A instead of 
> dataset_234.dat
> 
> What do you think of this first-last model?

This model breaks down during experimentation. E.g. let's say three different 
methods for trimming a FASTQ dataset are tried before mapping with Bowtie. 
Currently, the Bowtie runs are named differently b/c each trimmed dataset is a 
unique input. Using the first-last model, all three Bowtie outputs are named 
the same and it is not possible to differentiate b/t them without looking at 
the inputs, which requires clicking on the rerun/info button and finding the 
input(s). The current approach used by Galaxy lists the inputs in the dataset 
title to avoid these issues.

Datasets with the same name become more problematic as more steps are added 
b/t first and last because, while they have the same name, the steps taken to 
produce them may be very different.
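
To make the collision concrete, here is a minimal Python sketch of the two 
naming schemes. It is not Galaxy code: the functions chained_name and 
first_last_name, the trimming method labels, and the input name reads.fastq 
are all hypothetical and exist only for illustration.

```python
def chained_name(tools, source):
    """Name a dataset by listing every tool applied, newest first."""
    return " on ".join(list(reversed(tools)) + [source])


def first_last_name(tools, source):
    """Name a dataset using only the last tool and the original input."""
    return f"{tools[-1]} on {source}"


# Three hypothetical trimming methods, each followed by a Bowtie run.
runs = [
    ["Trim method A", "Bowtie"],
    ["Trim method B", "Bowtie"],
    ["Trim method C", "Bowtie"],
]

chained = [chained_name(r, "reads.fastq") for r in runs]
first_last = [first_last_name(r, "reads.fastq") for r in runs]

print(len(set(chained)))     # 3: every Bowtie run keeps a distinct name
print(len(set(first_last)))  # 1: all three collapse to "Bowtie on reads.fastq"
```

The chained form stays unique because the intermediate step is part of the 
name; the first-last form drops exactly the part that differs.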

The first-last model could be nice for workflows, though, perhaps as an 
extension of the "rename dataset" actions or a kind of global "rename dataset" 
action.

J.

___
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/

Re: [galaxy-dev] naming of history steps

2012-02-01 Thread Paul Gordon
+1 votes :-)

> This would also give graph generating tools a fighting chance to present
> something useful in any graphs generated.
> E.g. GC Bias Plot of SOLiD 24A could have a title of SOLiD 24A instead of
> dataset_234.dat
> 
> What do you think of this first-last model?



Re: [galaxy-dev] naming of history steps

2012-02-01 Thread Langhorst, Brad
Hi Jeremy and Ross

I agree that the current chaining mechanism would get very long after even a 
few steps.

Usually the most important information is the first and last step
E.g.
The TopHat run should be called
TopHat on SOLiD 24A

The alignment stats should be
SAM/BAM Summary Metrics of SOLiD 24A
With the rest of the tools in the chain identified in the "more information" 
box.

This would also give graph generating tools a fighting chance to present 
something useful in any graphs generated.
E.g.
GC Bias Plot of SOLiD 24A could have a title of SOLiD 24A instead of 
dataset_234.dat

What do you think of this first-last model?

Brad

--
Brad Langhorst
New England Biolabs
langho...@neb.com



From: Jeremy Goecks <jeremy.goe...@emory.edu>
Date: Tue, 31 Jan 2012 09:00:38 -0500
To: Brad Langhorst <langho...@neb.com>
Cc: galaxy-dev@lists.bx.psu.edu
Subject: Re: [galaxy-dev] naming of history steps

Brad,

> But I think it might be a lot easier to manage if step names were based on 
> the titles of the history items instead of "data 2" or whatever.
>
> Has this been tried and rejected for some reason?

It's been tried and rejected because dataset names get very long and unwieldy. 
E.g. "Sam/Bam Alignment Summary Metrics on Tophat on Filter FASTQ on 
my_rna_seq_reads"

> Would a pull request implementing this change be welcomed?

What we imagine would help is a way to easily show/find a dataset's analysis 
path -- its parents and its descendants -- so that it's possible to trace the 
datasets/tools used to create a dataset and the tools/datasets subsequently 
used.

This is something we'd like to do but haven't put much effort into yet. 
Community contributions in this space would be great.

Best,
J.


Re: [galaxy-dev] naming of history steps

2012-01-31 Thread Jeremy Goecks
Brad,

> But I think it might be a lot easier to manage if step names were based on 
> the titles of the history items instead of "data 2" or whatever. 
>  
> Has this been tried and rejected for some reason? 

It's been tried and rejected because dataset names get very long and unwieldy. 
E.g. "Sam/Bam Alignment Summary Metrics on Tophat on Filter FASTQ on 
my_rna_seq_reads"

> Would a pull request implementing this change be welcomed?

What we imagine would help is a way to easily show/find a dataset's analysis 
path -- its parents and its descendants -- so that it's possible to trace the 
datasets/tools used to create a dataset and the tools/datasets subsequently 
used.

This is something we'd like to do but haven't put much effort into yet. 
Community contributions in this space would be great.
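
As a rough illustration of what tracing an analysis path could look like, 
here is a small Python sketch. It assumes a toy in-memory structure and does 
not reflect Galaxy's data model or API: the History class, its methods, and 
the dataset names "filtered", "aligned" and "metrics" are hypothetical.

```python
from collections import defaultdict, deque


class History:
    """Toy history that records which datasets each dataset was made from."""

    def __init__(self):
        self.parents = defaultdict(set)   # dataset -> its direct inputs
        self.children = defaultdict(set)  # dataset -> datasets derived from it
        self.tool = {}                    # dataset -> tool that produced it

    def add(self, name, tool=None, inputs=()):
        self.tool[name] = tool
        for parent in inputs:
            self.parents[name].add(parent)
            self.children[parent].add(name)

    def _walk(self, start, edges):
        # Breadth-first traversal, nearest datasets first.
        seen, queue = [], deque([start])
        while queue:
            for nxt in sorted(edges.get(queue.popleft(), ())):
                if nxt not in seen:
                    seen.append(nxt)
                    queue.append(nxt)
        return seen

    def ancestors(self, name):
        return self._walk(name, self.parents)

    def descendants(self, name):
        return self._walk(name, self.children)


h = History()
h.add("my_rna_seq_reads")
h.add("filtered", tool="Filter FASTQ", inputs=["my_rna_seq_reads"])
h.add("aligned", tool="Tophat", inputs=["filtered"])
h.add("metrics", tool="Sam/Bam Alignment Summary Metrics", inputs=["aligned"])

print(h.ancestors("metrics"))
print(h.descendants("my_rna_seq_reads"))
```

With the path available on demand, the dataset title itself can stay short, 
since the full chain no longer has to be encoded in the name.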

Best,
J.


[galaxy-dev] naming of history steps

2012-01-30 Thread Langhorst, Brad
I'm having a tough time keeping track of which data is which after analysis...


I can do a bunch of work customizing each tool and each workflow, renaming 
results etc.
But I think it might be a lot easier to manage if step names were based on the 
titles of the history items instead of "data 2" or whatever.

Has this been tried and rejected for some reason?
Would a pull request implementing this change be welcomed?

Am I just "doing it wrong"?
Any suggestions are welcome.



Brad

--
Brad Langhorst
New England Biolabs
langho...@neb.com

