On Jan 7, 2013, at 10:19 PM, Cameron Jack wrote:

> Hi, I hope I'm writing to the right area (if not, please point me in the 
> right direction and I'll go on my way).
> 
> I have a set of tools that our team have built and been using for functional 
> genomics analysis, which I would like to integrate into Galaxy. 
> 
> My first concern is with the tests. Our code is unit tested but one script is 
> tasked with downloading gene sets from Ensembl. We have a local copy of much 
> of the Ensembl DB (5TB worth) and even this is fairly slow to pull from (5-10 
> mins). Pulling from the publicly available Ensembl site would be horrifically 
> slow. Since it's stated here: 
> http://wiki.galaxyproject.org/Admin/Tools/Writing%20Tests that functional 
> tests are required, I'm looking for suggestions as to how to meet this 
> requirement without taking up an hour of the test runner's time (for one 
> tool).

Hi Cameron,

Tests should be self-contained so that they can be run without requiring 
external network access.  You may want to consider providing your Ensembl data 
as locally cached reference data rather than having the tool download it.

If the tool needs that locally cached data to run tests, what we generally do 
is take the smallest possible subset of the reference data that will allow you 
to test proper tool operation and provide this with the tool as test data.
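
As a rough illustration of cutting reference data down to a test-sized subset, 
here is a minimal sketch.  It assumes the data is a tab-separated file with an 
identifier in the first column; the function name, the file layout, and the 
IDs in the usage note are all hypothetical and not part of Galaxy.

```python
import csv


def subset_gene_sets(src_path, dest_path, keep_ids, id_column=0):
    """Copy only rows whose ID column is in keep_ids; return the rows kept.

    Assumption: the reference file is tab-separated with one record per line.
    """
    kept = 0
    with open(src_path, newline="") as src, \
            open(dest_path, "w", newline="") as dest:
        reader = csv.reader(src, delimiter="\t")
        writer = csv.writer(dest, delimiter="\t", lineterminator="\n")
        for row in reader:
            # Skip blank lines, keep only the records chosen for the test.
            if row and row[id_column] in keep_ids:
                writer.writerow(row)
                kept += 1
    return kept
```

Calling something like subset_gene_sets("full.tsv", "subset.tsv", {"GENE_A", 
"GENE_C"}) would then give you a small file to ship alongside the tool as test 
data.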

> My second question is if a file is produced by a script and given a specific 
> name based on its inputs, how should I (if at all) define the output of this 
> script.

Does the tool work from the UI?  For Galaxy to find a tool's output, it has to 
be written to the output dataset path provided by Galaxy when the job is run.  
For tools that hardcode output names, there are a few options: use the 
'from_work_dir' attribute on the output dataset tag, which tells Galaxy to pick 
up the named file from the job's working directory, or put a wrapper script 
around the tool that moves the file to the right place after the wrapped tool 
has finished executing.
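
The wrapper approach could look roughly like the sketch below.  It assumes the 
wrapped tool writes a file whose name you can predict from its inputs, and that 
the tool's command line passes the wrapper the output path Galaxy supplies.  
The function name and the argument order are hypothetical.

```python
import shutil
import subprocess
import sys


def run_and_move(tool_cmd, hardcoded_name, galaxy_output_path):
    """Run the wrapped tool, then move its fixed-name output to Galaxy's path."""
    # check=True raises CalledProcessError if the tool exits non-zero,
    # so a failed run is never followed by a move.
    subprocess.run(tool_cmd, check=True)
    shutil.move(hardcoded_name, galaxy_output_path)


if __name__ == "__main__" and len(sys.argv) > 3:
    # Hypothetical usage:
    #   wrapper.py <hardcoded_name> <galaxy_output_path> <tool> [tool args...]
    run_and_move(sys.argv[3:], sys.argv[1], sys.argv[2])
```

Letting the wrapper fail loudly when the tool fails means the job stops with an 
error instead of silently producing an empty dataset.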

> My last question is "how much functional testing is required?". I have a 
> script here whose job is to produce plots of data. There are about 30 
> command-line options and many of these have multiple choices available to 
> them. Do I need to provide functional tests for every one of these? I can 
> imagine having to provide 100 PNG files to compare my output against....

Coverage of all of the possible option combinations is not necessary.  Most 
tools have 2-4 tests; some have more, but usually fewer than 10.  I would 
suggest testing the option combinations most likely to be used.

--nate

> I have searched for answers to the above (made much more tricky thanks to 
> Samsung), and I don't wish to waste anyone's time.
> 
> Many thanks in advance,
> Cameron
> 
> 
> ___________________________________________________________
> Please keep all replies on the list by using "reply all"
> in your mail client.  To manage your subscriptions to this
> and other Galaxy lists, please use the interface at:
> 
>  http://lists.bx.psu.edu/

