Re: [galaxy-dev] Twill compares wrong file; was: Twill comparisons ignore GALAXY_TEST_NO_CLEANUP

2014-02-13 Thread Peter Cock
On Thu, Feb 13, 2014 at 7:51 AM, Björn Grüning
bjoern.gruen...@gmail.com wrote:
 Am Mittwoch, den 12.02.2014, 10:46 + schrieb Peter Cock:

 ...

 However, at only 5kb the FASTA file is small and fine to bundle -
 but the BAM (49kb), log (123kb) and MAF (764kb) quickly add
 up so I would prefer not to have to include these (under source
 code control, and for the Tool Shed upload). Also, as you might
 guess from the large number of allowed differences, the log
 file is quite variable (lots of time stamps plus Galaxy's temp
 filenames appear).

 Do you have any advise which filesize is feasible and at which point we
 should skip the tests? I'm working currently on MACS2 wrappers and to
 have reasonable tests I need up to 50mb ob test files.

I try to use small files under say 10kb if possible - partly as smaller
unit tests are often easier to diagnose when they break, but also
as noted above to reduce overhead for version control, and the
ToolShed upload/download time.

50mb does sound excessive, but if that's the minimum maybe it
is still better than no test at all?

 Providing gz
 input files would be one option to get it down to 10mb, but that does
 not work and it's still 10mb ...

I thought the test framework would allow you to provide gzipped
input since it uses the upload tool internally which would handle
this. If that breaks I'd consider it a bug.

Gzipped output files might be more difficult, since the validation
is based on direct file comparison.

Peter

___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/


Re: [galaxy-dev] Twill compares wrong file; was: Twill comparisons ignore GALAXY_TEST_NO_CLEANUP

2014-02-13 Thread Daniel Blankenberg
Hi all,

I just wanted to point out that there are several different ways to compare the 
history output item to a test file beyond the default “diff”, including 
contains, re_match, sim_size, etc, this is set by the “compare” attribute in 
the xml tag for the test output.  If you really don’t care about the contents 
of an output, you could, for example, use contains against an empty file.

Sorry that I didn’t provide more specific examples, as I am at a conference 
this week.


Thanks for using Galaxy,

Dan


On Feb 13, 2014, at 5:06 AM, Peter Cock p.j.a.c...@googlemail.com wrote:

 On Thu, Feb 13, 2014 at 7:51 AM, Björn Grüning
 bjoern.gruen...@gmail.com wrote:
 Am Mittwoch, den 12.02.2014, 10:46 + schrieb Peter Cock:
 
 ...
 
 However, at only 5kb the FASTA file is small and fine to bundle -
 but the BAM (49kb), log (123kb) and MAF (764kb) quickly add
 up so I would prefer not to have to include these (under source
 code control, and for the Tool Shed upload). Also, as you might
 guess from the large number of allowed differences, the log
 file is quite variable (lots of time stamps plus Galaxy's temp
 filenames appear).
 
 Do you have any advise which filesize is feasible and at which point we
 should skip the tests? I'm working currently on MACS2 wrappers and to
 have reasonable tests I need up to 50mb ob test files.
 
 I try to use small files under say 10kb if possible - partly as smaller
 unit tests are often easier to diagnose when they break, but also
 as noted above to reduce overhead for version control, and the
 ToolShed upload/download time.
 
 50mb does sound excessive, but if that's the minimum maybe it
 is still better than no test at all?
 
 Providing gz
 input files would be one option to get it down to 10mb, but that does
 not work and it's still 10mb ...
 
 I thought the test framework would allow you to provide gzipped
 input since it uses the upload tool internally which would handle
 this. If that breaks I'd consider it a bug.
 
 Gzipped output files might be more difficult, since the validation
 is based on direct file comparison.
 
 Peter
 
 ___
 Please keep all replies on the list by using reply all
 in your mail client.  To manage your subscriptions to this
 and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/
 
 To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/
 


___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/


Re: [galaxy-dev] Twill compares wrong file; was: Twill comparisons ignore GALAXY_TEST_NO_CLEANUP

2014-02-13 Thread Peter Cock
On Thu, Feb 13, 2014 at 1:58 PM, Daniel Blankenberg d...@bx.psu.edu wrote:
 Hi all,

 I just wanted to point out that there are several different ways to
 compare the history output item to a test file beyond the default
 diff, including contains, re_match, sim_size, etc, this is set by
 the compare attribute in the xml tag for the test output.  If you
 really don't care about the contents of an output, you could, for
 example, use contains against an empty file.

 Sorry that I didn't provide more specific examples, as I am at
 a conference this week.


Thanks Dan - I don't think I'd seen the contains tag before,
and it isn't on the wiki yet:

https://wiki.galaxyproject.org/Admin/Tools/Writing%20Tests

Contains an empty file is a rather sneaky hack - I like it:

  test
param name=job_type value=genome /
param name=job_quality value=accurate /
param name=type value=none /
param name=filenames value=ecoli.fastq ftype=fastqsanger /
output name=out_fasta file=ecoli.mira4_de_novo.fasta
ftype=fasta /
output name=out_bam file=empty_file.dat compare=contains /
output name=out_maf file=empty_file.dat compare=contains /
output name=out_log file=empty_file.dat compare=contains /
/test

It will do in the short term until the test framework supports
omitting some of the output test files... or I  could switch to
providing a snippet of data from each file.

Thanks,

Peter
___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/


Re: [galaxy-dev] Twill compares wrong file; was: Twill comparisons ignore GALAXY_TEST_NO_CLEANUP

2014-02-13 Thread Greg Von Kuster
Hello Peter,

I'm also sure we could enhance hte twill code to use the dictionary approach 
used in the API based testing.  However, pervasive use of Javascript is being 
introduced into the Galaxy code base at a fast pace and twill does not work 
with Javascript.  So clearly the use of twill for functional testing in Galaxy 
will have to be eliminated although I'm not aware of any formal plans to do so. 
  All of the tool shed functional tests and the tool shed's install and test 
framework use twill as well, so I'll be working to eliminate its use in those 
frameworks as soon as possible.  

John has been working on some pieces of the puzzle for eliminating twill (e.g., 
the API based testing), but I assume there is significant work left to 
completely eliminate the use of twill.  

So perhaps we should plan to continue to add enhancements like this to the 
twill framework even though we'll have to soon eliminate it due to Javascript.  
I'm not sure if others agree with this though.

Greg Von Kuster


On Feb 12, 2014, at 5:46 AM, Peter Cock p.j.a.c...@googlemail.com wrote:

 
 I think it ought be possible to refactor the Twill test code to use
 the same dictionary approach used in the API based testing -
 and so support listing the test output in any order and only
 testing some of the files.
 
 However, I'll wait and see what you guys have to say.
 
 Peter.
 ___
 Please keep all replies on the list by using reply all
 in your mail client.  To manage your subscriptions to this
 and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/
 
 To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/


___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/


Re: [galaxy-dev] Twill compares wrong file; was: Twill comparisons ignore GALAXY_TEST_NO_CLEANUP

2014-02-13 Thread Björn Grüning
Hi Dan,

 Hi all,
 
 I just wanted to point out that there are several different ways to compare 
 the history output item to a test file beyond the default “diff”, including 
 contains, re_match, sim_size, etc, this is set by the “compare” attribute in 
 the xml tag for the test output.  If you really don’t care about the contents 
 of an output, you could, for example, use contains against an empty file.

That is indeed a nice trick! It is potentially a way to only compare the
first 1000 lines, assuming that the rest is identical.
Unfortunately, I'm not able to complete the MACS2 tests, because I was
not able to get conditionals of conditionals working. Is that a known
bug?
As I understood Greg correctly, we should not invest much more time in
the twill testing and are waiting for the new API based tests? Is that
correct?

Thanks,
Bjoern

 Sorry that I didn’t provide more specific examples, as I am at a conference 
 this week.
 
 
 Thanks for using Galaxy,
 
 Dan
 
 
 On Feb 13, 2014, at 5:06 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
 
  On Thu, Feb 13, 2014 at 7:51 AM, Björn Grüning
  bjoern.gruen...@gmail.com wrote:
  Am Mittwoch, den 12.02.2014, 10:46 + schrieb Peter Cock:
  
  ...
  
  However, at only 5kb the FASTA file is small and fine to bundle -
  but the BAM (49kb), log (123kb) and MAF (764kb) quickly add
  up so I would prefer not to have to include these (under source
  code control, and for the Tool Shed upload). Also, as you might
  guess from the large number of allowed differences, the log
  file is quite variable (lots of time stamps plus Galaxy's temp
  filenames appear).
  
  Do you have any advise which filesize is feasible and at which point we
  should skip the tests? I'm working currently on MACS2 wrappers and to
  have reasonable tests I need up to 50mb ob test files.
  
  I try to use small files under say 10kb if possible - partly as smaller
  unit tests are often easier to diagnose when they break, but also
  as noted above to reduce overhead for version control, and the
  ToolShed upload/download time.
  
  50mb does sound excessive, but if that's the minimum maybe it
  is still better than no test at all?
  
  Providing gz
  input files would be one option to get it down to 10mb, but that does
  not work and it's still 10mb ...
  
  I thought the test framework would allow you to provide gzipped
  input since it uses the upload tool internally which would handle
  this. If that breaks I'd consider it a bug.
  
  Gzipped output files might be more difficult, since the validation
  is based on direct file comparison.
  
  Peter
  
  ___
  Please keep all replies on the list by using reply all
  in your mail client.  To manage your subscriptions to this
  and other Galaxy lists, please use the interface at:
   http://lists.bx.psu.edu/
  
  To search Galaxy mailing lists use the unified search at:
   http://galaxyproject.org/search/mailinglists/
  
 


___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/

Re: [galaxy-dev] Twill compares wrong file; was: Twill comparisons ignore GALAXY_TEST_NO_CLEANUP

2014-02-13 Thread Greg Von Kuster
Hi Björn,

I think twill is going to be around for a while yet, but it is quickly becoming 
more critical to eliminate it.  I haven't yet looked at Johns tests using the 
API, but my understanding is that they are focused on tools.  We're using twill 
a lot in other areas besides tools, so it will require some work to eliminate 
it completely.  Some enhancements to our twill framework will undoubtedly be 
necessaruy form some time.  However, non-trivial enhancements should be thought 
about carefully.

Thanks!

Greg Von Kuster

On Feb 13, 2014, at 5:46 PM, Björn Grüning bjoern.gruen...@gmail.com wrote:

 Hi Dan,
 
 Hi all,
 
 I just wanted to point out that there are several different ways to compare 
 the history output item to a test file beyond the default “diff”, including 
 contains, re_match, sim_size, etc, this is set by the “compare” attribute in 
 the xml tag for the test output.  If you really don’t care about the 
 contents of an output, you could, for example, use contains against an empty 
 file.
 
 That is indeed a nice trick! It is potentially a way to only compare the
 first 1000 lines, assuming that the rest is identical.
 Unfortunately, I'm not able to complete the MACS2 tests, because I was
 not able to get conditionals of conditionals working. Is that a known
 bug?
 As I understood Greg correctly, we should not invest much more time in
 the twill testing and are waiting for the new API based tests? Is that
 correct?
 
 Thanks,
 Bjoern
 
 Sorry that I didn’t provide more specific examples, as I am at a conference 
 this week.
 
 
 Thanks for using Galaxy,
 
 Dan
 
 
 On Feb 13, 2014, at 5:06 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
 
 On Thu, Feb 13, 2014 at 7:51 AM, Björn Grüning
 bjoern.gruen...@gmail.com wrote:
 Am Mittwoch, den 12.02.2014, 10:46 + schrieb Peter Cock:
 
 ...
 
 However, at only 5kb the FASTA file is small and fine to bundle -
 but the BAM (49kb), log (123kb) and MAF (764kb) quickly add
 up so I would prefer not to have to include these (under source
 code control, and for the Tool Shed upload). Also, as you might
 guess from the large number of allowed differences, the log
 file is quite variable (lots of time stamps plus Galaxy's temp
 filenames appear).
 
 Do you have any advise which filesize is feasible and at which point we
 should skip the tests? I'm working currently on MACS2 wrappers and to
 have reasonable tests I need up to 50mb ob test files.
 
 I try to use small files under say 10kb if possible - partly as smaller
 unit tests are often easier to diagnose when they break, but also
 as noted above to reduce overhead for version control, and the
 ToolShed upload/download time.
 
 50mb does sound excessive, but if that's the minimum maybe it
 is still better than no test at all?
 
 Providing gz
 input files would be one option to get it down to 10mb, but that does
 not work and it's still 10mb ...
 
 I thought the test framework would allow you to provide gzipped
 input since it uses the upload tool internally which would handle
 this. If that breaks I'd consider it a bug.
 
 Gzipped output files might be more difficult, since the validation
 is based on direct file comparison.
 
 Peter
 
 ___
 Please keep all replies on the list by using reply all
 in your mail client.  To manage your subscriptions to this
 and other Galaxy lists, please use the interface at:
 http://lists.bx.psu.edu/
 
 To search Galaxy mailing lists use the unified search at:
 http://galaxyproject.org/search/mailinglists/
 
 
 
 
 ___
 Please keep all replies on the list by using reply all
 in your mail client.  To manage your subscriptions to this
 and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/
 
 To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/
 


___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/


Re: [galaxy-dev] Twill compares wrong file; was: Twill comparisons ignore GALAXY_TEST_NO_CLEANUP

2014-02-12 Thread Peter Cock
On Tue, Feb 11, 2014 at 5:09 PM, Peter Cock p.j.a.c...@googlemail.com wrote:
 On Tue, Feb 11, 2014 at 4:42 PM, Peter Cock p.j.a.c...@googlemail.com wrote:
 On Tue, Feb 11, 2014 at 3:41 PM, Dave Bouvier d...@bx.psu.edu wrote:
 Peter,

 Thanks for pointing this out, I've applied your changes in
 12478:15fc8675064e on the default branch.

--Dave B.

 Thanks Dave,

 RE: http://lists.bx.psu.edu/pipermail/galaxy-dev/2014-February/018154.html
 https://bitbucket.org/galaxy/galaxy-central/commits/15fc8675064ea46b7e081d9643393be354f07d65

 I'm still digging into the underlying problem, by focussing on just
 one unit test ...

 As a workaround, I could probably provide expected data for all
 four output files...

Confirmed, providing a reference file for each output works:

test
param name=job_type value=genome /
param name=job_quality value=accurate /
param name=type value=none /
param name=filenames value=ecoli.fastq ftype=fastqsanger /
output name=out_fasta file=ecoli.mira4_de_novo.fasta
ftype=fasta /
output name=out_bam file=ecoli.mira4_de_novo.bam ftype=bam /
output name=out_maf file=ecoli.mira4_de_novo.mira
ftype=mira /
output name=out_log file=ecoli.mira4_de_novo.log
ftype=txt lines_diff=2000 /
/test

However, at only 5kb the FASTA file is small and fine to bundle -
but the BAM (49kb), log (123kb) and MAF (764kb) quickly add
up so I would prefer not to have to include these (under source
code control, and for the Tool Shed upload). Also, as you might
guess from the large number of allowed differences, the log
file is quite variable (lots of time stamps plus Galaxy's temp
filenames appear).

 I've been talking to John (off list), he will reply here later, but in
 the meantime pointed me at a relevant commit,
 https://bitbucket.org/galaxy/galaxy-central/commits/d05be33ad6b772c2a688d875c4cf895d6afac4e3

 Basically the current Twill tests are extremely fragile and expect the
 test output files in the same order used in the outputs definitions
 and also there must be sample data for ALL the output files.

 (I may have hit this bug before, I forget now).

I think it ought be possible to refactor the Twill test code to use
the same dictionary approach used in the API based testing -
and so support listing the test output in any order and only
testing some of the files.

However, I'll wait and see what you guys have to say.

Peter.
___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/


Re: [galaxy-dev] Twill compares wrong file; was: Twill comparisons ignore GALAXY_TEST_NO_CLEANUP

2014-02-12 Thread Björn Grüning
Am Mittwoch, den 12.02.2014, 10:46 + schrieb Peter Cock:
 On Tue, Feb 11, 2014 at 5:09 PM, Peter Cock p.j.a.c...@googlemail.com wrote:
  On Tue, Feb 11, 2014 at 4:42 PM, Peter Cock p.j.a.c...@googlemail.com 
  wrote:
  On Tue, Feb 11, 2014 at 3:41 PM, Dave Bouvier d...@bx.psu.edu wrote:
  Peter,
 
  Thanks for pointing this out, I've applied your changes in
  12478:15fc8675064e on the default branch.
 
 --Dave B.
 
  Thanks Dave,
 
  RE: http://lists.bx.psu.edu/pipermail/galaxy-dev/2014-February/018154.html
  https://bitbucket.org/galaxy/galaxy-central/commits/15fc8675064ea46b7e081d9643393be354f07d65
 
  I'm still digging into the underlying problem, by focussing on just
  one unit test ...
 
  As a workaround, I could probably provide expected data for all
  four output files...
 
 Confirmed, providing a reference file for each output works:
 
 test
 param name=job_type value=genome /
 param name=job_quality value=accurate /
 param name=type value=none /
 param name=filenames value=ecoli.fastq ftype=fastqsanger /
 output name=out_fasta file=ecoli.mira4_de_novo.fasta
 ftype=fasta /
 output name=out_bam file=ecoli.mira4_de_novo.bam ftype=bam 
 /
 output name=out_maf file=ecoli.mira4_de_novo.mira
 ftype=mira /
 output name=out_log file=ecoli.mira4_de_novo.log
 ftype=txt lines_diff=2000 /
 /test
 
 However, at only 5kb the FASTA file is small and fine to bundle -
 but the BAM (49kb), log (123kb) and MAF (764kb) quickly add
 up so I would prefer not to have to include these (under source
 code control, and for the Tool Shed upload). Also, as you might
 guess from the large number of allowed differences, the log
 file is quite variable (lots of time stamps plus Galaxy's temp
 filenames appear).

Do you have any advise which filesize is feasible and at which point we
should skip the tests? I'm working currently on MACS2 wrappers and to
have reasonable tests I need up to 50mb ob test files. Providing gz
input files would be one option to get it down to 10mb, but that does
not work and it's still 10mb ...

  I've been talking to John (off list), he will reply here later, but in
  the meantime pointed me at a relevant commit,
  https://bitbucket.org/galaxy/galaxy-central/commits/d05be33ad6b772c2a688d875c4cf895d6afac4e3
 
  Basically the current Twill tests are extremely fragile and expect the
  test output files in the same order used in the outputs definitions
  and also there must be sample data for ALL the output files.
 
  (I may have hit this bug before, I forget now).
 
 I think it ought be possible to refactor the Twill test code to use
 the same dictionary approach used in the API based testing -
 and so support listing the test output in any order and only
 testing some of the files.
 
 However, I'll wait and see what you guys have to say.
 
 Peter.
 ___
 Please keep all replies on the list by using reply all
 in your mail client.  To manage your subscriptions to this
 and other Galaxy lists, please use the interface at:
   http://lists.bx.psu.edu/
 
 To search Galaxy mailing lists use the unified search at:
   http://galaxyproject.org/search/mailinglists/


___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/


[galaxy-dev] Twill compares wrong file; was: Twill comparisons ignore GALAXY_TEST_NO_CLEANUP

2014-02-11 Thread Peter Cock
On Tue, Feb 11, 2014 at 3:41 PM, Dave Bouvier d...@bx.psu.edu wrote:
 Peter,

 Thanks for pointing this out, I've applied your changes in
 12478:15fc8675064e on the default branch.

--Dave B.

Thanks Dave,

RE: http://lists.bx.psu.edu/pipermail/galaxy-dev/2014-February/018154.html
https://bitbucket.org/galaxy/galaxy-central/commits/15fc8675064ea46b7e081d9643393be354f07d65

I'm still digging into the underlying problem, by focussing on just
one unit test from
https://github.com/peterjc/pico_galaxy/blob/master/tools/mira4/mira4_de_novo.xml

Namely:

test
param name=job_type value=genome /
param name=job_quality value=accurate /
param name=type value=none /
param name=filenames value=ecoli.fastq ftype=fastqsanger /
output name=out_fasta file=ecoli.mira4_de_novo.fasta
ftype=fasta /
/test

After the test runs, we have five files, which is expected as I have
one input and four outputs:

outputs
data name=out_fasta format=fasta label=MIRA de novo
contigs (FASTA) /
data name=out_bam format=bam label=MIRA de novo assembly (BAM) /
data name=out_maf format=mira label=MIRA de novo assembly /
data name=out_log format=txt label=MIRA de novo log /
/outputs

$ ls /tmp/tmp09TawY/tmpFOX9hS/database/files/000/
dataset_1.dat  dataset_2.dat  dataset_3.dat  dataset_4.dat  dataset_5.dat

dataset_1.dat -- test-data/ecoli.fastq
dataset_2.dat -- out_fasta
dataset_3.dat -- out_bam
dataset_4.dat -- out_maf
dataset_5.dat -- out_log

Of these, the test should only attempt to compare dataset_2.dat (out_fasta)
to my sample output file test-data/ecoli.mira4_de_novo.fasta which it matches:

$ diff test-data/ecoli.mira4_de_novo.fasta
/tmp/tmp09TawY/tmpFOX9hS/database/files/000/dataset_2.dat

However, for reasons unknown, it is comparing dataset_5.dat (out_log)
to the sample output file test-data/ecoli.mira4_de_novo.fasta instead.

A little more logging (e.g. [1]) shows the problem is when calling into the
verify_dataset_correctness method (should use hid=2):

base.interactor: DEBUG: About to call
verify_dataset_correctness('ecoli.mira4_de_novo.fasta', '5', ...)
base.interactor: DEBUG: About to call verify_dataset_correctness(...)
for output_data 'state': 'queued'
base.interactor: DEBUG: About to call verify_dataset_correctness(...)
for output_data 'hid': '5'
base.interactor: DEBUG: About to call verify_dataset_correctness(...)
for output_data 'id': '5'
base.interactor: DEBUG: About to call verify_dataset_correctness(...)
for output_data 'dbkey': 'hg17'
base.interactor: DEBUG: About to call verify_dataset_correctness(...)
for output_data 'name': 'MIRA de novo log'
base.twilltestcase: DEBUG: Verifying dataset correctness for
'ecoli.mira4_de_novo.fasta' (hid='5')

Now what is strange is how can the call to get the hid for the current
outfile can know if it should be 2, 3, 4, or 5? The bug appears to be
in method _verify_outputs in test/functional/test_toolbox.py - which
was last touched by John...

output_data = data_list[ len(data_list) - len(testdef.outputs) + output_index ]

In this example, we have data_list[5 - 1 + 0] = 4, meaning since
data_list is essentially [dataset_1, ..., dataset_5] we get dataset_5

I believe the error is that rather than the number of defined test outputs
(here 1), you should be subtracting the number of tool outputs (here 4).
i.e. data_list[5 - 4 + 0] = data_list[1] = dataset_2

As a workaround, I could probably provide expected data for all
four output files...

Regards,

Peter

[1] Patch which adds more logging to test/base/interactor.py
(similar debugging added to other files to trace this.)

$ hg diff test/base/interactor.py
diff -r f3dc213a5773 test/base/interactor.py
--- a/test/base/interactor.pyMon Feb 10 22:13:35 2014 -0600
+++ b/test/base/interactor.pyTue Feb 11 16:19:51 2014 +
@@ -309,6 +309,9 @@

 def verify_output( self, history, output_data, outfile,
attributes, shed_tool_id, maxseconds ):
 hid = output_data.get( 'hid' )
+log.debug(About to call verify_dataset_correctness(%r, %r,
...) % (outfile, hid))
+for key in output_data.keys():
+   log.debug(About to call verify_dataset_correctness(...)
for output_data %r: %r % (key, output_data.get(key)))
 self.twill_test_case.verify_dataset_correctness( outfile,
hid=hid, attributes=attributes, shed_tool_id=shed_tool_id,
maxseconds=maxseconds )

 def get_job_stream( self, history_id, output_data, stream ):
___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/


Re: [galaxy-dev] Twill compares wrong file; was: Twill comparisons ignore GALAXY_TEST_NO_CLEANUP

2014-02-11 Thread Peter Cock
On Tue, Feb 11, 2014 at 4:42 PM, Peter Cock p.j.a.c...@googlemail.com wrote:
 On Tue, Feb 11, 2014 at 3:41 PM, Dave Bouvier d...@bx.psu.edu wrote:
 Peter,

 Thanks for pointing this out, I've applied your changes in
 12478:15fc8675064e on the default branch.

--Dave B.

 Thanks Dave,

 RE: http://lists.bx.psu.edu/pipermail/galaxy-dev/2014-February/018154.html
 https://bitbucket.org/galaxy/galaxy-central/commits/15fc8675064ea46b7e081d9643393be354f07d65

 I'm still digging into the underlying problem, by focussing on just
 one unit test ...

 As a workaround, I could probably provide expected data for all
 four output files...

I've been talking to John (off list), he will reply here later, but in
the meantime pointed me at a relevant commit,
https://bitbucket.org/galaxy/galaxy-central/commits/d05be33ad6b772c2a688d875c4cf895d6afac4e3

Basically the current Twill tests are extremely fragile and expect the
test output files in the same order used in the outputs definitions
and also there must be sample data for ALL the output files.

(I may have hit this bug before, I forget now).

Regards,

Peter
___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/