Looks good, John.

I tested with: 
https://testtoolshed.g2.bx.psu.edu/view/jjohnson/snpsift_dbnsfp_datatypes

lib/galaxy/datatypes/converters/tabular_to_dbnsfp.xml

reverting from hack:
        <command interpreter="python">tabular_to_dbnsfp.py $input $dbnsfp.dataset.extra_files_path/dbNSFP.gz</command>
back to:
        <command interpreter="python">tabular_to_dbnsfp.py $input $dbnsfp.files_path/dbNSFP.gz</command>

On 10/15/14, 12:05 PM, John Chilton wrote:
Hey JJ,

Opened a pull request to stable with my best guess at the right way to
proceed and hopefully a best practice recommendation we can all get
behind. Do you want to try it out and let me know if it fixes snpeff?
(It does fix the velvet datatypes you contributed to Galaxy.)

https://bitbucket.org/galaxy/galaxy-central/pull-request/532/fix-for-datatypes-consuming-output-extra/diff

Dan, Bjoern - does this make sense? Can we move forward with this
approach ($input.extra_files_path for inputs and $output.files_path
for outputs) as the best practice for referencing these
directories?

-John

On Wed, Oct 15, 2014 at 11:44 AM, Jim Johnson <johns...@umn.edu> wrote:
I agree with you about the inadvisable use of:   input.dataset.*.

I'm looking at:

lib/galaxy/model/__init__.py
class Dataset( object ):
    ...
    def __init__( self, id=None, state=None, external_filename=None,
                  extra_files_path=None, file_size=None, purgable=True, uuid=None ):
        ...
        self._extra_files_path = extra_files_path
        ...
    @property
    def extra_files_path( self ):
        return self.object_store.get_filename( self, dir_only=True,
            extra_dir=self._extra_files_path or "dataset_%d_files" % self.id )

I'm trying to see when self._extra_files_path gets set. Otherwise, would
this return the path relative to the current file location of dataset?
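
The fallback in that property can be checked in isolation. This is only a minimal sketch: FakeObjectStore is a hypothetical stand-in that just joins paths (the real object store resolves locations per backend), and the root "/galaxy/files" is made up for illustration. It suggests that when _extra_files_path is never set, the directory name defaults to "dataset_<id>_files" and the final location is whatever the object store decides.

```python
class FakeObjectStore:
    """Hypothetical stand-in for Galaxy's object store; it only joins paths."""

    def __init__(self, root):
        self.root = root

    def get_filename(self, dataset, dir_only=False, extra_dir=None):
        # Assumption: place extra_dir under a fixed root. The real object
        # store may resolve this very differently depending on the backend.
        return "%s/%s" % (self.root, extra_dir)


class Dataset:
    """Trimmed-down copy of the quoted model class, for illustration only."""

    def __init__(self, id, object_store, extra_files_path=None):
        self.id = id
        self.object_store = object_store
        self._extra_files_path = extra_files_path

    @property
    def extra_files_path(self):
        # Same expression as in the quoted code: "%" binds tighter than "or",
        # so the default name is used only when _extra_files_path is unset.
        return self.object_store.get_filename(
            self,
            dir_only=True,
            extra_dir=self._extra_files_path or "dataset_%d_files" % self.id,
        )


store = FakeObjectStore("/galaxy/files")
default_path = Dataset(42, store).extra_files_path
explicit_path = Dataset(42, store, extra_files_path="custom_dir").extra_files_path
print(default_path)
print(explicit_path)
```

If that reading is right, the property never looks at the dataset file's own location; only the object store decides where the directory lives.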




On 10/15/14, 9:36 AM, John Chilton wrote:
Okay - so this is what broke things:


https://bitbucket.org/galaxy/galaxy-central/commits/d781366bc120787e201b73a4dd99b56282169d86

My feeling with the commit was that wrappers and tools should never be
accessing paths explicitly through input.dataset.*. I think
this would circumvent options like outputs_to_working_directory and
break remote job execution through Pulsar. I think it also breaks the
object store abstraction, which I guess is why I made the change for
Bjoern.

I did not (and this was stupid on my part) realize that datatype code
would be running on the remote host and accessing these model
properties directly, outside the abstractions set up by the wrappers
supplied to cheetah code, and so they have become out of sync as of
that commit.

I am thinking that somehow changing what the datatype code gets is the
right approach, rather than fixing things by circumventing the wrapper
and accessing properties directly on the dataset. You will find that
doing this breaks things for Bjoern's object store, and it could
probably never run on usegalaxy.org, say, for the same reason.

Too many different competing deployment options all being incompatible
with each other :(.

Will keep thinking about this and respond again.

-John

On Wed, Oct 15, 2014 at 9:39 AM, John Chilton <jmchil...@gmail.com> wrote:
JJ,

Arg this is a mess. I am very sorry about this - I still don't
understand extra_files_path versus files_path myself. There are open
questions on Peter's blast repo and no one ever followed up on my
object store questions about this with Bjoern's issues a couple
release cycles ago. We need to get these to work - write documentation
explicitly declaring best practices we can all agree on and then write
some tests to ensure things don't break in the future.

When you say your tools broke recently - can you say for certain which
release broke these - the August14, October14, something older?

I'll try to do some more research and get back to you.

-John

On Tue, Oct 14, 2014 at 6:04 AM, Jim Johnson <johns...@umn.edu> wrote:
Andrew,

Thanks for investigating this. I changed the subject and sent to the
galaxy dev list.

I've had a number of tools quit working recently, particularly tools
that inspect the extra_files_path when setting metadata: Defuse, Rsem,
SnpEff.

I think there was a change in the galaxy framework:
The extra_files_path, when referenced from an input or output in the
cheetah template sections of the tool config xml, is now relative to
the job working directory rather than the files location.
   I've just changed a few of my tools on my server yesterday
from: <param_name>.extra_files_path
to:   <param_name>.dataset.extra_files_path
and they now work again.

Dan or John, is that the right way to handle this?
   Thanks,

JJ



On 10/13/14, 9:29 PM, Andrew Lonie wrote:
Hi Jim. I am probably going about this the wrong way, but I am not
clear on how to report tool errors (if in fact this is a tool error!)

I've been trialling your snpeff wrapper from the test toolshed and
getting a consistent error with the SnpEff Download and SnpEff sub
tools (the SnpSift dbNSFP works fine). The problem seems to be with an
attribute declaration and manifests during database download as:

Traceback (most recent call last):
  File "/mnt/galaxy/galaxy-app/lib/galaxy/jobs/runners/__init__.py", line 564, in finish_job
    job_state.job_wrapper.finish( stdout, stderr, exit_code )
  File "/mnt/galaxy/galaxy-app/lib/galaxy/jobs/__init__.py", line 1107, in finish
    dataset.datatype.set_meta( dataset, overwrite=False )  # call datatype.set_meta directly for the initial set_meta call during dataset creation
  File "/mnt/galaxy/shed_tools/testtoolshed.g2.bx.psu.edu/repos/iuc/snpeff/1938721334b3/snpeff/lib/galaxy/datatypes/snpeff.py", line 21, in set_meta
    data_dir = dataset.files_path
AttributeError: 'HistoryDatasetAssociation' object has no attribute 'files_path'


We fiddled around with the wrapper, eventually replacing
'dataset.files_path' with 'dataset.extra_files_path' in snpeff.py,
which fixed the download bug, but then SnpEff subtool itself threw a
similar error when I tried to use that database from the history.
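
The mismatch can be reproduced outside Galaxy. This is a minimal sketch, assuming (as the traceback suggests) that the object passed to set_meta exposes extra_files_path but not files_path. FakeHDA and the two helper functions are hypothetical stand-ins, not Galaxy code, and the path is made up.

```python
class FakeHDA:
    """Hypothetical stand-in for the object handed to set_meta.

    Assumption taken from the traceback: it has extra_files_path
    but no files_path attribute.
    """

    def __init__(self, extra_files_path):
        self.extra_files_path = extra_files_path


def data_dir_before(dataset):
    # Mirrors the failing line in snpeff.py: data_dir = dataset.files_path
    return dataset.files_path


def data_dir_after(dataset):
    # Mirrors the local edit described above.
    return dataset.extra_files_path


hda = FakeHDA("/galaxy/files/dataset_7_files")

try:
    data_dir_before(hda)
    before_failed = False
except AttributeError as err:
    before_failed = True
    print("before:", err)

after_dir = data_dir_after(hda)
print("after:", after_dir)
```

This only shows why the attribute lookup fails on that object; it says nothing about which of the two names tool wrappers should be using.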

I chased this up a bit more but cannot make sense of the various posts
on files_path vs extra_files_path.

I've shared a history with both of these errors here:
    http://130.56.251.62/galaxy/u/alonie/h/unnamed-history

Maybe this is a problem with our Galaxy image?

Any help appreciated!

Andrew




A/Prof Andrew Lonie
University of Melbourne


--
James E. Johnson Minnesota Supercomputing Institute University of
Minnesota
___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
   http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
   http://galaxyproject.org/search/mailinglists/


