Hello Jean-Frédéric,

Sorry for the delay in this response.  Please see my inline comments.

On Feb 8, 2013, at 10:33 AM, Jean-Frédéric Berthelot wrote:

> Hi list,
> 
> The tool I am currently wrapping has built-in data, which may be used by the 
> tool users (through a relevant <from_data_table> + .LOC file configuration).
> They are .fasta databases which are rather small and are thus bundled in the 
> tool distribution package.
> 
> Thanks to the tool_dependencies.xml file, said distribution package is 
> downloaded at install time, code is compiled, and since they are here, the 
> data files are copied to $INSTALL_DIR too, ready to be used.
> 
> After that, the user still has to edit tool-data/my_fancy_data_files.loc ; 
> but the thing is, during the install I know where these data files are (since 
> I copied those there), so I would like to save the user the trouble and set 
> up this file automagically.
> 
> I would have two questions:
> 
> 1/ Is it okay to have tool built-in data files in $INSTALL_DIR, or would it 
> be considered bad practice?


This is difficult to answer.  Generally, data files should be located in a 
shared location so that other tools can access them as well.  However, there 
are exceptions to this that may be acceptable.  The fact that the 
fasta data files are small and you are using a tool_dependencies.xml file to 
define a relationship to them for your tools is a good approach because it 
allows the data files to be used by other tools in separate repositories via a 
complex repository dependency definition in the remote repository.
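
As a rough illustration only, a tool_dependencies.xml file in another 
repository could point at the package in your repository along these lines.  
Every attribute value below (tool shed URL, repository name, owner, changeset 
revision, package name and version) is a hypothetical placeholder, not taken 
from your tool.

```xml
<?xml version="1.0"?>
<tool_dependency>
    <!-- Hypothetical package name and version, for illustration only. -->
    <package name="my_fancy_tool" version="1.0">
        <!-- Placeholder values: replace with the real tool shed, repository
             name, owner and changeset revision of the repository that
             defines the package. -->
        <repository toolshed="http://toolshed.example.org"
                    name="package_my_fancy_tool_1_0"
                    owner="some_owner"
                    changeset_revision="0123456789ab" />
    </package>
</tool_dependency>
```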

If these fasta data files are available for download via a clone or a url, then 
in the near future the new Galaxy Data Manager (which uses a new, special 
category of Galaxy tools which are of type "data_manager") may be useful in 
this scenario.  Data Manager tools can be associated with tools in a repository 
like yours using repository dependency definitions, so they will be installed 
along with the selected repository.  These data manager tools allow for 
specified data to be installed into the Galaxy environment for use by tools.  
This new component is not yet released, but it is close.  In the meantime, your 
approach is the only way to make this work.  

If your files are not downloadable, then we might plan to allow simplified 
bootstrapping of .loc files in the tool-data directory with files included in 
the repository.  This would take some planning, and its availability would not 
be in the short term.


> 
> 2/ Is there a way to set up the tool-data/my_fancy_data_files.loc during the 
> install? Here are the options I thought of:
> *shipping a “real” my_fancy_data_files.loc.sample with the good paths already 
> set-up, which is going to be copied as the .loc file (a rather ugly hack)

Assuming you use a file name that is not already in the Galaxy tool-data 
subdirectory, the above approach is probably the only way you can do this in a 
fully automated way right now.  Again, when the new Data Manager is released, it 
will handle this kind of automated configuration.  But in the meantime, manual 
intervention is generally required to add the information to appropriate .loc 
files in the tool-data directory.
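
For context, the data table that a <from_data_table> option reads is usually 
declared in a tool_data_table_conf.xml.sample file included in the repository, 
which points at the .loc file.  The following is only a sketch: the table name 
and the column names are assumptions derived from the file name in your 
message, and your tool may need different columns.

```xml
<tables>
    <!-- Hypothetical table name and columns, for illustration only. -->
    <table name="my_fancy_data_files" comment_char="#">
        <columns>value, name, path</columns>
        <!-- Each non-comment line of the .loc file is expected to hold one
             tab-separated entry with the columns listed above. -->
        <file path="tool-data/my_fancy_data_files.loc" />
    </table>
</tables>
```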


> *using more <action type="shell_command"> during install to create 
> my_fancy_data_files.loc (but deploying this file is not part of the tool 
> dependency install per se)

I advise against the above approach.  The "best practice" use of tool 
dependency definitions is to restrict movement of files to locations within the 
defined $INSTALL_DIR (the installation directory of the tool dependency package) 
or $REPOSITORY_INSTALL_DIR (the installation directory of the repository), 
which are set at installation time.  Hard-coding file paths in <action> tags is 
fragile, and not recommended.
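
To make this concrete, here is a minimal sketch of a tool dependency 
definition whose actions only place files under $INSTALL_DIR.  The package 
name, version, download URL, build command and directory names are all 
hypothetical, so treat it as an outline of the pattern rather than a 
definition that will work for your tool as written.

```xml
<?xml version="1.0"?>
<tool_dependency>
    <!-- Hypothetical package name and version. -->
    <package name="my_fancy_tool" version="1.0">
        <install version="1.0">
            <actions>
                <!-- Placeholder URL for the tool distribution package. -->
                <action type="download_by_url">http://example.org/my_fancy_tool-1.0.tar.gz</action>
                <!-- Assumes the package builds with a plain "make". -->
                <action type="shell_command">make</action>
                <!-- Copy the bundled fasta files, staying inside $INSTALL_DIR.
                     The "data" source directory is an assumption. -->
                <action type="move_directory_files">
                    <source_directory>data</source_directory>
                    <destination_directory>$INSTALL_DIR/data</destination_directory>
                </action>
                <!-- Assumes executables are installed to $INSTALL_DIR/bin. -->
                <action type="set_environment">
                    <environment_variable name="PATH" action="prepend_to">$INSTALL_DIR/bin</environment_variable>
                </action>
            </actions>
        </install>
    </package>
</tool_dependency>
```

Note that nothing in this sketch writes to the tool-data directory; the .loc 
file is left to the .loc.sample mechanism discussed above.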


> *variant of the previous : shipping my_fancy_data_files.loc as part of the 
> tool distribution package, and copy it through shell_command (same concern 
> as above).

The above approach is not recommended either - same issue as above.

> 
> Any thoughts?
> 
> Cheers,
> 
> -- 
> Jean-Frédéric
> Bonsai Bioinformatics group

Thanks very much Jean-Frédéric,

Greg Von Kuster

> ___________________________________________________________
> Please keep all replies on the list by using "reply all"
> in your mail client.  To manage your subscriptions to this
> and other Galaxy lists, please use the interface at:
> 
>  http://lists.bx.psu.edu/
