Re: [galaxy-dev] Per-tool configuration

2014-06-27 Thread John Chilton
On Fri, Jun 27, 2014 at 5:16 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
 On Wed, Jun 18, 2014 at 12:14 PM, Peter Cock p.j.a.c...@googlemail.com 
 wrote:
 On Wed, Jun 18, 2014 at 12:04 PM, Jan Kanis jan.c...@jankanis.nl wrote:
 I am not using job splitting, because I am implementing this for a client
 with a small (one machine) galaxy setup.

 Ah - this also explains why a job size limit is important for you.

 Implementing a query limit feature in galaxy core would probably be the best
 idea, but that would also probably require an admin screen to edit those
 limits, and I don't think I can sell the required time to my boss under the
 contract we have with the client.

 The wrapper script idea I outlined to you earlier would be the least
 invasive (although might cause trouble if BLAST is run at the command
 line outside Galaxy), while your idea of inserting the check script into
 the Galaxy Tool XML just before running BLAST itself should also
 work well.

 While looking at Jan's pull request to insert a query size limit before
 running BLAST https://github.com/peterjc/galaxy_blast/pull/43
 I realised that this will not work so well if job-splitting is enabled.

 If using the job-splitting parallelism setting in Galaxy, then the BLAST
 query FASTA file is broken up into chunks of 1000 sequences. This
 means the new check would be made at the chunk level - so it could
 in effect catch extremely long query sequences (e.g. chromosomes),
 but could not block anyone submitting one query FASTA file containing
 many thousands of moderate length query sequences (e.g. genes).

 John - that Trello issue you logged, https://trello.com/c/0XQXVhRz
 "Generic infrastructure to let deployers specify limits for tools based
 on input metadata (number of sequences, file size, etc...)"

 Would it be fair to say this is not likely to be implemented in the near
 future? i.e. Should we consider implementing the BLAST query limit
 approach as a short term hack?

It would be good functionality - but I don't foresee myself or anyone
on the core team getting to it in the next six months, say.

...

I am now angry with myself though because I realized that dynamic job
destinations are a better way to implement this in the meantime (that
environment stuff was very fresh when I responded so I think I just
jumped there). You can build a flexible infrastructure locally that is
largely decoupled from the tools and that may (?) work around the task
splitting problem Peter brought up.

Outline of the idea:

Create a Python script - say lib/galaxy/jobs/mapper_limits.py and add
some functions to it like:

--
# Helper utilities for limiting tool inputs.
from galaxy.jobs.mapper import JobMappingException

DEFAULT_QUERY_LIMIT_MESSAGE = "Size of input exceeds query limit of this Galaxy instance."

def assert_fewer_than_n_sequences(input_path, n, msg=DEFAULT_QUERY_LIMIT_MESSAGE):
    ...  # compute num_sequences
    if num_sequences > n:
        raise JobMappingException(msg)

# Do same for other checks...
--

This is an abstract file that has nothing to do with the institution
or toolbox really. Once you get it working - open a pull request and
we can probably get this integrated into Galaxy (as long as it is
abstract enough). Then deployers can create specific rules for that
particular cluster and toolbox:

Create lib/galaxy/jobs/runners/rules/instance_dests.py

--
from galaxy.jobs import mapper_limits

def limited_blast(job, app):
    inp_data = dict([(da.name, da.dataset) for da in job.input_datasets])
    query_file = inp_data["query"].file_name
    mapper_limits.assert_fewer_than_n_sequences(query_file, 300)
    return app.job_config.get_destination("blast_base")
--

Then open job_conf.xml and add the correct destinations...

<job_conf>
    ...
    <destinations>
        ...
        <destination id="limited_blast" runner="dynamic">
            <param id="function">limited_blast</param>
        </destination>
        <destination id="blast_base" runner="torque"> <!-- or whatever -->
        </destination>
    </destinations>
    <tools>
        <tool id="ncbi_blastn_wrapper" destination="limited_blast" />
        <tool id="ncbi_blastp_wrapper" destination="limited_blast" />
        ...
    </tools>
</job_conf>

Jan, I am really sorry I didn't come up with this before you did all
that work. Hopefully what you did for limit_query_size.py can be
reused in this context.

-John


 Thanks,

 Peter
___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/


Re: [galaxy-dev] Per-tool configuration

2014-06-27 Thread Peter Cock
On Fri, Jun 27, 2014 at 3:13 PM, John Chilton jmchil...@gmail.com wrote:
 On Fri, Jun 27, 2014 at 5:16 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
 On Wed, Jun 18, 2014 at 12:14 PM, Peter Cock p.j.a.c...@googlemail.com 
 wrote:

 John - that Trello issue you logged, https://trello.com/c/0XQXVhRz
 Generic infrastructure to let deployers specify limits for tools based
 on input metadata (number of sequences, file size, etc...)

 Would it be fair to say this is not likely to be implemented in the near
 future? i.e. Should we consider implementing the BLAST query limit
 approach as a short term hack?

 It would be good functionality - but I don't foresee myself or anyone
 on the core team getting to it in the next six months, say.

 ...

 I am now angry with myself though because I realized that dynamic job
 destinations are a better way to implement this in the meantime (that
 environment stuff was very fresh when I responded so I think I just
 jumped there). You can build a flexible infrastructure locally that is
 largely decoupled from the tools and that may (?) work around the task
 splitting problem Peter brought up.

 Outline of the idea:
 snip

Hi John,

So the idea is to define a dynamic job mapper which checks the
query input size, and if too big raises an error, and otherwise
passes the job to the configured job handler (e.g. SGE cluster).

See https://wiki.galaxyproject.org/Admin/Config/Jobs

It sounds like this ought to be possible right now, but you are
suggesting that, since this seems quite a general use case, the
code to help build a dynamic mapper using things like file
size (in bytes or number of sequences) could be added to
Galaxy?

This approach would need the Galaxy Admin to set up a custom
job mapper for BLAST (which knows to look at the query file),
but it taps into an existing Galaxy framework. By providing a
reference implementation this ought to be fairly easy to set up,
and can be extended to be more clever about the limits.

e.g. For BLAST, we should consider both the number (and
length) of the queries, plus the size of the database.

Regards,

Peter


Re: [galaxy-dev] Per-tool configuration

2014-06-27 Thread John Chilton
On Fri, Jun 27, 2014 at 9:30 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
 On Fri, Jun 27, 2014 at 3:13 PM, John Chilton jmchil...@gmail.com wrote:
 On Fri, Jun 27, 2014 at 5:16 AM, Peter Cock p.j.a.c...@googlemail.com 
 wrote:
 On Wed, Jun 18, 2014 at 12:14 PM, Peter Cock p.j.a.c...@googlemail.com 
 wrote:

 John - that Trello issue you logged, https://trello.com/c/0XQXVhRz
 Generic infrastructure to let deployers specify limits for tools based
 on input metadata (number of sequences, file size, etc...)

 Would it be fair to say this is not likely to be implemented in the near
 future? i.e. Should we consider implementing the BLAST query limit
 approach as a short term hack?

 It would be good functionality - but I don't foresee myself or anyone
 on the core team getting to it in the next six months, say.

 ...

 I am now angry with myself though because I realized that dynamic job
 destinations are a better way to implement this in the meantime (that
 environment stuff was very fresh when I responded so I think I just
 jumped there). You can build a flexible infrastructure locally that is
 largely decoupled from the tools and that may (?) work around the task
 splitting problem Peter brought up.

 Outline of the idea:
 snip

 Hi John,

 So the idea is to define a dynamic job mapper which checks the
 query input size, and if too big raises an error, and otherwise
 passes the job to the configured job handler (e.g. SGE cluster).

 See https://wiki.galaxyproject.org/Admin/Config/Jobs

 It sounds like this ought to be possible right now, but you are
 suggesting that, since this seems quite a general use case, the
 code to help build a dynamic mapper using things like file
 size (in bytes or number of sequences) could be added to
 Galaxy?

Yes, it is possible right now, and everything could just be stuck right
in the rule file itself. I was just suggesting that sharing some of the
helpers with the community might ease the process for future
deployers.


 This approach would need the Galaxy Admin to set up a custom
 job mapper for BLAST (which knows to look at the query file),
 but it taps into an existing Galaxy framework. By providing a
 reference implementation this ought to be fairly easy to set up,
 and can be extended to be more clever about the limits.

Yes. As you mention, this can be much more expressive than an XML-based
fixed set of limit types. In addition to static sorts of limits you
could combine inputs like you mentioned, allow local users
of the public resource to run as much as they want, allow larger jobs
on the weekend when things are slow, etc. I recently added a
high-level utility for looking at job metrics in these rules - so you
can, say, restrict or expand the limit based on how many jobs the
user has run in the last month or how many core hours they have
consumed, etc.

https://bitbucket.org/galaxy/galaxy-central/commits/9a905e98e1550314cf821a99c2adc1b00a4eed83


 e.g. For BLAST, we should consider both the number (and
 length) of the queries, plus the size of the database.

Thanks for clarifying and providing some context to my (in retrospect)
seemingly random Python scripts :).


 Regards,

 Peter


Re: [galaxy-dev] Per-tool configuration

2014-06-18 Thread Jan Kanis
I am not using job splitting, because I am implementing this for a client
with a small (one machine) galaxy setup.

Implementing a query limit feature in galaxy core would probably be the
best idea, but that would also probably require an admin screen to edit
those limits, and I don't think I can sell the required time to my boss
under the contract we have with the client.

I gave a quick try earlier at making the blast2html tool run on both Python
2.6 and 3, but I gave up due to too many encoding issues. The client's
machine has Python 2.6. Maybe I should have another look.

Jan


On 17 June 2014 21:55, Peter Cock p.j.a.c...@googlemail.com wrote:

 On Tue, Jun 17, 2014 at 4:57 PM, Jan Kanis jan.c...@jankanis.nl wrote:
  Too bad there aren't any really good options. I will use the environment
  variable approach for the query size limit.

 Are you using the optional job splitting (parallelism) feature in Galaxy?
 That seems to me to be a good place to insert a Galaxy-level
 job size limit. e.g. BLAST+ jobs are split into 1000 query chunks,
 so you might wish to impose a 25 chunk limit?

 Long term being able to set limits on the input file parameters
 of each tool would be nicer - e.g. Limit BLASTN to at most
 20,000 queries, limit MIRA to at most 50GB FASTQ files, etc.

  For the gene bank links I guess modifying the .loc file is the least
  bad way. Maybe it can be merged into galaxy_blast, that would at
  least solve the interoperability problems.

 It would have to be sufficiently general, and backward compatible.

 FYI other people have also looked at extending the blast *.loc
 files (e.g. adding a category column for helping filter down a
 very large BLAST database list).

  @Peter: One potential problem in merging my blast2html tool
  could be that I have written it in python3, and the current tool
  wrapper therefore installs python3 and a host of its dependencies,
  making for a quite large download.

 Without seeing your code, it is hard to say, but actually writing
 Python code which works unmodified under Python 2.7 and
 Python 3 is quite doable (and under Python 2.6 with a few
 more provisos). Both NumPy and Biopython do this if you
 wanted some reassurance.

 On the other hand, Galaxy itself will need to move to Python 3
 at some point, and certainly individual tools will too. This will
 probably mean (as with Linux Python packages) having double
 entries on the ToolShed (one for Python 2, one for Python 3),

 e.g. a ToolShed package for NumPy under Python 2 (done)
 and under Python 3 (needed).

 Peter


Re: [galaxy-dev] Per-tool configuration

2014-06-18 Thread Peter Cock
On Wed, Jun 18, 2014 at 12:04 PM, Jan Kanis jan.c...@jankanis.nl wrote:
 I am not using job splitting, because I am implementing this for a client
 with a small (one machine) galaxy setup.

Ah - this also explains why a job size limit is important for you.

 Implementing a query limit feature in galaxy core would probably be the best
 idea, but that would also probably require an admin screen to edit those
 limits, and I don't think I can sell the required time to my boss under the
 contract we have with the client.

The wrapper script idea I outlined to you earlier would be the least
invasive (although might cause trouble if BLAST is run at the command
line outside Galaxy), while your idea of inserting the check script into
the Galaxy Tool XML just before running BLAST itself should also
work well.

 I gave a quick try earlier at making the blast2html tool run on both Python
 2.6 and 3, but I gave up due to too many encoding issues. The client's
 machine has Python 2.6. Maybe I should have another look.

 Jan

It gets easier with practice - a mixture of little syntax things, and
the big pain about bytes versus unicode (and thus encodings,
and raw versus text mode for file handles).

Peter


Re: [galaxy-dev] Per-tool configuration

2014-06-17 Thread Jan Kanis
Too bad there aren't any really good options. I will use the environment
variable approach for the query size limit. For the gene bank links I guess
modifying the .loc file is the least bad way. Maybe it can be merged into
galaxy_blast, that would at least solve the interoperability problems.

@Peter: One potential problem in merging my blast2html tool could be that I
have written it in python3, and the current tool wrapper therefore installs
python3 and a host of its dependencies, making for quite a large download.

Jan


On 16 June 2014 09:08, Peter Cock p.j.a.c...@googlemail.com wrote:

 On Mon, Jun 16, 2014 at 4:18 AM, John Chilton jmchil...@gmail.com wrote:
  Hello Jan,
 
  Thanks for the clarification. Not quite what I was expecting so I am
  glad I asked - I don't have great answers for either case so hopefully
  other people will have some ideas.
 
  For the first use case - I would just specify some default input to
  supply to the input wrapper - let's call this N - add a parameter to
  the tool wrapper --limit-size=N - test that and then allow it to be
  overridden via an environment variable - so in your command block use
  --limit-size=\${BLAST_QUERY_LIMIT:-N}. This will use N if no limit
  is set, but deployers can set limits. There are a number of ways to
  set such variables - DRM specific environment files, login rc files,
  etc Just this last release I added the ability to define
  environment variables right in job_conf.xml
  (
 https://bitbucket.org/galaxy/galaxy-central/pull-request/378/allow-specification-of-environment/diff
 ).
  I thought the tool shed might have a way to collect such definitions
  as well and insert them into package files - but Google failed to find
  this for me.

 Hmm. Jan emailed me off list earlier about this. We could insert
 a pre-BLAST script to check the size of the query FASTA file,
 and abort if it is too large (e.g. number of queries, total sequence
 length, perhaps scaled according to the database size if we want
 to get clever?).

 I was hoping there was a more general mechanism in Galaxy -
 after all, BLAST is by no means the only computationally
 expensive tool ;)

 We have had query files of 20,000 and more genes against NR
 (both BLASTP and BLASTX), but our Galaxy has task-splitting
 enabled so this becomes 20 (or more) individual cluster jobs
 of 1000 queries each. This works fine apart from the occasional
 glitch with the network drive when the data is merged afterwards.
 (We know this failed once shortly after the underlying storage
 had been expanded, and would have been under heavy load
 rebalancing the data across the new disks.)

  Not sure about how to proceed with the second use case - extending the
  .loc file should work locally - I am not sure it is feasible within
  the context of the existing tool shed tools, data manager, etc You
  could certainly duplicate this stuff with your modifications - this
  has down sides in terms of interoperability though.

 Currently the BLAST wrappers use the *.loc files directly, but
 this is likely to switch to the newer Data Manager approach.
 That may or may not complicate local modifications like adding
 extra columns...

  Sorry I don't have great answers for either question,
  -John

 Thanks John,

 Peter


Re: [galaxy-dev] Per-tool configuration

2014-06-17 Thread Peter Cock
On Tue, Jun 17, 2014 at 4:57 PM, Jan Kanis jan.c...@jankanis.nl wrote:
 Too bad there aren't any really good options. I will use the environment
 variable approach for the query size limit.

Are you using the optional job splitting (parallelism) feature in Galaxy?
That seems to me to be a good place to insert a Galaxy-level
job size limit. e.g. BLAST+ jobs are split into 1000 query chunks,
so you might wish to impose a 25 chunk limit?

Long term being able to set limits on the input file parameters
of each tool would be nicer - e.g. Limit BLASTN to at most
20,000 queries, limit MIRA to at most 50GB FASTQ files, etc.

 For the gene bank links I guess modifying the .loc file is the least
 bad way. Maybe it can be merged into galaxy_blast, that would at
 least solve the interoperability problems.

It would have to be sufficiently general, and backward compatible.

FYI other people have also looked at extending the blast *.loc
files (e.g. adding a category column for helping filter down a
very large BLAST database list).

 @Peter: One potential problem in merging my blast2html tool
 could be that I have written it in python3, and the current tool
 wrapper therefore installs python3 and a host of its dependencies,
 making for a quite large download.

Without seeing your code, it is hard to say, but actually writing
Python code which works unmodified under Python 2.7 and
Python 3 is quite doable (and under Python 2.6 with a few
more provisos). Both NumPy and Biopython do this if you
wanted some reassurance.

On the other hand, Galaxy itself will need to move to Python 3
at some point, and certainly individual tools will too. This will
probably mean (as with Linux Python packages) having double
entries on the ToolShed (one for Python 2, one for Python 3),

e.g. a ToolShed package for NumPy under Python 2 (done)
and under Python 3 (needed).

Peter


Re: [galaxy-dev] Per-tool configuration

2014-06-17 Thread John Chilton
On Tue, Jun 17, 2014 at 2:55 PM, Peter Cock p.j.a.c...@googlemail.com wrote:
 On Tue, Jun 17, 2014 at 4:57 PM, Jan Kanis jan.c...@jankanis.nl wrote:
 Too bad there aren't any really good options. I will use the environment
 variable approach for the query size limit.

 Are you using the optional job splitting (parallelism) feature in Galaxy?
 That seems to me to be a good place to insert a Galaxy-level
 job size limit. e.g. BLAST+ jobs are split into 1000 query chunks,
 so you might wish to impose a 25 chunk limit?

 Long term being able to set limits on the input file parameters
 of each tool would be nicer - e.g. Limit BLASTN to at most
 20,000 queries, limit MIRA to at most 50GB FASTQ files, etc.

Trello card created, please vote!

https://trello.com/c/0XQXVhRz


 For the gene bank links I guess modifying the .loc file is the least
 bad way. Maybe it can be merged into galaxy_blast, that would at
 least solve the interoperability problems.

 It would have to be sufficiently general, and backward compatible.

 FYI other people have also looked at extending the blast *.loc
 files (e.g. adding a category column for helping filter down a
 very large BLAST database list).

 @Peter: One potential problem in merging my blast2html tool
 could be that I have written it in python3, and the current tool
 wrapper therefore installs python3 and a host of its dependencies,
 making for a quite large download.

 Without seeing your code, it is hard to say, but actually writing
 Python code which works unmodified under Python 2.7 and
 Python 3 is quite doable (and under Python 2.6 with a few
 more provisos). Both NumPy and Biopython do this if you
 wanted some reassurance.

 On the other hand, Galaxy itself will need to move to Python 3
 at some point, and certainly individual tools will too. This will
 probably mean (as with Linux Python packages) having double
 entries on the ToolShed (one for Python 2, one for Python 3),

I certainly hope Galaxy can move to Python 3 at some point... being a
pessimist though I would place bets against it :).


 e.g. a ToolShed package for NumPy under Python 2 (done)
 and under Python 3 (needed).

 Peter


Re: [galaxy-dev] Per-tool configuration

2014-06-16 Thread Peter Cock
On Mon, Jun 16, 2014 at 4:18 AM, John Chilton jmchil...@gmail.com wrote:
 Hello Jan,

 Thanks for the clarification. Not quite what I was expecting so I am
 glad I asked - I don't have great answers for either case so hopefully
 other people will have some ideas.

 For the first use case - I would just specify some default input to
 supply to the input wrapper - let's call this N - add a parameter to
 the tool wrapper --limit-size=N - test that and then allow it to be
 overridden via an environment variable - so in your command block use
 --limit-size=\${BLAST_QUERY_LIMIT:-N}. This will use N if no limit
 is set, but deployers can set limits. There are a number of ways to
 set such variables - DRM specific environment files, login rc files,
 etc Just this last release I added the ability to define
 environment variables right in job_conf.xml
 (https://bitbucket.org/galaxy/galaxy-central/pull-request/378/allow-specification-of-environment/diff).
 I thought the tool shed might have a way to collect such definitions
 as well and insert them into package files - but Google failed to find
 this for me.

Hmm. Jan emailed me off list earlier about this. We could insert
a pre-BLAST script to check the size of the query FASTA file,
and abort if it is too large (e.g. number of queries, total sequence
length, perhaps scaled according to the database size if we want
to get clever?).
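Such a pre-BLAST check could be a small script called from the tool's command block before BLAST itself runs; a hedged sketch (the script and its interface are illustrative, not part of the actual wrappers):

```python
import sys

def check_query_fasta(path, max_queries):
    """Exit non-zero if the query FASTA file contains too many sequences."""
    with open(path) as handle:
        num_queries = sum(1 for line in handle if line.startswith(">"))
    if num_queries > max_queries:
        # A non-zero exit before BLAST runs makes Galaxy fail the job
        # with this message, instead of burning cluster time.
        sys.exit("Too many queries: %d (limit is %d)" % (num_queries, max_queries))

if __name__ == "__main__" and len(sys.argv) > 2:
    check_query_fasta(sys.argv[1], int(sys.argv[2]))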

I was hoping there was a more general mechanism in Galaxy -
after all, BLAST is by no means the only computationally
expensive tool ;)

We have had query files of 20,000 and more genes against NR
(both BLASTP and BLASTX), but our Galaxy has task-splitting
enabled so this becomes 20 (or more) individual cluster jobs
of 1000 queries each. This works fine apart from the occasional
glitch with the network drive when the data is merged afterwards.
(We know this failed once shortly after the underlying storage
had been expanded, and would have been under heavy load
rebalancing the data across the new disks.)

 Not sure about how to proceed with the second use case - extending the
 .loc file should work locally - I am not sure it is feasible within
 the context of the existing tool shed tools, data manager, etc You
 could certainly duplicate this stuff with your modifications - this
 has down sides in terms of interoperability though.

Currently the BLAST wrappers use the *.loc files directly, but
this is likely to switch to the newer Data Manager approach.
That may or may not complicate local modifications like adding
extra columns...

 Sorry I don't have great answers for either question,
 -John

Thanks John,

Peter


Re: [galaxy-dev] Per-tool configuration

2014-06-15 Thread John Chilton
Hello Jan,

Thanks for the clarification. Not quite what I was expecting so I am
glad I asked - I don't have great answers for either case so hopefully
other people will have some ideas.

For the first use case - I would just specify some default input to
supply to the input wrapper - let's call this N - add a parameter to
the tool wrapper --limit-size=N - test that and then allow it to be
overridden via an environment variable - so in your command block use
--limit-size=\${BLAST_QUERY_LIMIT:-N}. This will use N if no limit
is set, but deployers can set limits. There are a number of ways to
set such variables - DRM specific environment files, login rc files,
etc Just this last release I added the ability to define
environment variables right in job_conf.xml
(https://bitbucket.org/galaxy/galaxy-central/pull-request/378/allow-specification-of-environment/diff).
I thought the tool shed might have a way to collect such definitions
as well and insert them into package files - but Google failed to find
this for me.
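The same default-or-override resolution can be mirrored where a wrapper reads its limit in Python rather than in the command block; a small sketch (the BLAST_QUERY_LIMIT name comes from this thread, the helper name is hypothetical):

```python
import os

def resolve_query_limit(default_n):
    """Return the deployer's BLAST_QUERY_LIMIT if set, else the tool's default N."""
    value = os.environ.get("BLAST_QUERY_LIMIT", "").strip()
    return int(value) if value else int(default_n)
```

So resolve_query_limit(1000) yields 1000 until a deployer exports BLAST_QUERY_LIMIT=25, after which it yields 25.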

Not sure about how to proceed with the second use case - extending the
.loc file should work locally - I am not sure it is feasible within
the context of the existing tool shed tools, data manager, etc You
could certainly duplicate this stuff with your modifications - this
has down sides in terms of interoperability though.

Sorry I don't have great answers for either question,
-John




On Sat, Jun 14, 2014 at 5:12 AM, Jan Kanis jan.c...@jankanis.nl wrote:
 I have two use cases: the first is for a modification of the ncbi blast
 wrapper to limit the query input size (for a publicly accessible galaxy
 instance), so this needs a configuration option for the query size limit. I
 was thinking about a separate config file in tool-data for this.

 The second is for a tool I have written to convert a blast xml output
 into an HTML report. The report contains links for each match to a gene bank
 (e.g. the ncbi database). These links should be configurable per database
 that was searched, and preferably have an option of linking to the location
 of the match within the gene if the gene bank supports such links. One
 option is to add an extra column to the blast .loc files (if that doesn't
 break blast), where the databases are already configured.

 Jan

 On 13 Jun 2014 at 18:02, John Chilton jmchil...@gmail.com wrote:

  I would have different answers for you depending on what options are
 available to the server admin. What exactly about the tool is
 configurable - can you be more specific?

 -John

 On Fri, Jun 13, 2014 at 10:59 AM, Jan Kanis jan.c...@jankanis.nl wrote:
  I am writing a tool that should be configurable by the server admin. I
  am
  considering adding a configuration file, but where should such a file be
  placed? Is the tool-data directory the right place? Is there another
  standard way for per-tool configuration?
 
  Jan
 


Re: [galaxy-dev] Per-tool configuration

2014-06-14 Thread Jan Kanis
I have two use cases: the first is for a modification of the NCBI BLAST
wrapper to limit the query input size (for a publicly accessible Galaxy
instance), so this needs a configuration option for the query size limit. I
was thinking about a separate config file in tool-data for this.

The second is for a tool I have written to convert a BLAST XML output
into an HTML report. The report contains links for each match to a gene bank
(e.g. the NCBI database). These links should be configurable per database
that was searched, and preferably have an option of linking to the
location of the match within the gene if the gene bank supports such links.
One option is to add an extra column to the BLAST .loc files (if that
doesn't break BLAST), where the databases are already configured.

Jan
On 13 Jun 2014 at 18:02, John Chilton jmchil...@gmail.com wrote:

 I would have different answers for you depending on what options are
 available to the server admin. What exactly about the tool is
 configurable - can you be more specific?

 -John

 On Fri, Jun 13, 2014 at 10:59 AM, Jan Kanis jan.c...@jankanis.nl wrote:
  I am writing a tool that should be configurable by the server admin. I am
  considering adding a configuration file, but where should such a file be
  placed? Is the tool-data directory the right place? Is there another
  standard way for per-tool configuration?
 
  Jan
 


[galaxy-dev] Per-tool configuration

2014-06-13 Thread Jan Kanis
I am writing a tool that should be configurable by the server admin. I am
considering adding a configuration file, but where should such a file be
placed? Is the tool-data directory the right place? Is there another
standard way for per-tool configuration?

Jan

Re: [galaxy-dev] Per-tool configuration

2014-06-13 Thread John Chilton
I would have different answers for you depending on what options are
available to the server admin. What exactly about the tool is
configurable - can you be more specific?

-John

On Fri, Jun 13, 2014 at 10:59 AM, Jan Kanis jan.c...@jankanis.nl wrote:
 I am writing a tool that should be configurable by the server admin. I am
 considering adding a configuration file, but where should such a file be
 placed? Is the tool-data directory the right place? Is there another
 standard way for per-tool configuration?

 Jan
