[galaxy-dev] select with a value preselected

2014-02-19 Thread Ketan Maheshwari
Hello,

In my tool UI, I have a select tool as follows:

   <param name="site" type="select" multiple="true" label="Execution Location" help="Multi-select list - hold the appropriate key while clicking to select multiple items">
     <option value="localhost">Localhost</option>
     <option value="midway">Midway</option>
     <option value="uc3">UC3</option>
     <option value="stampede">Stampede</option>
     <option value="tukey">Tukey</option>
   </param>

How do I tell it to preselect the value localhost by default?

Currently, the user has to explicitly select a value, and if she forgets to
do so the tool breaks because nothing is selected by default.

Thanks,
-- 
Ketan
___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/

Re: [galaxy-dev] select with a value preselected

2014-02-19 Thread Ketan Maheshwari
Great! Thanks.


On Wed, Feb 19, 2014 at 2:18 PM, Saket Choudhary sake...@gmail.com wrote:

 Hi Ketan,

 You can specify selected="true" on the option that should be the default [1].

 [1]
 https://wiki.galaxyproject.org/Admin/Tools/ToolConfigSyntax#A.3Coption.3E_tag_set
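
A minimal sketch of that suggestion, written in Python so it can be checked without a Galaxy instance. The XML is a reconstruction of the select parameter from the question with selected="true" added to the localhost option; the helper preselected_values is hypothetical and only reads the attribute back, it is not part of Galaxy.

```python
import xml.etree.ElementTree as ET

# Reconstructed select parameter from the question, with selected="true"
# added to the option that should be the default.
PARAM_XML = """
<param name="site" type="select" multiple="true" label="Execution Location">
  <option value="localhost" selected="true">Localhost</option>
  <option value="midway">Midway</option>
  <option value="uc3">UC3</option>
  <option value="stampede">Stampede</option>
  <option value="tukey">Tukey</option>
</param>
"""


def preselected_values(param_xml):
    """Return the option values marked selected="true" (illustrative helper)."""
    param = ET.fromstring(param_xml)
    return [
        option.get("value")
        for option in param.findall("option")
        if option.get("selected", "false").lower() == "true"
    ]


defaults = preselected_values(PARAM_XML)
print(defaults)
```

With multiple="true", more than one option could presumably carry the attribute if several defaults are wanted.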






-- 
Ketan

Re: [galaxy-dev] space in text tool results in two arguments

2014-02-17 Thread Ketan Maheshwari
Hi Ross,

I did try using double quotes and curly braces, "${outloc}", but that does
not seem to address the issue.

Thanks,
Ketan


On Mon, Feb 17, 2014 at 10:37 PM, Ross ross.laza...@gmail.com wrote:

 Hi Ketan.
 Please try quotation marks to enclose any parameter containing spaces in
 the tool command template, e.g. something like:
 python myscript.py "$text_with_spaces" $param2 $param3

 Please confirm that this solves the problem?
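
To see why the quotation marks matter, here is a small Python sketch. It assumes the command line is filled in as plain text and then split by a POSIX shell; build_command is a hypothetical stand-in for the template step, and shlex.split mimics the shell's word splitting.

```python
import shlex


def build_command(template, **params):
    """Fill a command template; a stand-in for the templating step (assumption)."""
    return template.format(**params)


user_text = "two words"

# Unquoted: the value is split on whitespace into two separate arguments.
unquoted = build_command("python myscript.py {text} {n}", text=user_text, n=3)
# Quoted: the value reaches the script as a single argument.
quoted = build_command('python myscript.py "{text}" {n}', text=user_text, n=3)

unquoted_args = shlex.split(unquoted)
quoted_args = shlex.split(quoted)

print(unquoted_args)
print(quoted_args)
```

If quoting in the template alone does not help, the value may be getting altered before it reaches the command line, which this sketch does not model.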



 On Tue, Feb 18, 2014 at 12:47 PM, Ketan Maheshwari 
 ketancmaheshw...@gmail.com wrote:

 Hi,

 My tool in Galaxy accepts a text argument which can contain zero or more
 spaces, depending on user requirements. When the user enters one word it is
 parsed fine, but when the user enters several words separated by spaces
 it results in multiple separate arguments, which breaks the way I
 handle command-line args in my script. Is it possible to tell Galaxy to
 treat the text box as a single argument irrespective of spaces in the value
 provided?

 Thanks for any suggestions.

 Best,
 --
 Ketan







-- 
Ketan

[galaxy-dev] setting up Galaxy for torque pbs

2014-02-12 Thread Ketan Maheshwari
Hi,

I am trying to set up Galaxy to interface with a Cray system which runs
Torque/PBS.

After reading this Galaxy wiki page:
https://wiki.galaxyproject.org/Admin/Config/Performance/Cluster#PBS

I was able to scramble torque egg with the following command:

LIBTORQUE_DIR=/opt/torque/2.4.11/lib/libtorque.so.2 python
scripts/scramble.py -e pbs_python

I do not fully understand the parameters and configuration part in the PBS
section of the wiki page. Where exactly should the runner XML snippet with
plugin and destinations be placed?

I am also wondering whether the snippet is required for every tool that I want to
run on compute nodes, or whether it is a global setting. Can I tell existing tools
to use this setup?

Another question: can I configure Galaxy to submit jobs to the compute
cluster as an ordinary user rather than as an admin of the system?

Thanks,
-- 
Ketan

Re: [galaxy-dev] running tools within tool

2014-02-10 Thread Ketan Maheshwari
Thanks Dannon for the reference. I checked out the tool and installed it from
the toolshed on my local Galaxy instance. I also read the related paper,
which states that the BLAST executables run in parallel by partitioning the
input files into fragments and running batches in parallel. That sounds
cool. I browsed the code but could not find the exact mechanism. Is the
parallelism at the workflow level (aka branch parallelism), or is it at the tool
level, that is, the tool invokes parallel code?
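
For readers unfamiliar with the pattern described here (partition the input into fragments, run the batches in parallel, merge the results), a generic Python sketch follows. It is not the code of the BLAST+ wrappers or of Galaxy's parallelism framework; run_on_fragment is a hypothetical stand-in for the wrapped executable, and the chunk size and worker count are arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor


def split_records(records, chunk_size):
    """Partition a list of records into consecutive fragments."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def run_on_fragment(fragment):
    """Stand-in for running the wrapped tool on one fragment (hypothetical)."""
    return [record.upper() for record in fragment]


def split_run_merge(records, chunk_size=2, workers=4):
    fragments = split_records(records, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves fragment order, so merging is a simple concatenation.
        partial_results = list(pool.map(run_on_fragment, fragments))
    return [item for part in partial_results for item in part]


merged = split_run_merge(["seq1", "seq2", "seq3", "seq4", "seq5"])
print(merged)
```

In this sketch the splitting and merging sit around a single tool invocation, which corresponds to the "tool level" reading of the question rather than to branches of a workflow.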

Thanks,
Ketan


On Thu, Feb 6, 2014 at 9:42 AM, Dannon Baker dannon.ba...@gmail.com wrote:

 Ketan,

 Have you taken a look at galaxy's built-in parallelism framework?  For a
 great current example of a tool using this, look at Peter's NCBI BLAST+
 wrappers.  https://github.com/peterjc/galaxy_blast

 -Dannon



Re: [galaxy-dev] running tools within tool

2014-02-09 Thread Ketan Maheshwari
Thanks Dannon for the reference. I checked out the tool and installed it from
the toolshed on my local Galaxy instance. I also read the related paper,
which states that the BLAST executables run in parallel by partitioning the
input files into fragments and running batches in parallel. That sounds
cool. I browsed the code but could not find the exact mechanism. Is the
parallelism at the workflow level (aka branch parallelism), or is it at the tool
level, that is, the tool invokes parallel code?

Thanks,
Ketan



Re: [galaxy-dev] running tools within tool

2014-02-06 Thread Ketan Maheshwari
Hi John, Alex, All,

To elaborate on the motivation behind my question about running tools within a
tool: first, running a tool in parallel at large scale. For example, if I
need to find a pattern in 1000 files via the Galaxy Select tool from the Text and
Filter tool group, I am limited to providing one file at a time to the tool,
which will take a long time to finish. Please correct me if there is a more
sophisticated way to approach this problem. A second, related concern is
running a tool in parallel on one or more HPC resources.

We want to write a generic wrapper Galaxy tool, powered by the Swift parallel
framework, such that it can run any arbitrary Galaxy tool in parallel on HPC
resources. Currently, we have developed this capability, but only for external
executables, which is not the most secure way of using Galaxy, as I understand
from the previous conversation.

Having such a wrapper tool in a standard way is desirable so that it
preserves the tool contract and binding within the Galaxy environment, that is,
maintaining the history and metadata conventions of Galaxy.

Thanks,
Ketan


On Wed, Feb 5, 2014 at 3:53 PM, John Chilton chil...@msi.umn.edu wrote:

 Galaxy has an API that is capable of running tools - certainly this is
 one path forward on something like this. I am not sure it is the best
 path forward though. Probably the best way to enhance Galaxy's
 execution capabilities is to extend the Galaxy core framework itself -
 this has its own downsides though.

 If you can offer more details about how you would like to enhance
 Galaxy - what it cannot do that you would like it to do - I or others
 may be able to provide more specific ideas. Otherwise, sorry I have
 not been of more help.

 -John
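
A rough Python sketch of what running a tool through the API could look like. It only builds the HTTP request and does not send it. The server URL, API key, history id, tool id and input names are all placeholders, and the exact shape of the inputs dictionary depends on the tool being called, so treat this as an assumption-laden illustration rather than a recipe.

```python
import json
import urllib.request


def build_run_tool_request(galaxy_url, api_key, history_id, tool_id, inputs):
    """Build (but do not send) a POST request to the tools API endpoint."""
    payload = {"history_id": history_id, "tool_id": tool_id, "inputs": inputs}
    return urllib.request.Request(
        url=f"{galaxy_url.rstrip('/')}/api/tools?key={api_key}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# All values below are placeholders chosen for illustration.
request = build_run_tool_request(
    "http://localhost:8080", "EXAMPLE_KEY", "hist123", "Grep1",
    {"pattern": "ACGT"},
)
print(request.get_method(), request.full_url)
```

Sending it would be a call to urllib.request.urlopen(request) against a running server. A wrapper tool calling back into the same Galaxy instance this way would also need somewhere safe to keep the API key.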







-- 
Ketan

[galaxy-dev] Cheetah code issue

2014-02-04 Thread Ketan Maheshwari
Hi,

While developing a tool, I am facing a Cheetah code issue. I looked through the
mailing list archive and found many similar issues reported earlier, but
could not debug this one. Any tips on debugging are appreciated.

File "cheetah_DynamicallyCompiledCheetahTemplate_1391530084_29_71368.py",
line 97, in respond
NotFound: cannot find 'rngstart'

Attached is the xml tool definition file.

Best,
Ketan
<tool id="swiftforeach" name="foreach">
<description>A generic tool to run executable via Swift foreach parallel construct</description>

<command interpreter="bash">

#if $rangeorlist.rl == "rng"
swiftforeachrange.sh
$site
$interpret
$exec
$rngstart
$rend
$stepsize
$outloc
$logfile
$outlist
#for $a in $arg
${a.argname}
#end for
#else
swiftforeachlist.sh
$site
$interpret
$exec
$listfile
$outloc
$logfile
$outlist
#for $a in $arg
${a.argname}
#end for
#end if

</command>

<inputs>
<param name="site" type="select" label="Execution Location">
<option value="localhost">Localhost</option>
<option value="midway">Midway</option>
<option value="uc3">UC3</option>
<option value="stampede">Stampede</option>
</param>

<param name="interpret" type="select" label="Execution interpreter">
<option value="sh">sh</option>
<option value="python">python</option>
<option value="java">java</option>
<option value="R">R</option>
<option value="matlab">matlab</option>
</param>

<param format="sh,binexec" name="exec" type="data" label="Executable"/>

<conditional name="rangeorlist">
<param name="rl" type="select" label="Select range or list">
<option value="rng">numeric range</option>
<option value="lst">items list</option>
</param>

<when value="rng">
<param name="rngstart" size="2" type="integer" value="0" label="start"/>
<param name="rend" size="2" type="integer" value="9" label="end"/>
<param name="stepsize" size="2" type="integer" value="1" label="stepsize"/>
</when>

<when value="lst">
<param format="data" name="listfile" type="data" label="List file"/>
</when>

</conditional>

<repeat name="arg" title="arg">
<param name="argname" type="text" label="arg" />
</repeat>

<param name="outloc" size="50" type="text" value="$HOME/swift-sandbox" label="location for output files" help="Swift will write output files here on disc">
<sanitizer sanitize="False" />
</param>

<conditional name="configuration">
<param name="mode" type="select" label="Swift configuration">
<option value="default">default configuration</option>
<option value="advanced">advanced configuration</option>
</param>

<when value="default">
</when>
<when value="advanced">
<param name="remoteurl" size="50" type="text" value="midway.swift.rcc.uchicago.edu" label="remote url" help="remote resource to run jobs on"></param>
<param name="throttle" size="5" type="float" value="0.07" label="job throttle" help="number of parallel jobs to run"></param>
<param name="project" type="text" value="TG-STA110005S" label="Project allocation" help="name/code of project"></param>
<param name="slots" size="5" type="integer" value="1" label="slots" help="number of scheduler jobs"></param>
<param name="queue" size="5" type="text" value="normal" label="queue" help="queue to run jobs on"></param>
<param name="nodes" size="5" type="integer" value="1" label="nodes" help="number of nodes requested"></param>
<param name="nodegranularity" size="5" type="integer" value="1" label="node granularity" help="node granularity"></param>
<param name="jobspernode" size="5" type="integer" value="8" label="jobs per node" help="number of jobs per node requested"></param>
<param name="ppn" size="5" type="integer" value="8" label="ppn" help="processes per node requested"></param>
<param name="walltime" size="5" type="text" value="00:10:00" label="Job walltime" help="time in hh:mm:ss to request to scheduler for this job"></param>
<param name="maxtime" size="5" type="integer" value="700" label="application maxtime" help="Application maxtime in seconds for this job"></param>
</when>
</conditional>
</inputs>
<outputs>
<data format="txt" name="logfile" type="data" label="Swift output" />
<data format="txt" name="outlist" type="data" label="Output list" />
</outputs>

<!--
  <code file="postprocess.py">
<hook postprocess="writeoutlist" />
  </code>
  -->

<help>
.. class:: warningmark

**TIP**. Add args to provide additional arguments to your executable. 

-

**What it does**

This is a generic Swift tool that runs an executable over a range of numbers with arbitrary stepsize. 

Execution location allows user to declare where to run the tool. Executable can be any arbitrary executable of type binexec uploaded by user.

Start, end and stepsize are integer values. Note that the stepsize cannot be less than 1. Optionally, 

Re: [galaxy-dev] Cheetah code issue

2014-02-04 Thread Ketan Maheshwari
That did the trick! Thanks!


On Tue, Feb 4, 2014 at 10:26 AM, bjoern.gruen...@googlemail.com 
bjoern.gruen...@gmail.com wrote:

 Hi,

 To access variables inside a conditional, you need to prefix them with the
 conditional's name, something like this:

 $rangeorlist.rngstart
 $rangeorlist.rend
 $rangeorlist.stepsize
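
The same idea, illustrated outside Cheetah with a small Python analogy. The nested namespace below is a simplified stand-in for the values a template sees (an assumption; Galaxy wraps parameters in its own classes), and lookup is a hypothetical helper that resolves dotted names the way the template expressions above are written.

```python
from types import SimpleNamespace

# Simplified stand-in for the template namespace: parameters declared inside
# the "rangeorlist" conditional live under that name, not at the top level.
params = SimpleNamespace(
    site="localhost",
    rangeorlist=SimpleNamespace(rl="rng", rngstart=0, rend=9, stepsize=1),
)


def lookup(namespace, dotted_name):
    """Resolve a dotted name such as 'rangeorlist.rngstart' step by step."""
    value = namespace
    for part in dotted_name.split("."):
        value = getattr(value, part)
    return value


try:
    lookup(params, "rngstart")
    top_level_found = True
except AttributeError:
    # Mirrors the "cannot find 'rngstart'" error from the original report.
    top_level_found = False

nested_value = lookup(params, "rangeorlist.rngstart")
print(top_level_found, nested_value)
```

By the same reasoning, a parameter declared in the other branch of the conditional, such as listfile, would be reached through the rangeorlist prefix as well.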

 Cheers,
 Bjoern







-- 
Ketan

[galaxy-dev] Running tools from within a tool

2014-02-04 Thread Ketan Maheshwari
Hi,

This is a question I posted to the galaxy-user mailing list a while back and
was redirected to dev for possible answers:
Is it possible in Galaxy to design a tool whose sole purpose is to run
other tools? This is motivated by our desire to enhance the execution
capabilities of existing tools via a generic tool which acts as a wrapper.

Thanks,
Ketan


Re: [galaxy-dev] datatype for executables

2014-01-29 Thread Ketan Maheshwari
Thanks for your answer. Yes, the idea is to have runnable executables, such
as executable binaries and shell scripts, run via the parallelizing tool
Swift. As for security, the current plan is to run Swift-enabled Galaxy tools
in controlled cloud instances, which we are already doing.

With this datatype, we are planning to design a suite of generic tools such
that any arbitrary executable can be parallelized within Galaxy and can be
run on clouds and remote clusters.

Is it possible to write a type file bin_or_exe which can detect the
executable bit of files before they become part of Galaxy's indexed data?

Thanks,
Ketan


On Tue, Jan 28, 2014 at 2:42 AM, Peter Cock p.j.a.c...@googlemail.com wrote:



 On Tuesday, January 28, 2014, Ketan Maheshwari ketancmaheshw...@gmail.com
 wrote:

 Is there a data type in Galaxy that identifies executables uniquely, eg.
 from the executable bit in the file perms or some other way?

 Thanks,


 Galaxy's data types are for data files - runnable tools/executables
 are handled via XML tool wrappers which define their options etc.

 Are you really asking about creating a datatype for a binary
 executable file? Or letting users run arbitrary tools? Even the
 idea of letting users run an arbitrary R script is dangerous
 enough from a security point of view.

 Peter



Re: [galaxy-dev] datatype for executables

2014-01-28 Thread Ketan Maheshwari
Hi Peter,

Thanks for the advice. I was trying to say that the potential users for
this case will run the tool on:
1. cloud instances that they own
2. PBS/Torque/SLURM interfaced HPC resources which they will have
authenticated access to.

This means that, in the worst case, if someone chooses to run a
forkbomb, it will only kill her own resource. In my opinion this is no less
secure than, say, wrapping a forkbomb in a Torque script and submitting it to my
department cluster. I am accountable and traceable for any harm I do this
way.

The benefit to users on the other hand will be that they can easily test
their arbitrary applications to run on a larger scale via the
task-parallelism provided by Swift. Once a user is satisfied with the
behavior of her task on a compute node via Galaxy, she can follow our
recipe which will concretize her implementation as a tool to be used in
practice.

Were there any scenarios you had in mind that would lead to security
issues?

Thanks,
Ketan


On Tue, Jan 28, 2014 at 4:17 PM, Peter Cock p.j.a.c...@googlemail.com wrote:

 On Tue, Jan 28, 2014 at 8:26 PM, Ketan Maheshwari ke...@mcs.anl.gov
 wrote:
 
  Is it possible to write a type file bin_or_exe which can detect the
  executable bit of data before they are part of Galaxy's indexed data.
 
  Thanks,
  Ketan

 You haven't convinced me this is a good idea, but I would try this
 by defining a new datatype class in Python with a sniffer method
 which just checks for the executable bit (probably defined as a
 subclass of the binary datatype, see [1]) and then add this and
 its sniffer to the datatype XML file.

 Peter

 [1]
 https://bitbucket.org/galaxy/galaxy-central/src/default/lib/galaxy/datatypes/binary.py
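
A minimal Python sketch of the kind of sniffer described above. To stay self-contained it does not subclass Galaxy's binary datatype; the class name ExecutableSniffer is hypothetical, the extension binexec is borrowed from the tool definition posted in the Cheetah thread, and POSIX permission bits are assumed.

```python
import os
import stat
import tempfile


class ExecutableSniffer:
    """Hypothetical stand-in for a datatype class; not a Galaxy subclass."""

    file_ext = "binexec"

    def sniff(self, filename):
        """Return True when any executable permission bit is set on the file."""
        mode = os.stat(filename).st_mode
        executable_bits = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
        return stat.S_ISREG(mode) and bool(mode & executable_bits)


# Usage: create a scratch file, then flip its executable bit.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"#!/bin/sh\necho hello\n")
    path = handle.name

os.chmod(path, 0o644)
plain_result = ExecutableSniffer().sniff(path)
os.chmod(path, 0o755)
exec_result = ExecutableSniffer().sniff(path)
os.remove(path)
print(plain_result, exec_result)
```

One caveat worth testing before relying on this: files uploaded through a web form may not keep their original permission bits, in which case the sniffer would need another signal such as a shebang line or a binary magic number.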




-- 
Ketan

[galaxy-dev] datatype for executables

2014-01-27 Thread Ketan Maheshwari
Is there a data type in Galaxy that identifies executables uniquely, eg.
from the executable bit in the file perms or some other way?

Thanks,
-- 
Ketan

Re: [galaxy-dev] Using Mesos to Enable distributed computing under Galaxy?

2013-10-29 Thread Ketan Maheshwari
Hi Kyle,

Swift is indeed a complete framework for distributed computing.
Distributing files out to cluster nodes, starting processes, and bringing
result files back to the submit host are all done out of the box (the
stagein-exec-stageout cycle).
We can discuss offline if you are interested in giving it a shot.

Best,
Ketan


On Mon, Oct 28, 2013 at 4:14 PM, Kyle Ellrott kellr...@soe.ucsc.edu wrote:

 You probably are a good person to get an opinion from. My plan isn't to
 write new frameworks, but rather use existing libraries that can
 communicate with Mesos to setup their parallel environments.
 But for Swift, you would probably want to write a new framework. Just
 looking at Swift, I imagine one of the harder parts is getting the
 system set up on a cluster (i.e. distributing files out to remote nodes,
 making sure that you have a way to start processes on those nodes and have
 them know where to find the master), so it seems like Swift could benefit from
 having a Mesos-based framework. Do you think it would enable you to have a
 'zero-config' startup of a distributed Swift application?

 Kyle



 On Mon, Oct 28, 2013 at 1:51 PM, Ketan Maheshwari 
 ketancmaheshw...@gmail.com wrote:

 Hi Kyle,

 We have a similar ongoing development wherein we are working on
 integrating our Swift framework ( swift-lang.org ) with Galaxy. The goal
 is to enable Galaxy based applications to run on a variety of distributed
 resources via various integration schemes as suitable to application and
 underlying execution environment.

 Here is an abstract of a paper (co-authored with Ravi, who responded on
 this thread) we will be presenting in a workshop at the upcoming SC 13
 conference:

 The Galaxy platform is a web-based science portal for scientific
 computing supporting Life Sciences users community. While user-friendly and
 intuitive for doing small to medium scale computations, it currently has a
 limited support for large-scale, parallel and distributed computing. The
 Swift parallel scripting framework is capable of composing ordinary
 applications into parallel scripts that can be run on multi-scale
 distributed and performance computing platforms. In complex distributed
 environments, often the user end of application lifecycle slows down
 because of the technical complexities brought in by the scale, access
 methods and resource management nuances. Galaxy offers a simple way of
 designing, composing, executing, reusing, and reproducing application runs.
 An integration between Swift and Galaxy systems can accelerate science as
 well as bring the respective user communities together in an interactive,
 user-friendly, parallel and distributed data analysis environment enabled
 on a broad range of computational infrastructures.

 Kindly let us know if you need a hands on for the various tools we have
 already developed.


 Best,
 Ketan



 On Mon, Oct 28, 2013 at 3:07 PM, Kyle Ellrott kellr...@soe.ucsc.edu wrote:

 I don't think implementation will be very difficult. The bigger question
 is: is this a technology people are open to?
 The nearest competitor is YARN (
 http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html).
 Mesos seems a bit more geared toward general purpose usage (with several
 existing frameworks), while YARN seems more specific to Hadoop. But I'd be
 glad to hear some other thoughts.

 Kyle


 On Mon, Oct 28, 2013 at 12:55 PM, Ravi K Madduri madd...@mcs.anl.gov wrote:

 Kyle
 This is something I am very interested in. The three parts below make
 sense to me. I would be very happy to discuss further and provide any help
 to move this forward.

 Regards
 On Oct 26, 2013, at 2:43 PM, Kyle Ellrott kellr...@soe.ucsc.edu
 wrote:

 I think one of the aspects where Galaxy is a bit soft is the ability to
 do distributed tasks. The current system of split/replicate/merge tasks
 based on file type is a bit limited and hard for tool developers to expand
 upon. Distributed computing is a non-trivial thing to implement and I think
 it would be a better use of our time to use an already existing framework.
 And it would also mean one less API for tool writers to have to develop 
 for.
 I was wondering if anybody has looked at Mesos (
 http://mesos.apache.org/ ). You can see an overview of the Mesos
 architecture at
 https://github.com/apache/mesos/blob/master/docs/Mesos-Architecture.md
 The important thing about Mesos is that it provides an API for C/C++,
 Java/Scala and Python to write distributed frameworks. There are already
 implementations of frameworks for common parallel programming systems such
 as:
  - Hadoop (https://github.com/mesos/hadoop)
  - MPI (
 https://github.com/apache/mesos/blob/master/docs/Running-torque-or-mpi-on-mesos.md
 )
  - Spark (http://spark-project.org)
 And you can find an example Python framework at
 https://github.com/apache/mesos/tree/master/src/examples/python

 Integration with Galaxy would have three parts:
 1) Add a system config variable to Galaxy called 'MESOS_URL