Re: [galaxy-dev] Running a workflow programatically

Dannon Baker Tue, 15 Mar 2011 18:07:36 -0700

On Mar 15, 2011, at 7:03 PM, Darren Brown wrote:

> But I am kinda stuck at what these guys actually mean.


The execute_workflow.py command line inputs are indeed a little clunky, given 
all the information that has to go into each dataset mapping parameter.  I 
didn't imagine that this script would be used directly very often; rather, it 
was meant to serve as an example of how to execute a single workflow from code 
with particular inputs.

The three parts of each dataset mapping parameter are workflow step, source 
type, and input id.  For the source type, use 'hda' with the encoded id you're 
getting from a history, or 'ldda' with the encoded id of a library dataset.
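
As a rough sketch of how those three parts fit together, the snippet below 
splits a mapping argument and collects several of them into one dictionary 
keyed by workflow step.  The '=' separator, the function names, and the 
'src'/'id' keys are assumptions made for illustration, and the ids shown are 
made up; check execute_workflow.py for the exact format it expects.

```python
def parse_dataset_mapping(arg):
    """Split 'step=src=id' into its three parts.

    The '=' separator is an assumption for this sketch.
    """
    step, src, dataset_id = arg.split("=")
    if src not in ("hda", "ldda"):
        raise ValueError("source must be 'hda' or 'ldda', got %r" % src)
    return step, {"src": src, "id": dataset_id}


def build_ds_map(args):
    """Collect several mapping arguments into one dict keyed by workflow step."""
    return dict(parse_dataset_mapping(a) for a in args)


# Hypothetical step numbers and encoded ids, for illustration only.
ds_map = build_ds_map(["38=ldda=abc123", "39=hda=def456"])
```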

> Which brings me to my general question.  While I appear to be close in
> selecting the correct history and workflow ids, it only works right
> now as a proof of concept since I would need to sort of generate these
> on my own for a user to run a workflow via the galaxy interface.  It
> seems you are hashing these history, workflow and dataset ids, but I
> am not really sure what you are using to hash them.  Looks like not a
> SHA1 sum.  Given only access to the galaxy database, I would like to
> execute a workflow, so I would need to be able to generate the hashed
> values to throw at the api.  Does that make sense?

You could definitely generate the encoded values on your own if you wanted.  
Strictly speaking they are encrypted rather than hashed: we use the Blowfish 
implementation in pycrypto, with the 'id_secret' in your universe_wsgi.ini as 
the key.  Given an object_id and the id_secret from Galaxy, you should be able 
to do something like (code directly from lib/galaxy/web/security/__init__.py):

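from Crypto.Cipher import Blowfish

# pycrypto uses ECB mode when no mode is given
cipher = Blowfish.new( id_secret_from_galaxy )
str_id = str(object_id)
# Left-pad with "!" to a multiple of 8 bytes (the Blowfish block size)
padded_id = ( "!" * ( 8 - len(str_id) % 8 ) ) + str_id
# Hex-encode the ciphertext (Python 2 str.encode('hex'))
encoded_id = cipher.encrypt(padded_id).encode('hex')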
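
The padding step is the easiest part to get wrong, so here is a small 
standalone sketch of just that step, with no cipher involved.  The names 
pad_id, unpad_id, and BLOCK_SIZE are made up for this example, and the reverse 
step is only an illustration of how the padding could be undone after 
decryption.  Note that an id whose length is already a multiple of 8 still 
gets a full block of "!" prepended.

```python
BLOCK_SIZE = 8  # Blowfish operates on 8-byte blocks


def pad_id(object_id):
    """Left-pad the decimal id with '!' to a multiple of BLOCK_SIZE."""
    s = str(object_id)
    return "!" * (BLOCK_SIZE - len(s) % BLOCK_SIZE) + s


def unpad_id(padded):
    """Strip the leading '!' padding and return the integer id."""
    return int(padded.lstrip("!"))


short = pad_id(42)        # six '!' followed by '42'
full = pad_id(12345678)   # eight '!' followed by the eight digits
```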

Ideally, however, this would all be done through the API rather than by 
reaching into the database directly.  Dataset-level operations for pushing 
files into Galaxy and listing them (and retrieving lddas for use in things 
like the workflows API) are supported for datasets in data libraries, but not 
yet for datasets in individual histories.  I'd imagine this should be 
forthcoming soon.  In the meantime, you might want to consider using a data 
library, at least as an initial import destination from which you can do 
further work.  The example_watch_folder.py script in scripts/api has a more 
comprehensive example of programmatic execution on many datasets at once, as 
well as importing those files from the filesystem into Galaxy.  You should 
also be able to use the same approach I used there for finding or creating a 
data library to grab a workflow by name, instead of having to figure out the 
id ahead of time.
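
The lookup-by-name idea boils down to scanning a listing for a matching name. 
A minimal sketch, assuming the listing is a list of dictionaries with 'name' 
and 'id' keys (the helper name and the sample data are hypothetical, and the 
real API response may carry other fields):

```python
def find_by_name(items, name):
    """Return the 'id' of the first item whose 'name' matches, else None."""
    for item in items:
        if item.get("name") == name:
            return item.get("id")
    return None


# Made-up listing standing in for what an API call might return.
workflows = [
    {"id": "aaa111", "name": "QC pipeline"},
    {"id": "bbb222", "name": "Mapping"},
]
workflow_id = find_by_name(workflows, "Mapping")
```

The same helper works for data libraries: if it returns None, that is the 
point at which you would create the library instead.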

> Finally, can I generate an api key programatically as well?  Not the
> end of the world, but it would be nice.

No, though I suppose you could hack something together if you wanted, since you 
do have direct access to the database and don't seem to be opposed to poking 
around in there.  All you'd need is a user's id and whatever you want the key 
to be; toss that in the api_keys table, making sure that the user doesn't 
already have one set.
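
A minimal sketch of that check-then-insert, run against an in-memory SQLite 
table so it is self-contained.  The column names (user_id, key), the helper 
name set_api_key, and the use of SQLite are all assumptions for illustration; 
look at the real api_keys table in your Galaxy database before adapting it.

```python
import secrets
import sqlite3


def set_api_key(conn, user_id, key=None):
    """Insert a key for user_id unless one exists.

    Returns the stored key, or None if the user already had one.
    """
    cur = conn.execute("SELECT 1 FROM api_keys WHERE user_id = ?", (user_id,))
    if cur.fetchone() is not None:
        return None  # user already has a key; leave it alone
    if key is None:
        key = secrets.token_hex(16)  # random 32-character hex string
    conn.execute(
        'INSERT INTO api_keys (user_id, "key") VALUES (?, ?)', (user_id, key)
    )
    conn.commit()
    return key


# Stand-in table with an assumed schema, not Galaxy's actual one.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE api_keys (id INTEGER PRIMARY KEY, user_id INTEGER, "key" TEXT)'
)
first = set_api_key(conn, 1, "my-chosen-key")
second = set_api_key(conn, 1, "another-key")  # skipped: user 1 has a key
```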

Hope this helps, thanks for exploring all this new ground!

Dannon

