Re: [galaxy-dev] Using API to identify all datasets that were part of a workflow?
The latter. Starting with a dataset, pull its full history. If it was created by running a simple single-step tool, that is one step. If it was created as part of a workflow, grab that whole series of steps/inputs/outputs.

I agree that the Python/Java bindings are out of date, but even when I was scanning the JSON I wasn't able to see where I'd glean this information. The missing piece for me was always determining whether a given dataset was connected to a larger workflow.

-ben

On Thu, Jun 25, 2015 at 7:26 AM, John Chilton jmchil...@gmail.com wrote:
> Can you clarify one thing for me - are you attempting to break a workflow invocation into steps, and then jobs, and then inputs and outputs (so working from the workflow invocation), or are you trying to scan existing histories and find a workflow for each dataset (so working from the history id and workflow id maybe)?
>
> I feel like this should be doable now - though blend4j and to a lesser extent even bioblend are pretty far behind what I would consider best practices for invoking workflows via the API, so they may need to be updated.
>
> -John
>
> On Thu, Jun 25, 2015 at 10:04 AM, Ben Bimber bbim...@gmail.com wrote:
>> Hello, I'm still relatively new to Galaxy. I'm trying to use the API to identify the string of jobs/datasets that were created as part of executing a workflow. So far as I can tell, the API gives me the ID of the job, which corresponds to one step in the workflow. Each of these has inputs/outputs. I can walk outwards and try to connect any other jobs that happen to use one of these files as an input or output; however, I am not seeing any key that provides a more direct indication that a set of steps was executed as part of a given workflow. Am I missing something?
>>
>> Thanks in advance,
>> Ben

___
Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: https://lists.galaxyproject.org/ To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
[galaxy-dev] Using API to identify all datasets that were part of a workflow?
Hello, I'm still relatively new to galaxy. I'm trying to use the API to identify the string of jobs/datasets that were created as part of executing a workflow. So far as I can tell, the API gives me the ID of the job, which corresponds to one step in the workflow. Each of these has inputs/outputs. I can walk outwards and try to connect any other jobs that happen to use one of these files as an input or output; however, I am not seeing any key that provides a more direct indication that a set of steps was executed as part of a given workflow. Am I missing something? Thanks in advance, Ben
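The "walk outwards" approach described in this message can be sketched as a breadth-first search over shared dataset ids. This is only an illustration: the `jobs` dictionary layout and the function name `connected_jobs` are hypothetical and do not reflect the JSON returned by the Galaxy API, which would have to be mapped into this shape first.

```python
from collections import deque


def connected_jobs(jobs, start_dataset):
    """Return ids of all jobs linked to start_dataset through shared datasets.

    jobs maps a job id to {"inputs": [...], "outputs": [...]} of dataset ids.
    This layout is hypothetical, not the Galaxy API response format.
    """
    # Index every dataset id to the jobs that read or write it.
    by_dataset = {}
    for job_id, job in jobs.items():
        for ds in job["inputs"] + job["outputs"]:
            by_dataset.setdefault(ds, set()).add(job_id)

    seen_jobs = set()
    seen_datasets = {start_dataset}
    queue = deque([start_dataset])
    while queue:
        ds = queue.popleft()
        for job_id in by_dataset.get(ds, ()):
            if job_id in seen_jobs:
                continue
            seen_jobs.add(job_id)
            # Follow every dataset this job touched, upstream and downstream.
            for other in jobs[job_id]["inputs"] + jobs[job_id]["outputs"]:
                if other not in seen_datasets:
                    seen_datasets.add(other)
                    queue.append(other)
    return seen_jobs


jobs = {
    "j1": {"inputs": ["d1"], "outputs": ["d2"]},
    "j2": {"inputs": ["d2"], "outputs": ["d3"]},
    "j3": {"inputs": ["d9"], "outputs": ["d10"]},
}
print(sorted(connected_jobs(jobs, "d3")))  # ['j1', 'j2']
```

Note the limitation raised in the question: this finds jobs that share files, but it cannot tell whether they were launched together as one workflow or run by hand one after another.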
Re: [galaxy-dev] find UUID of current history in tool XML wrapper?
Hi Damion,

Possibly a dumb question, but I thought I'd ask: if I understand your example, you're basically end-running Galaxy by relying on the fact that dynamic options can execute code. When this is called, Galaxy passes in an object that lets you access the current history ID. In other contexts, I have seen code within configfile or similar blocks that seems to essentially be calling native Python code as well. Can a similar approach be used to accomplish what you're doing more directly? For example, the other day John Chilton posted this example for the API key:

    #from galaxy.managers import api_keys# ${api_keys.ApiKeyManager( $__app__ ).get_or_create_api_key( $__user__ )}

If I'm already using non-public APIs to get the current history ID, can it be done more directly using something analogous to the above?

Thanks,
Ben

On Tue, May 5, 2015 at 10:33 AM, Dooley, Damion damion.doo...@bccdc.ca wrote:
> About
>   1. find UUID of current history in tool XML wrapper? (Ben Bimber)
>   2. Re: find UUID of current history in tool XML wrapper? (John Chilton)
>
> I think this will work for you; I've simplified the code. I was able to do this somewhat circuitously (=bonfire of time) for my upcoming Versioned Data tool. In your tool XML definition file:
>
>     <param name="history_id" display="radio" type="drill_down" dynamic_options="vdb_init_tool_user(__trans__)" />
>     ...
>     <code file="versioned_data_form.py" />
>
> Not sure if making history_id a hidden field would work (I seem to recall the __trans__ variable is only exposed to select params). And in a script named versioned_data_form.py we have:
>
>     def vdb_init_tool_user(trans):
>         # ...
>         # ALSO: squeezing history_id in this way since no other way to pass it.
>         # trans is provided only by tool form presentation via <code file=...>
>         # ...
>         history_id = str(trans.security.encode_id(trans.history.id))
>         items = [{
>             'name': 'think of something to say here',
>             'value': history_id,
>             'options': [],
>             'selected': True,
>         }]
>         return items
Re: [galaxy-dev] find UUID of current history in tool XML wrapper?
Thanks to all of you for the comments and code help. John's example grabs the target history for the output, which is actually perfect for us. I get that 'current history' doesn't really mean anything, and the target history for the output file is a far more reliable option. I simplified the example to:

    $__tool_directory__/import_datasets_by_uuid.py -A $script_file -H ${output.creating_job.history.name} -f ${output}

which is working perfectly.

I think my use cases and the example a few days ago from Doug King all fall under the header of 'attempting to use the BioBlend API from within a tool wrapper, executed via Galaxy'. In those cases, being able to pass in some context about the current execution is required to get a reliable result. I get why it isn't supported well right now, but I would argue that sort of application isn't inherently a misuse of Galaxy.

-Ben

On Tue, May 5, 2015 at 12:14 PM, Dannon Baker dannon.ba...@gmail.com wrote:
> Galaxy's API goal is to be intentionally stateless, and methods that operate on a history accept that as a parameter and don't recognize the notion of 'current'. State like that is to be maintained in the client, whatever that is -- whether a webpage, cron job, etc. I think this approach is probably *more* flexible in that we don't box ourselves into relying on the notion of a 'current' history, though some few methods might currently rely on it.
>
> It's worth noting that this argument has conspicuous parallels to the argument against allowing tools to understand the 'current' history as well.
>
> On Tue, May 5, 2015 at 3:11 PM Dooley, Damion damion.doo...@bccdc.ca wrote:
>> P.s. I tried using BioBlend and/or the Galaxy API to get a given user's current history but found there was no way to do so.
>>
>> Seems like there is occasionally a tension between platform by design (reduced instruction set, occasionally with security in mind, or dev resource limitations) and platform for creativity (everything exposed in case someone might find use for it).
>>
>> d.
>>
>> Hsiao lab, BC Public Health Microbiology Reference Laboratory, BC Centre for Disease Control
>> 655 West 12th Avenue, Vancouver, British Columbia, V5Z 4R4 Canada
>>
>> From: Ben Bimber [bbim...@gmail.com]
>>> ... If I'm already using non-public APIs to get the current history ID, can it be done more directly using something analogous to the above?
>>>
>>> Thanks,
>>> Ben
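For readers following the command line in Ben's message, here is a minimal sketch of how a script such as import_datasets_by_uuid.py might receive those three flags. The thread does not show the script itself, so the meaning attached to each flag below (a script/config file for -A, the target history name for -H, the output path for -f) is an assumption, as are the function and attribute names.

```python
import argparse


def parse_args(argv=None):
    """Parse the -A, -H and -f flags seen in the wrapper command line.

    The interpretation of each flag is assumed for illustration only.
    """
    parser = argparse.ArgumentParser(
        description="Hypothetical front end for import_datasets_by_uuid.py"
    )
    parser.add_argument("-A", dest="script_file", required=True,
                        help="file written by the tool wrapper (assumed)")
    parser.add_argument("-H", dest="history_name", required=True,
                        help="name of the history that receives the output (assumed)")
    parser.add_argument("-f", dest="output", required=True,
                        help="path of the tool output dataset (assumed)")
    return parser.parse_args(argv)


# Passing an explicit list stands in for the real command line.
args = parse_args(["-A", "conf.txt", "-H", "My history", "-f", "out.dat"])
print(args.history_name)
```

History names can contain spaces, so in a real wrapper the substituted value would probably need to be quoted on the command line; history names are also not guaranteed to be unique, which is worth keeping in mind when looking the history up again on the script side.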
[galaxy-dev] find UUID of current history in tool XML wrapper?
Hello, I am trying to pass the UUID or name of the current history to a script as part of a tool XML wrapper. I was hoping some substitution like ${output.hid}, ${history_id} or similar would give this, but have not had luck. Is there an approach to do this? On a related note: does anyone have debugging tips or pointers to code that would help me figure out which substitutions Galaxy will support? The tool wrapper docs list some; however, the list isn't complete. Thanks in advance, Ben
[galaxy-dev] Possible to pass hostName to a tool?
Hello, I apologize for the second email as this is related to a question from two days ago. I am writing a Galaxy tool that runs code that will query the Galaxy server. Within the tool wrapper XML, is there a way to write the API URL of the current server? I don't see anything along these lines documented. Thanks in advance for any help, Ben
Re: [galaxy-dev] Authenticating Against Galaxy from within a tool?
Hi John,

Thanks. To elaborate a little more: we have a LabKey Server (web front end for a database) that manages raw files and metadata. The idea is to make a Galaxy tool where a user could do things like query for all genomes from male patients, age x-y, etc. I can use LabKey's APIs to return a list of those files. However, the information at this point is text (basically a list of filepaths). It would be nice to automatically create datasets for those files. Some of the time there's a simple 1:1 mapping between file and user-facing dataset (like images); however, for genomes we really want to make a paired collection.

BioBlend makes it relatively easy to go from filepaths to datasets; however, the authentication issue is what wasn't clear. If there's a more standard path to go from list of files -> Galaxy dataset, I'm all ears.

-Ben

On Wed, Apr 29, 2015 at 12:13 PM, John Chilton jmchil...@gmail.com wrote:
> An unofficial way of doing this, and the workaround of using configfiles, can be found in this thread: http://dev.list.galaxyproject.org/Simple-standard-for-API-use-of-a-global-user-key-that-all-loaded-tools-can-draw-upon-td4665659.html . There is a Trello card outlining platform work that should be done to support this better, but we have not made progress on that.
>
> I would be interested in your use case. One can dynamically discover datasets with various db keys and build collections composed of dynamically discovered datasets and keys - but there is no way to use the tool XML to dynamically discover a variable number of collections - but it should be supported - as should building a list of paired or lists of lists where the outer list describes the key and the inner the sample.
>
> -John
>
> On Wed, Apr 29, 2015 at 2:58 PM, Ben Bimber bbim...@gmail.com wrote:
>> Hello, I am new to Galaxy. I'm trying to write a data input tool wrapper. It will call a script that does a query, produces a list of files, and then the plan is to use the BioBlend API to create datasets/data collections in Galaxy. In other words, I am making the dataset(s) based on the contents of this file, rather than making a dataset from that file. In my case this file has a pair of columns representing FASTQ files, and I want to create one paired dataset collection in Galaxy for each genome.
>>
>> In theory using BioBlend to create datasets is easy. However, when I call this as a Galaxy tool, I have not found a clean way to pass the credentials to BioBlend. BioBlend needs to know the server URL and either the user's API key or username/password. I think if I poke around $__app__ or $__user__ there's a good chance I will find a property with the information I need; however, this has the obvious problem of writing the API key to the log. Is there another way to approach this problem?
>>
>> Thanks in advance for any help,
>> Ben
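As an illustration of the first half of that plan, the file of FASTQ pairs can be turned into one forward/reverse pair per genome before anything is sent to Galaxy. The sketch below assumes a tab-separated layout with a genome name in the first column followed by the two FASTQ paths; that layout, the function name `read_fastq_pairs`, and the example paths are all hypothetical. Uploading the files and building the paired collections through BioBlend is deliberately left out.

```python
import csv
import io


def read_fastq_pairs(handle):
    """Group rows of (name, forward, reverse) into a dict keyed by name.

    The three-column, tab-separated layout is an assumption for illustration.
    """
    pairs = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        if len(row) < 3:
            raise ValueError("expected name, forward and reverse columns: %r" % (row,))
        name, forward, reverse = row[:3]
        pairs[name] = {"forward": forward, "reverse": reverse}
    return pairs


# In-memory stand-in for the query result file.
example = io.StringIO(
    "#name\tforward\treverse\n"
    "genomeA\t/data/a_R1.fastq\t/data/a_R2.fastq\n"
    "genomeB\t/data/b_R1.fastq\t/data/b_R2.fastq\n"
)
pairs = read_fastq_pairs(example)
print(sorted(pairs))
```

Each entry in `pairs` then corresponds to one paired collection to create, which keeps the parsing step independent of how the credentials question is eventually resolved.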