Re: [galaxy-dev] Existing efforts to convert the QIIME pipeline to Galaxy?
Pat,

That sounds great. Does one of you want to take ownership of the toolshed repository? At minimum, we should add developers to the list that can push changes.

Thanks,
JJ

On 1/28/12 9:37 AM, Gillevet Patrick wrote:

Jim et al,

Amanda has most of the scripts working now and will be putting them up on the toolshed. She will be in touch as soon as the scripts are validated a couple of times with different datasets.

cheers...
Pat

On Dec 29, 2011, at 3:02 PM, Jim Johnson wrote:

It is easiest to generate tools for galaxy when the applications or scripts can take arbitrarily named input files and generate output to given path names. Input directories and output directories are very convenient on the command line, but more of a challenge when crafting a galaxy tool. That said, many applications require a wrapper script to work within galaxy. Thank you for the consistent script_info[] help/usage syntax in the qiime scripts, which enabled me to generate a skeleton galaxy tool_config file for each qiime script.

I had some time last spring to work on integrating qiime into galaxy. Unfortunately, I haven't had any time since to work on this. I put those partial results on the Galaxy Tool Shed: http://toolshed.g2.bx.psu.edu/

There's a continuing effort at George Mason University to incorporate qiime into galaxy tools, so you may want to ask them what they need.

I started by generating galaxy tool_config files, e.g. align_seqs.xml, by using python to get the script_info[] from the qiime script:

    $ cat generate_tool_config.bash
    #!/usr/bin/env bash
    python $1 ${1%.*}.help
    cat tool_template.txt | sed s/__TOOL_BINARY__/${1}/ | python -i $1 -h ${1%.*}.log

(I'll attach tool_template.txt )

This generated skeleton tool_config .xml files that I could then edit as needed.
( http://wiki.g2.bx.psu.edu/Admin/Tools/Tool%20Config%20Syntax )

I originally was calling all qiime scripts from a tool wrapper: qiime_wrapper.py
But, if a script can be called with any input filepaths and write its results to any filepaths, and only writes to STDERR when it fails, then you could call that script directly.

When should you use a tool_wrapper or call the qiime script directly?

Many of the qiime scripts could probably be called directly, especially if they can be called with arbitrary input/output file pathnames.

The reasons for using a tool wrapper may be if input/output needs to be manipulated, moved, or renamed in order to be used by the qiime script. You'll also need a tool wrapper if the names or number of the output files can not be determined from the parameter settings.
( http://wiki.g2.bx.psu.edu/Admin/Tools/Multiple%20Output%20Files )

If your tool relies on a file extension to determine a format, you'll have to rename the input.
( Galaxy dataset pathnames will look something like: /your_galaxy_file_path/072/dataset_72931.dat )
The format/type of a dataset is stored in its metadata, so the tool_config can use that information, especially if a script can take multiple alternative input formats.

A tool_wrapper can also be used to manage the stdout or stderr from a tool. Galaxy currently interprets any output on stderr as a failure.

A couple of changes in galaxy should make some things easier than when I first attempted this:

- galaxy now accepts dataset requests with sub directories.
  ( https://bitbucket.org/galaxy/galaxy-central/issue/494/support-sub-dirs-in-extra_files_path-patch )
  That means that output HTML files with links into sub directories can be left intact, with the html copied to the output dataset and the linked files to its extra_files_path.
- if you know the pathname of an output relative to the working directory, galaxy can copy it automatically to the output dataset using the from_work_dir attribute.
  ( see example in: https://bitbucket.org/galaxy/galaxy-central/src/21b645303c02/tools/ngs_rna/tophat_wrapper.xml )

Datatypes

You may want to create new datatypes to make it easier for the user to correctly select inputs to a tool from previous outputs. For example, the qiime mapping file is a tabular file with specific requirements. I put a 'qiimemapping' datatype in lib/galaxy/datatypes/metagenomics.py and datatypes_conf.xml so an input could generate a select list containing only qiimemapping datasets rather than all tabular ones.

Generating a configfile

You can generate configfiles in the galaxy tool_config .xml file. The configfile is generated by the Cheetah interpreter just as the commandline is.
see: alpha_rarefaction.xml

The qiime_wrapper.py was patterned after the mothur_wrapper.py with some of the same wrapper params to handle run time determined output (perhaps not needed):

    --galaxy_datasets
        a comma separated list of regex:output_dataset
        the wrapper searches the working_dir and copies the file that matches the regex to the output dataset
        if the exact pathname is known, use the from_work_dir attribute
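The two wrapper jobs Jim describes (giving an input the file extension a script expects, and copying run-time-named outputs back by regex) can be sketched roughly as below. This is not the code of qiime_wrapper.py or mothur_wrapper.py; the function names `stage_input`, `parse_galaxy_datasets` and `collect_outputs` are hypothetical, and the sketch assumes POSIX paths and that neither the regex nor the dataset path contains a comma.

```python
import os
import re
import shutil


def stage_input(dataset_path, work_dir, name):
    """Symlink a Galaxy dataset (e.g. .../dataset_72931.dat) into work_dir
    under a name that carries the extension the script expects."""
    link_path = os.path.join(work_dir, name)
    os.symlink(os.path.abspath(dataset_path), link_path)
    return link_path


def parse_galaxy_datasets(spec):
    """Split 'regex:output_dataset,regex:output_dataset' into pairs.

    Each item is split on its LAST ':' so the regex itself may contain ':'.
    Assumption of this sketch: no ',' inside a regex or a path.
    """
    pairs = []
    for item in spec.split(","):
        pattern, _, dest = item.rpartition(":")
        pairs.append((re.compile(pattern), dest))
    return pairs


def collect_outputs(work_dir, spec):
    """Copy the first file under work_dir whose name matches each regex
    to the paired output dataset path. Returns {dest: source}."""
    copied = {}
    for pattern, dest in parse_galaxy_datasets(spec):
        for root, _dirs, files in os.walk(work_dir):
            match = next((f for f in sorted(files) if pattern.search(f)), None)
            if match:
                source = os.path.join(root, match)
                shutil.copyfile(source, dest)
                copied[dest] = source
                break
    return copied
```

A wrapper would call `stage_input` before running the script and `collect_outputs` afterwards. When the output pathname is known in advance, the from_work_dir attribute mentioned above avoids the need for any of this.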
Re: [galaxy-dev] How to allow anonymous users to run workflows?
Hi all,

An update on this topic now that I've implemented a more satisfactory solution on how to allow anonymous users to run workflows.

This problem arose when it became apparent that the journal we want to publish our workflow in had as an explicit requirement that users not be required to register before using the web service. Galaxy at the moment requires users to register before allowing them to use workflows, so a solution had to be found.

I've combined ingredients from two Galaxy wiki pages to resolve this issue:

http://wiki.g2.bx.psu.edu/Admin/Config/Performance/Web%20Application%20Scaling
http://wiki.g2.bx.psu.edu/Admin/Config/Apache%20Proxy

From the scaling wiki page I've copied the setup of using a single runner Galaxy instance and multiple web front ends; in my case I'm running two web front ends. One of these web front ends lets Galaxy administrate users normally. The other front end I'm proxying through Apache while using the remote user feature, as specified on the Apache Proxy page. However, rather than relying on an external authentication mechanism, I've instead configured Apache to set the client's remote address as the remote user using the following directive:

    RequestHeader set REMOTE_USER '%{REMOTE_ADDR}s'

This approach allows me to run a single normal web frontend allowing for normal registration, and another one available at a distinct subdomain using the client's IP address as an implicit stable identifier. (I'm well aware IPs are not stable for everyone, which is still the biggest caveat with this approach.) I'm hoping however this will still satisfy the editors for our publication.

My intention in sharing this setup with the mailing list is twofold: on the one hand this approach could be useful to others that have to implement the same requirements (no mandatory registration to run a workflow), while on the other hand my approach might unknowingly lead to problems down the road. Should this be the case, please let me know how best to resolve this. Otherwise I'm quite happy using this approach.

Best,
Tim

On Tue, Sep 6, 2011 at 5:00 PM, Tim te Beek tim.te.b...@nbic.nl wrote:

I completely agree it's a bit of an unfortunate requirement in this case, but I'm not averse to (minor) code changes to achieve a more polished user experience. Something like the best guess mechanism Galaxy currently employs to recognize returning anonymous users would be fine, but I don't know where to look to disable the login requirement to run workflows, or if that's at all possible.

Best regards,
Tim

On Tue, Sep 6, 2011 at 11:01 AM, Ross ross.laza...@gmail.com wrote:

Here's one bad option for dealing with a bad requirement - at least it requires no code changes...

1. create a new user called d...@where.ever.org for your Galaxy. Do not require login in universe_wsgi.ini
2. edit welcome.html - invite visitors not wanting all the benefits of an individual registration to login as user:d...@where.ever.org if they want to run workflows
3. Add an extremely blunt disclaimer - although Galaxy will likely be behaving 'correctly', unusual things will happen whenever 2 or more users using the same account are banging away at the same history - Individual registration is strongly recommended - register a free account with a throw away email address - it's only for password recovery.

Clearly registration is not mandatory - just sensible and free, so the journal's off your back. Hope that demo account isn't multitasking too often.

On Tue, Sep 6, 2011 at 6:17 PM, Tim te Beek tim.te.b...@nbic.nl wrote:

Hi all,

Was wondering how I can allow anonymous users to run workflows in my local Galaxy instance, as currently users need to be logged in to run workflows. I'd like to drop this requirement in light of the intended publication of a workflow in a journal which demands that Web services must not require mandatory registration by the user. Could any of you tell me how I can accomplish this?

I've seen the option to use an external authentication method which could be employed to artificially 'login' anonymous users for a single session, but it appears this would also disable the normal user administration mechanisms in Galaxy, so I'm not sure this would be a good fit. Any hints on how to proceed, either via this route or otherwise, would be much appreciated.

Best regards,
Tim

___
Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at:

http://lists.bx.psu.edu/
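For readers who think in Python rather than Apache configuration, the effect of the `RequestHeader set REMOTE_USER '%{REMOTE_ADDR}s'` directive in Tim's update can be illustrated with a small WSGI middleware. This is a different technique from the one Tim describes (he does it in Apache, not in Python), and it is only a sketch: the class name `RemoteAddrAsRemoteUser` and the demo application are hypothetical, and the environ key `HTTP_REMOTE_USER` is the standard WSGI spelling of a `REMOTE_USER` request header, assumed here to be what the receiving application reads.

```python
class RemoteAddrAsRemoteUser(object):
    """WSGI middleware that copies the client address into the
    remote-user header, so each address acts as an implicit identity."""

    def __init__(self, app, header="HTTP_REMOTE_USER"):
        self.app = app
        self.header = header

    def __call__(self, environ, start_response):
        addr = environ.get("REMOTE_ADDR")
        if addr:
            # Overwrite whatever the client sent: the value must come
            # from the connection, never from a client-supplied header.
            environ[self.header] = addr
        return self.app(environ, start_response)


def demo_app(environ, start_response):
    """Stand-in application that reports which user it was told about."""
    user = environ.get("HTTP_REMOTE_USER", "anonymous")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [("user=" + user).encode("utf-8")]
```

Behind a reverse proxy the application's `REMOTE_ADDR` is the proxy's own address, which is why the mapping in the original setup is done in Apache, where the real client address is still visible.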
[galaxy-dev] DRMAA error with latest update 26920e20157f
I am getting the following error with the latest galaxy-dist revision '26920e20157f' update. The Python version is 2.6.6.

{{{
galaxy.jobs.runners.drmaa ERROR 2012-01-29 21:00:28,577 Uncaught exception queueing job
Traceback (most recent call last):
  File "/projects/galaxy/galaxy-165/lib/galaxy/jobs/runners/drmaa.py", line 140, in run_next
    self.queue_job( obj )
  File "/projects/galaxy/galaxy-165/lib/galaxy/jobs/runners/drmaa.py", line 190, in queue_job
    command_line )
TypeError: not all arguments converted during string formatting
}}}

I was wondering if anyone else is experiencing this same issue. The system works fine when I roll back to revision 'b258de1e6cea'. Are there any additional configuration details required with the latest revision that I am missing?

--
Shantanu
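The traceback does not show the full statement at line 190 of drmaa.py, so the cause in Galaxy itself cannot be read off from it. As a general illustration only, Python raises this exact message when a `%`-style format string receives more arguments than it has placeholders. The helper name `format_or_report` and the example strings below are hypothetical.

```python
def format_or_report(template, args):
    """Apply %-formatting and return either the result or the error text."""
    try:
        return template % args
    except TypeError as exc:
        return "TypeError: %s" % exc


# One placeholder, two arguments: the second argument is never consumed.
print(format_or_report("qsub %s", ("job.sh", "extra")))

# Matching counts format normally.
print(format_or_report("qsub %s %s", ("job.sh", "extra")))
```

In a case like this, a template string whose placeholders were changed without updating the argument tuple (or the reverse) is one thing worth checking.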
Re: [galaxy-dev] How to delete user (and unshared files) in local instance by galaxy-admin (panel)
Greg,

Thanks. Stupid to have missed that universe setting.

Alex

From: Greg Von Kuster [mailto:g...@bx.psu.edu]
Sent: Saturday 28 January 2012 16:26
To: Bossers, Alex
CC: Hans-Rudolf Hotz; galaxy-dev@lists.bx.psu.edu
Subject: Re: [galaxy-dev] How to delete user (and unshared files) in local instance by galaxy-admin (panel)

Hello Alex,

If you set the following in your universe_wsgi.ini, you'll be able to delete / purge users.

    # Can an admin user delete user accounts?
    allow_user_deletion = True

With the above setting, from the Admin perspective, select Manage users. Your User list grid will include buttons for deleting and purging user accounts. A user account must be deleted before it can be purged. We keep the User in the database ( marked as purged ), and stuff associated with the user's private role, in case we want the ability to unpurge the user some time in the future.

Purging a deleted User deletes all of the following:

- History where user_id = User.id
- HistoryDatasetAssociation where history_id = History.id
- Dataset where HistoryDatasetAssociation.dataset_id = Dataset.id
- UserGroupAssociation where user_id == User.id
- UserRoleAssociation where user_id == User.id EXCEPT FOR THE PRIVATE ROLE
- UserAddress where user_id == User.id

Purging Histories and Datasets must be handled via the cleanup_datasets.py script.

    [ ] Email ↓            User Name      Groups  Roles  External  Last Login      Status
    [ ] d...@you.com       admin-user     0       1      no        never
    [ ] te...@bx.psu.edu   regular-user1  0       1      no        2 days ago
    [ ] te...@bx.psu.edu   regular-user2  0       1      no        2 days ago
    [ ] t...@bx.psu.edu    test           0       1      no        ~ 20 hours ago

    For 0 selected items: [Reset Password] [Delete] [Undelete] [Purge]

On Jan 28, 2012, at 9:09 AM, Bossers, Alex wrote:

Hi Hans,

have to look up what it does... :)
Yes, it's public, but users are required to sign up themselves. So all data is linked to users... but which data to whom? Of course I can hack the DB and find out, but I thought there should be a more convenient way of doing this for galaxy admins!?

Alex

From: Hans-Rudolf Hotz [h...@fmi.ch]
Sent: Saturday 28 January 2012 13:10
To: Bossers, Alex
CC: galaxy-dev@lists.bx.psu.edu
Subject: Re: [galaxy-dev] How to delete user (and unshared files) in local instance by galaxy-admin (panel)

Hi Alex

Since you are talking about your public server, I assume you don't have external authentication. Hence, have you considered turning on the allow_user_impersonation option? Nice and very efficient way of cleaning up.

Regards, Hans

On 01/27/2012 10:43 PM, Bossers, Alex wrote:

Hi All,

We are finally up-and-running again with the latest dist release. The previous version was ok but already quite old. For our local public server at wur we are now encountering the awaited disk space issues, as had to come some day... but sooner anyway... So we have been cleaning up. Used the cleanup scripts for files marked as deleted and such. But two issues remain for galaxy-admin users:

1) How to really delete a user and its non (no-longer) shared files? (from the admin panel)
2) Is there a way to get the user disk space usage in the admin panel (or using some other method)? That way we can contact that user to push cleaning up files.

Thanks
Alex

Greg Von Kuster
Galaxy Development Team
g...@bx.psu.edu
[galaxy-dev] Disk size of all users?
Reading this on the wiki:

http://wiki.g2.bx.psu.edu/Admin/Disk%20Quotas

shows that there is a record in the DB tracking each user's allocated disk space for histories. Is there a convenient way to get this info using the galaxy admin panels? That way we can track heavy users and urge them to clean up or to improve their data practice...

Thanks
Alex
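Until an admin-panel view is available, a direct database query is one possible route. The sketch below is only an illustration of that idea: the table name `galaxy_user` and the columns `email` and `disk_usage` (bytes) are assumptions about the schema and are not confirmed anywhere in this thread, and the in-memory SQLite database merely stands in for a real Galaxy database. The function names are hypothetical.

```python
import sqlite3


def heaviest_users(conn, limit=10):
    """Return (email, bytes) rows ordered by recorded disk usage, largest first.

    ASSUMPTION: a table galaxy_user(email, disk_usage) exists; check the
    actual schema of your Galaxy database before relying on this.
    """
    cur = conn.execute(
        "SELECT email, disk_usage FROM galaxy_user "
        "WHERE disk_usage IS NOT NULL "
        "ORDER BY disk_usage DESC LIMIT ?",
        (limit,),
    )
    return cur.fetchall()


def make_demo_db():
    """Build a throwaway in-memory database with made-up users."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE galaxy_user "
        "(id INTEGER PRIMARY KEY, email TEXT, disk_usage INTEGER)"
    )
    conn.executemany(
        "INSERT INTO galaxy_user (email, disk_usage) VALUES (?, ?)",
        [
            ("a@example.org", 5 * 1024 ** 3),
            ("b@example.org", None),  # no usage recorded yet
            ("c@example.org", 42 * 1024 ** 3),
        ],
    )
    return conn


if __name__ == "__main__":
    for email, used in heaviest_users(make_demo_db()):
        print("%-20s %6.1f GB" % (email, used / float(1024 ** 3)))
```

Running a read-only query like this against a copy or a read-only connection keeps the production database out of harm's way.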