Re: [galaxy-dev] Existing efforts to convert the QIIME pipeline to Galaxy?

2012-01-29 Thread Jim Johnson

Pat,

That sounds great. Does one of you want to take ownership of the toolshed 
repository?
At a minimum, we should add developers to the list of people who can push changes.

Thanks,

JJ

On 1/28/12 9:37 AM, Gillevet Patrick wrote:

Jim et al

Amanda has most of the scripts working now and will be putting them up on the 
toolshed.
She will be in touch as soon as the scripts are validated a couple of times 
with different datasets.

cheers...
Pat



On Dec 29, 2011, at 3:02 PM, Jim Johnson wrote:



It is easiest to generate tools for galaxy when the applications or scripts can 
take arbitrarily named input files and write output to given path names.
Input and output directories are very convenient on the command line, 
but more of a challenge when crafting a galaxy tool.
That said, many applications require a wrapper script to work within galaxy.
Thank you for the consistent script_info[] help/usage syntax in the qiime 
scripts, which enabled me to generate a skeleton galaxy tool_config file for 
each qiime script.

I had some time last spring to work on integrating qiime into galaxy.
Unfortunately, I haven't had any time since to work on this.
I put those partial results  on the Galaxy Tool Shed: 
http://toolshed.g2.bx.psu.edu/
There's a continuing effort at George Mason University to incorporate qiime 
into galaxy tools, so you may want to ask them what they need.


I started by generating galaxy tool_config files, e.g. align_seqs.xml,  by 
using python to get the script_info[] from the qiime script:

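$ cat generate_tool_config.bash
#!/usr/bin/env bash
# Capture the script's output in <script>.help
python $1 > ${1%.*}.help
# Substitute the script name into the template, then feed the result to an
# interactive python session started on the script; log the output
cat tool_template.txt | sed s/__TOOL_BINARY__/${1}/ | python -i $1 -h > ${1%.*}.log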

(I'll attach tool_template.txt )

This generated skeleton tool_config .xml files that I could then edit as needed.
( http://wiki.g2.bx.psu.edu/Admin/Tools/Tool%20Config%20Syntax )

I originally was calling all qiime scripts from a tool wrapper:  
qiime_wrapper.py
But, if a script can be called with any input filepaths and write its results 
to any filepaths, and only writes to STDERR when it fails, then you could call 
that script directly.


When should you use a tool_wrapper or call the qiime script directly?
  Many of the qiime scripts could probably be called directly, especially those 
that can be called with arbitrary input/output file pathnames.
  A tool wrapper is warranted when input/output needs to be 
manipulated, moved, or renamed before the qiime script can use it.
  You'll also need a tool wrapper if the names or number of the output files 
cannot be determined from the parameter settings.
  ( http://wiki.g2.bx.psu.edu/Admin/Tools/Multiple%20Output%20Files )
  If your tool relies on a file ext to determine a format, you'll have to 
rename the input.
  ( Galaxy dataset pathnames will look something like:  
/your_galaxy_file_path/072/dataset_72931.dat )
  The format/type of a dataset is stored in its metadata, so the tool_config 
can use that information, especially if a script can take multiple alternative 
input formats.
  A tool_wrapper can also be used to manage the stdout or stderr from a tool.   
Galaxy currently interprets any output on stderr as a failure.
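
A minimal sketch of such a wrapper in Python, covering two of the cases above: giving the input a file extension, and keeping warnings off stderr when the tool succeeds. The function name run_wrapped and the {input}/{output} placeholders are hypothetical; this is not the code in qiime_wrapper.py.

```python
import os
import shutil
import subprocess
import sys
import tempfile


def run_wrapped(cmd_template, dataset_path, ext, output_path):
    """Run a command on a copy of dataset_path whose name ends in ext.

    cmd_template is a list of arguments in which the placeholders
    '{input}' and '{output}' are replaced by real pathnames.
    Returns the exit code of the command.
    """
    workdir = tempfile.mkdtemp()
    try:
        # Galaxy datasets are named like dataset_72931.dat, so hand the
        # tool a copy that carries the extension it expects.
        named_input = os.path.join(workdir, "input." + ext)
        shutil.copy(dataset_path, named_input)
        cmd = [arg.replace("{input}", named_input).replace("{output}", output_path)
               for arg in cmd_template]
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        out, err = proc.communicate()
        if proc.returncode != 0:
            # Only a real failure is reported on stderr.
            sys.stderr.write(err.decode("utf-8", "replace"))
        else:
            # On success, warnings go to stdout so Galaxy does not
            # treat them as an error.
            sys.stdout.write(out.decode("utf-8", "replace"))
            sys.stdout.write(err.decode("utf-8", "replace"))
        return proc.returncode
    finally:
        shutil.rmtree(workdir, ignore_errors=True)
```

The tool_config command line would then call the wrapper instead of the script, and the wrapper would exit with the returned code.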



A couple of changes in galaxy should make some things easier than when I first 
attempted this:
  - galaxy now accepts dataset requests with sub directories.
( https://bitbucket.org/galaxy/galaxy-central/issue/494/support-sub-dirs-in-extra_files_path-patch )
That means that output HTML files with links into sub directories can be left intact, 
with the html copied to the output dataset and the linked files to its 
extra_files_path.
  - if you know the pathname of an output relative to the working directory, 
galaxy can copy it automatically to the output dataset using the from_work_dir 
attribute.
( see example in: https://bitbucket.org/galaxy/galaxy-central/src/21b645303c02/tools/ngs_rna/tophat_wrapper.xml )

Datatypes
  You may want to create new datatypes to make it easier for the user to 
correctly select inputs to a tool from previous outputs.
  For example, the qiime mapping file is a tabular file with specific 
requirements.  I put a 'qiimemapping' datatype in 
lib/galaxy/datatypes/metagenomics.py and datatypes_conf.xml
  so an input could generate a select list containing only qiimemapping 
datasets rather than all tabular ones.
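
As a rough illustration of what "specific requirements" can mean for such a datatype, here is a standalone header check in Python. It assumes the usual qiime mapping convention that the header row starts with #SampleID and ends with Description; the function name is hypothetical, and this is not the code in metagenomics.py.

```python
def looks_like_qiime_mapping(lines):
    """Return True if the first non-blank line is a plausible mapping header.

    Assumption: a qiime mapping file is tab separated, its header row
    starts with '#SampleID' and its last column is 'Description'.
    """
    for line in lines:
        if not line.strip():
            continue
        fields = line.rstrip("\r\n").split("\t")
        return (len(fields) >= 2
                and fields[0] == "#SampleID"
                and fields[-1] == "Description")
    # No non-blank line at all.
    return False


header = "#SampleID\tBarcodeSequence\tLinkerPrimerSequence\tDescription\n"
print(looks_like_qiime_mapping([header, "S1\tACGT\tGGCC\tfirst sample\n"]))
print(looks_like_qiime_mapping(["chrom\tstart\tend\n"]))
```

A check along these lines is what lets a datatype distinguish a mapping file from any other tabular dataset.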

Generating a configfile
  You can generate configfiles in the galaxy tool_config .xml file.   The 
configfile is generated by the Cheetah interpreter just as the commandline is.
  see:  alpha_rarefaction.xml

The qiime_wrapper.py was patterned after the mothur_wrapper.py   with some of 
the same wrapper params to handle run time determined output (perhaps not 
needed):
  --galaxy_datasets
 a comma separated list of regex:output_dataset pairs; the wrapper searches 
the working_dir and copies the file that matches each regex to the output dataset
 if the exact pathname is known, use the from_work_dir attribute 
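
The --galaxy_datasets behaviour described above could be sketched roughly as follows. This is an illustration under assumptions, not the code in qiime_wrapper.py: the function name is hypothetical, the regexes are assumed to contain no commas, and dataset paths are assumed to contain no colons.

```python
import os
import re
import shutil


def collect_outputs(spec, working_dir):
    """Copy outputs whose names are only known at run time.

    spec is a comma separated list of regex:output_dataset pairs.
    For each pair, the first file name in working_dir (in sorted order)
    that matches the regex is copied to output_dataset.
    Returns the list of (source, destination) copies made.
    """
    copied = []
    for pair in spec.split(","):
        # Split on the last colon so the regex itself may contain colons.
        pattern, dataset_path = pair.rsplit(":", 1)
        regex = re.compile(pattern)
        for name in sorted(os.listdir(working_dir)):
            if regex.match(name):
                source = os.path.join(working_dir, name)
                shutil.copy(source, dataset_path)
                copied.append((source, dataset_path))
                break
    return copied
```

A pair whose regex matches nothing is silently skipped here; a real wrapper would probably want to report that.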

Re: [galaxy-dev] How to allow anonymous users to run workflows?

2012-01-29 Thread Tim te Beek
Hi all,

An update on this topic now that I've implemented a more satisfactory
solution on how to allow anonymous users to run workflows. This problem
arose when it became apparent that the journal we want to publish our
workflow in explicitly requires that users not have to
register before using the web service. Galaxy at the moment requires users
to register before allowing them to use workflows, so a solution had to be
found.

I've combined ingredients from two Galaxy wiki pages to resolve this
issue:
http://wiki.g2.bx.psu.edu/Admin/Config/Performance/Web%20Application%20Scaling
http://wiki.g2.bx.psu.edu/Admin/Config/Apache%20Proxy

From the scaling wiki page I've copied the setup of using a single runner
Galaxy instance and multiple web front ends; in my case I'm running two
web front ends. One of these web front ends lets Galaxy administrate users
normally. The other front end I'm proxying through Apache while using the
remote user feature, as specified on the Apache Proxy page. However, rather
than relying on an external authentication mechanism, I've instead
configured Apache to set the client's remote address as the remote user
using the following directive:
RequestHeader set REMOTE_USER '%{REMOTE_ADDR}s'

This approach allows me to run a single normal web frontend allowing for
normal registration, and another one available at a distinct subdomain
using the client's IP address as an implicit stable identifier. (I'm well
aware IPs are not stable for everyone, which is still the biggest caveat
with this approach.) I'm hoping however this will still satisfy the editors
for our publication.

My intention in sharing this setup with the mailing list is twofold: on the
one hand this approach could be useful to others that have to implement the
same requirements (no mandatory registration to run a workflow), while on
the other hand my approach might unknowingly lead to problems down the
road. Should this be the case, please let me know how best to resolve it.
Otherwise I'm quite happy using this approach.

Best,
Tim

On Tue, Sep 6, 2011 at 5:00 PM, Tim te Beek tim.te.b...@nbic.nl wrote:

 I completely agree it's a bit of an unfortunate requirement in this
 case, but I'm not averse to (minor) code changes to achieve a more
 polished user experience. Something like the best guess mechanism
 Galaxy currently employs to recognize returning anonymous users would be
 fine, but I don't know where to look to disable the login requirement
 to run workflows, or if that's at all possible.

 Best regards,
 Tim

 On Tue, Sep 6, 2011 at 11:01 AM, Ross ross.laza...@gmail.com wrote:
  Here's one bad option for dealing with a bad requirement - at least it
  requires no code changes...
 
  1. create a new user called d...@where.ever.org for your Galaxy. Do
  not require login in universe_wsgi.ini
  2. edit welcome.html - invite visitors not wanting all the benefits of
  an individual registration to login as user:d...@where.ever.org if
  they want to run workflows
  3. Add an extremely blunt disclaimer - although Galaxy will likely be
  behaving 'correctly', unusual things will happen whenever 2 or more
  users using the same account are banging away at the same history -
  Individual registration is strongly recommended - register a free
  account with a throw away email address - it's only for password
  recovery.
  Clearly registration is not mandatory - just sensible and free, so the
  journal's off your back. Hope that demo account isn't multitasking too
  often.
 
  On Tue, Sep 6, 2011 at 6:17 PM, Tim te Beek tim.te.b...@nbic.nl wrote:
  Hi all,
 
  Was wondering how I can allow anonymous users to run workflows in my
  local Galaxy instance, as currently users need to be logged in to run
  workflows. I'd like to drop this requirement in light of the intended
  publication of a workflow in a journal which demands that Web
  services must not require mandatory registration by the user. Could
  any of you tell me how I can accomplish this?
 
  I've seen the option to use an external authentication method which
  could be employed to artificially 'login' anonymous users for a single
  session, but it appears this would also disable the normal users
  administration mechanisms in Galaxy, so I'm not sure this would be a
  good fit. Any hints on how to proceed, either via this route or
  otherwise, would be much appreciated.
 
  Best regards,
  Tim
 

___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/

[galaxy-dev] DRMAA error with latest update 26920e20157f

2012-01-29 Thread Shantanu Pavgi

I am getting the following error with the latest galaxy-dist revision 
'26920e20157f' update.  The Python version is 2.6.6. 

{{{
galaxy.jobs.runners.drmaa ERROR 2012-01-29 21:00:28,577 Uncaught exception queueing job
Traceback (most recent call last):
  File "/projects/galaxy/galaxy-165/lib/galaxy/jobs/runners/drmaa.py", line 140, in run_next
    self.queue_job( obj )
  File "/projects/galaxy/galaxy-165/lib/galaxy/jobs/runners/drmaa.py", line 190, in queue_job
    command_line )
TypeError: not all arguments converted during string formatting
}}}

I was wondering if anyone else is experiencing this same issue. The system 
works fine when I roll back to revision 'b258de1e6cea'.  Are there any 
additional configuration details required with the latest revision that I am 
missing? 
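
For reference, Python raises this TypeError when the % operator is given more values than the format string has placeholders. A minimal illustration follows; the template strings and values are made up and are not taken from drmaa.py.

```python
def render(template, values):
    """Apply %-formatting; return the result or the error message."""
    try:
        return template % values
    except TypeError as exc:
        return "TypeError: %s" % exc


# One placeholder but two values: the second value cannot be consumed.
print(render("cd %s", ("/tmp", "echo hello")))
# Placeholders and values match, so formatting succeeds.
print(render("cd %s; %s", ("/tmp", "echo hello")))
```

The first call reports the same message as the traceback above, which suggests a mismatch between a template string and the values formatted into it.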

--
Shantanu


Re: [galaxy-dev] How to delete user (and unshared files) in local instance by galaxy-admin (panel)

2012-01-29 Thread Bossers, Alex
Greg,
Thanks. Stupid of me to have missed that universe setting.
Alex


From: Greg Von Kuster [mailto:g...@bx.psu.edu]
Sent: Saturday, 28 January 2012 16:26
To: Bossers, Alex
CC: Hans-Rudolf Hotz; galaxy-dev@lists.bx.psu.edu
Subject: Re: [galaxy-dev] How to delete user (and unshared files) in local 
instance by galaxy-admin (panel)

Hello Alex,

If you set the following in your universe_wsgi.ini, you'll be able to delete / 
purge users.

# Can an admin user delete user accounts?
allow_user_deletion = True

With the above setting, from the Admin perspective, select Manage users.  
Your User list grid will include buttons for deleting and purging user 
accounts.  A user account must be deleted before it can be purged.  We keep 
the User in the database ( marked as purged ), along with anything associated with the 
user's private role, in case we want the ability to unpurge the user some time 
in the future.

Purging a deleted User deletes all of the following:
- History where user_id = User.id
   - HistoryDatasetAssociation where history_id = History.id
   - Dataset where HistoryDatasetAssociation.dataset_id = Dataset.id
- UserGroupAssociation where user_id == User.id
- UserRoleAssociation where user_id == User.id EXCEPT FOR THE PRIVATE ROLE
- UserAddress where user_id == User.id
Purging Histories and Datasets must be handled via the cleanup_datasets.py 
script.




[ ]  Email             User Name      Groups  Roles  External  Last Login      Status
[ ]  d...@you.com      admin-user     0       1      no        never
[ ]  te...@bx.psu.edu  regular-user1  0       1      no        2 days ago
[ ]  te...@bx.psu.edu  regular-user2  0       1      no        2 days ago
[ ]  t...@bx.psu.edu   test           0       1      no        ~ 20 hours ago

For 0 selected items: [Reset Password] [Delete] [Undelete] [Purge]




On Jan 28, 2012, at 9:09 AM, Bossers, Alex wrote:


Hi Hans,
I'll have to look up what it does... :)
Yes, it's public, but users are required to sign up themselves. So all data is 
linked to users... but which data belongs to whom?
Of course I can hack the DB and find out, but I thought there should be a more 
convenient way of doing this for galaxy admins!?
Alex



From: Hans-Rudolf Hotz [h...@fmi.ch]
Sent: Saturday, 28 January 2012 13:10
To: Bossers, Alex
CC: galaxy-dev@lists.bx.psu.edu
Subject: Re: [galaxy-dev] How to delete user (and unshared files) in local 
instance by galaxy-admin (panel)

Hi Alex

Since you are talking about your public server, I assume you don't have
external authentication. Hence, have you considered turning on the
allow_user_impersonation option?

Nice and very efficient way of cleaning up.


Regards, Hans


On 01/27/2012 10:43 PM, Bossers, Alex wrote:

Hi All,

We are finally up and running again with the latest dist release. The previous 
version was ok but already quite old.

For our local public server at WUR we are now encountering the expected disk 
space issues. It had to come some day, just sooner rather than later... So we have 
been cleaning up, using the cleanup scripts for files marked as deleted and 
such.

But two issues remain for galaxy-admin users:
1) How do we really delete a user and their unshared (or no longer shared) files (from the 
admin panel)?
2) Is there a way to get per-user disk space usage in the admin panel (or using 
some other method)? That way we can contact a user and push them to clean up files.

Thanks
Alex





Greg Von Kuster
Galaxy Development Team
g...@bx.psu.edu




[galaxy-dev] Disk size of all users?

2012-01-29 Thread Bossers, Alex
I read this on the wiki: http://wiki.g2.bx.psu.edu/Admin/Disk%20Quotas
It shows that there is a record in the DB tracking the disk space each user has allocated 
for histories.
Is there a convenient way to get this info using the galaxy admin panels?
That way we could track heavy users and urge them to clean up or to improve their data 
practices...

Thanks
Alex


