Galaxy sites usually do all work on a compute cluster, with all jobs submitted
as a "galaxy" unix user, so there isn't any "fair-share" accounting between
users.

Other sysops have created a solution to run jobs as the actual unix user,
which may be feasible for an intranet site but is undesirable for a site
accessible via the internet, for security reasons.

A simpler and more secure method to enable fair-share is by using projects.

Here's a simple scenario and a straightforward solution: multiple groups in
an organization use the same Galaxy site and it is desirable to enable
fair-share accounting between the groups.  All users in a group consume the
same fair-share, which is generally acceptable.

1) Configure the scheduler with a project for each group, configure each user
to use their group's project by default, and grant the galaxy user access to
submit jobs to any project; all users should be associated with a project.
There's a good chance your grid is already configured this way.

2) Create a database which maps Galaxy user id to a project; I use a cron
job to create a standalone sqlite3 db.  Since this is site-specific, code
is not provided but hints are given below.  Rather than having a separate
database, the project could have been added to the Galaxy db, but I sought to
minimize my changes.

3) Add a snippet of code to drmaa.py's queue_job method to look up the project
from job_wrapper.user_id and append it to jt.nativeSpecification; see below.

Here are the changes required.  They're small enough that I didn't do this as
a clone/patch.

(1) lib/galaxy/jobs/runners/drmaa.py:

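# near the top of the file, with the other imports
import sqlite3

...

        # in queue_job(), right after native_spec is computed
        native_spec = self.get_native_spec( runner_url )

        # BEGIN ADD USER'S PROJ
        if self.app.config.user_proj_map_db is not None:
            try:
                conn = sqlite3.connect(self.app.config.user_proj_map_db)
                try:
                    c = conn.cursor()
                    c.execute('SELECT PROJ FROM USER_PROJ WHERE GID=?',
                              [job_wrapper.user_id])
                    row = c.fetchone()
                finally:
                    # close the connection even if the query fails
                    conn.close()
                if row is not None:
                    # guard against native_spec being None
                    native_spec = (native_spec or '') + ' -P ' + row[0]
                else:
                    log.debug("No proj mapped for user %s" % job_wrapper.user_id)
            except Exception:
                log.debug("Cannot look up proj of user %s" % job_wrapper.user_id)
        # END ADD USER'S PROJ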
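
To try the lookup outside of Galaxy, the same logic can be pulled into a small standalone function.  This is only a sketch: the function name add_project_to_native_spec is hypothetical and not part of Galaxy, and it assumes the USER_PROJ table with GID and PROJ columns used in the snippet above.

```python
import sqlite3


def add_project_to_native_spec(native_spec, db_path, galaxy_user_id):
    """Return native_spec with ' -P <project>' appended if the user is mapped.

    Hypothetical helper, not part of Galaxy.  Assumes a table
    USER_PROJ(GID, PROJ) like the one queried in drmaa.py.
    """
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute(
            "SELECT PROJ FROM USER_PROJ WHERE GID=?", (galaxy_user_id,)
        ).fetchone()
    finally:
        # always release the connection, even if the query fails
        conn.close()
    if row is None or not row[0]:
        # unmapped user: leave the native spec unchanged
        return native_spec
    # native_spec may be None when no native options were configured
    return ((native_spec or "") + " -P " + row[0]).strip()
```

An unmapped user simply gets the original spec back, so the job is still submitted, just without a project.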

(2) lib/galaxy/config.py: add support for user_proj_map_db variable

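        # leave unset as None so the check in drmaa.py skips the lookup
        self.user_proj_map_db = kwargs.get( "user_proj_map_db", None )
        if self.user_proj_map_db is not None:
            self.user_proj_map_db = resolve_path( self.user_proj_map_db, self.root )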

(3) universe_wsgi.ini:

user_proj_map_db = /some/path/to/user_proj_map_db.sqlite

(4) Here are some suggestions to help get you started on a script to make the
sqlite3 db.

a) parse ldap tree example: (to get uid:email)
ldapsearch -LLL -x -b 'ou=aliases,dc=jgi,dc=gov'

b) parse scheduler config: (to get uid:proj)
qconf -suserl | /usr/bin/xargs -I '{}' qconf -suser '{}' | egrep 'name|default_project'

c) query galaxy db: (to get gid:email)
select id, email from galaxy_user;
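
The three listings above can be joined (uid to email, uid to proj, gid to email) to fill the table that drmaa.py queries.  Here is a rough sketch of that join, under several assumptions: the function names are hypothetical, the inputs are plain dicts you have already parsed, the filtered qconf output is taken to be alternating "name" and "default_project" lines, and an unset default project is taken to be reported as NONE.  Parsing the ldapsearch output is left out since it is site-specific.

```python
import sqlite3


def parse_qconf(text):
    """Parse 'name X' / 'default_project Y' line pairs into {uid: proj}."""
    uid_to_proj = {}
    uid = None
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue
        key, value = parts
        if key == "name":
            uid = value
        elif key == "default_project" and uid is not None:
            if value != "NONE":  # assumption: unset projects appear as NONE
                uid_to_proj[uid] = value
            uid = None
    return uid_to_proj


def build_user_proj_db(db_path, uid_to_email, uid_to_proj, gid_to_email):
    """Join the three maps and rewrite USER_PROJ(GID, PROJ); return row count."""
    # uid:email + uid:proj -> email:proj (emails compared case-insensitively)
    email_to_proj = {}
    for uid, email in uid_to_email.items():
        if uid in uid_to_proj:
            email_to_proj[email.lower()] = uid_to_proj[uid]
    # gid:email + email:proj -> (gid, proj) rows
    rows = [(gid, email_to_proj[email.lower()])
            for gid, email in gid_to_email.items()
            if email.lower() in email_to_proj]
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commit on success, roll back on error
            conn.execute("CREATE TABLE IF NOT EXISTS USER_PROJ "
                         "(GID INTEGER PRIMARY KEY, PROJ TEXT NOT NULL)")
            conn.execute("DELETE FROM USER_PROJ")
            conn.executemany(
                "INSERT INTO USER_PROJ (GID, PROJ) VALUES (?, ?)", rows)
    finally:
        conn.close()
    return len(rows)
```

When run from cron, one option is to write to a temporary file and rename it over the live path, so Galaxy never opens a half-written database.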

The limitation of this method is that all jobs submitted by a user will
always be charged to the same project (which may be okay, depending on how
your organization uses projects).  However, a user may have access to
several projects and may wish to associate some jobs with a particular
project.  This could be accomplished by adding an option to the user
preferences; a user would choose a project from their available projects, and
any jobs submitted would have to record the currently chosen project.
Alternatively, histories could be associated with a particular project.
This solution would require significant changes to Galaxy, so I haven't
implemented it (and the simple solution works well enough for me).

Edward Kirton
US DOE JGI