Forgot to cc galaxy-dev on this - it is too riveting not to post :).

-John

---------- Forwarded message ----------
From: John Chilton <chil0...@umn.edu>
Date: Thu, Nov 14, 2013 at 10:00 AM
Subject: Re: [galaxy-dev] Managing Data Locality
To: "Paniagua, Eric" <epani...@cshl.edu>


Hey Eric,

  Sorry for the delayed response. I have pushed some updates to
galaxy-central and the LWR to close some loops and fill out the
documentation based on your comments.

  I worry my last e-mail didn't make it clear that what you want to do
is very ... ambitious. I didn't mean to make it sound like this was a
solved problem, just that it was a problem people were working on
various parts of. The additional wrinkle that you would like to run
these jobs as the actual user is another significant hurdle. All of
that said, you are certainly not alone or unreasonable in wanting this
functionality - many large computing centers have very similar use
cases and have made varying degrees of progress, including my former
employer, the Minnesota Supercomputing Institute - but I doubt anyone
is currently using the LWR in this capacity at such centers (it is
still mostly used for submitting jobs to Windows servers).

On Fri, Nov 8, 2013 at 4:04 PM, Paniagua, Eric <epani...@cshl.edu> wrote:
> Hi John,
>
> I have now read the top-level documentation for LWR, and gone through the 
> sample configurations.  I would appreciate if you would answer a few 
> technical questions for me.
>
> 1) How exactly is the "staging_directory" in "server.ini.sample" used?  Is 
> that intended to be the (final) location at which to put files on the remote 
> server?  How is the relative path structure under 
> $GALAXY_ROOT/databases/files handled?

Depending on the configuration, either the LWR client or the LWR will
copy/transfer files out of $GALAXY_ROOT/database/files into
${staging_directory}/${job_identifier}. In your case
$GALAXY_ROOT/database/files will not be mounted on the large compute
cluster, but staging_directory should be. Here, job_identifier can be
either the Galaxy job id or a UUID if you want to allow multiple
Galaxy instances to submit to the same LWR (see the assign_uuid option
in server.ini.sample).
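Concretely, with the option names from server.ini.sample and a made-up
/scratch path (use whatever is actually mounted on both the web server
and the cluster, and double check the exact value syntax against the
sample):

  # server.ini (excerpt)
  # Files get copied/transferred into ${staging_directory}/${job_identifier}
  staging_directory = /scratch/lwr_staging
  # Key staging directories on UUIDs instead of Galaxy job ids if more
  # than one Galaxy instance will submit to this LWR.
  #assign_uuid = True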

>
> 2) What exactly does "persistence_directory" in "server.ini.sample" mean?  
> Where should it be located, how will it be used?

The LWR doesn't talk to a "real" database; it just uses a directory to
store various internal mappings that should persist beyond an LWR
restart. You shouldn't need to modify this unless
$LWR_ROOT/persisted_data is not writable by the LWR user (in your case
the LWR user should likely be the Galaxy user). I have filled out the
documentation in server.ini.sample to reflect this.
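In config terms that is just something like the following (the path is
only an example; the default lives under $LWR_ROOT):

  # server.ini (excerpt)
  # Internal LWR state that must survive restarts; must be writable by
  # the user the LWR runs as (the Galaxy user in your case).
  persistence_directory = /home/galaxy/lwr/persisted_data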

>
> 3) What exactly does "file_cache_dir" in "server.ini.sample" mean?

It is an experimental feature - down the road it may help you cache
large files on a file system available to the whole cluster so they
only need to be transferred out of $GALAXY_ROOT/database/files once.
This option is not used unless it is specified, however, so I would
try to get things working in a simpler (though still very complicated
:) ) configuration first. I have filled out the documentation in
server.ini.sample to reflect this.
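If you do want it later, it is just another server.ini setting - again
the path is only an example and should sit on storage visible to the
whole cluster:

  # server.ini (excerpt)
  # Experimental: cache files transferred out of $GALAXY_ROOT/database/files
  # on cluster-wide storage so repeat transfers are avoided. Unused unless set.
  file_cache_dir = /scratch/lwr_file_cache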

>
> 4) Does LWR preserve some relative path (e.g. to GALAXY_ROOT) under the above 
> directories?

No.

>
> 5) Are files renamed when cached?  If so, are they eventually restored to 
> their original names?

They are put in new directory structures (e.g.
${staging_directory}/${job_id}/{inputs,outputs}), but I believe they
should keep the same names. They do eventually get plopped back into
$GALAXY_ROOT/database/files. I have never used the LWR with an object
store though - so it is possible none of this will work there.
Hopefully, if a fix is needed it will be an easy one.
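So for a single job the staging area ends up looking roughly like this
(file names invented, exact layout may vary a bit):

  /scratch/lwr_staging/
    1234/                      # Galaxy job id, or a UUID with assign_uuid
      inputs/dataset_17.dat    # copied in before the job runs
      outputs/dataset_18.dat   # copied back to $GALAXY_ROOT/database/files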

>
> 6) Is it possible to customize the DRMAA and/or qsub requests made by LWR, 
> for example to include additional settings such as Project or a memory limit? 
>  Is it possible to customize this on a case by case basis, rather than 
> globally?

Yes, there are lots of possibilities here (probably too many) and
obviously none of them particularly well documented. There is an LWR
way of doing this, but I think the best thing to do is going to be
piggybacking on job_conf.xml (the Galaxy way). You will want to review
the documentation for how to set up job_conf.xml and point Galaxy to
it, but once you set up an LWR destination you can specify a native
specification to pass along to the LWR by adding the following tag:

  <param id="submit_native_specification">-P bignodes -R y -pe threads 8</param>

This will only work if you are targeting a queued_drmaa or
queued_external_drmaa job manager on the LWR side...
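Putting that together, the destination in job_conf.xml would look
something like this (the id, URL, and DRM flags are placeholders):

  <destination id="lwr_cluster" runner="lwr">
    <param id="url">http://localhost:8913/</param>
    <!-- Forwarded to the queued_drmaa/queued_external_drmaa manager. -->
    <param id="submit_native_specification">-P bignodes -R y -pe threads 8</param>
  </destination>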


>
> 7) Are there any options for the "queued_drmaa" manager in 
> "job_managers.ini.sample" which are not listed in that file?

Yes, in particular native_specification can be specified here just
like in Galaxy. I have updated the sample.

However, given your setup (run as the real user) you will want the
manager type 'queued_external_drmaa', which runs DRMAA communication
in a separate process that can be run as a different user. This has
some additional options: production, chown_working_directory_script,
drmaa_kill_script, and drmaa_launch_script. The defaults for all of
these should just work, but you can modify them if you want. You will
need to add the following rules, or some variant of them, to your
sudoers file on the LWR server.

galaxy  ALL = (root) NOPASSWD: SETENV: /home/galaxy/lwr/scripts/drmaa_external_runner.sh
galaxy  ALL = (root) NOPASSWD: SETENV: /home/galaxy/lwr/scripts/drmaa_external_killer.py
galaxy  ALL = (root) NOPASSWD: SETENV: /usr/bin/chown

This last rule can be significantly restricted, since all chowns will
be of the form "chown -R '{user}'
'/path/to/staging_directory/job_id'". I still need to work on
documentation for all of this though.
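For example, something along these lines - the path is the
hypothetical staging directory from the sketch above, and keep in mind
that sudoers wildcards match very liberally (including spaces), so
treat this as a rough tightening rather than an airtight one:

galaxy  ALL = (root) NOPASSWD: SETENV: /usr/bin/chown -R * /scratch/lwr_staging/*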

If using the external version of the drmaa runner, you will need to
add another param element to the destination to pass along the user,
specifically:

<param id="submit_user">$__user_name__</param>

>
> 8) What exactly are the differences between the "queued_drmaa" manager and 
> the "queued_cli" manager?  Are there any options for the latter which are not 
> in the "job_managers.ini.sample" file?

queued_drmaa/queued_external_drmaa uses the DRMAA API to communicate
with the DRM; queued_cli uses the qsub and qstat commands (which feels
less clean, but is probably just as good). There is no
queued_external_cli, but it would probably be easy to add.

>
> 9) When I attempt to run LWR (not having completed all the mentioned 
> preparation steps, namely without setting DRMAA_LIBRARY_PATH), I get a Seg 
> fault.  Is this because it can't find DRMAA or is it potentially unrelated?  
> In the latter case, here's the error being output to the console:
>
> ./run.sh: line 65: 26277 Segmentation fault      paster serve server.ini "$@"

Can you try to simplify the configuration? For instance, do not
specify a job_managers file; that would narrow it down to a
DRMAA_LIBRARY_PATH problem. Also, sometimes you need to update
LD_LIBRARY_PATH to place the underlying (PBS?) shared library on it.
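For example, something like this before starting the server (library
paths are placeholders for wherever your scheduler's DRMAA build
actually lives):

  # example paths only - adjust for your DRM installation
  export DRMAA_LIBRARY_PATH=/opt/pbs/lib/libdrmaa.so
  export LD_LIBRARY_PATH=/opt/pbs/lib:$LD_LIBRARY_PATH
  ./run.sh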

>
> Lastly, a simple comment, hopefully helpful.  It would be nice if the LWR 
> install docs at least mentioned the dependency of PyOpenSSL 0.13 (or later) 
> on OpenSSL 0.9.8f (or later), maybe even with a comment that "pip" will 
> listen to the environment variables CFLAGS and LDFLAGS in the event one is 
> creating a local installation of the OpenSSL library for LWR to use.

I have tried to update the documentation to reflect this, though in my
defense it does mention pyopenssl in a couple of places :). If you
have more specific recommendations, please send me a patch for the
README.rst file or issue a pull request and I will consider additional
changes.

>
> Thank you for your time and assistance.
>
> Best,
> Eric
> ________________________________________
> From: jmchil...@gmail.com [jmchil...@gmail.com] on behalf of John Chilton 
> [chil...@msi.umn.edu]
> Sent: Tuesday, November 05, 2013 11:58 AM
> To: Paniagua, Eric
> Cc: Galaxy Dev [galaxy-...@bx.psu.edu]
> Subject: Re: [galaxy-dev] Managing Data Locality
>
> Hey Eric,
>
> I think what you are proposing would be a major development effort and
> mirrors major development efforts already ongoing. There are sort of
> ways to do this already, with various trade-offs, and none of them
> particularly well documented. So before undertaking this effort I
> would dig into some alternatives.
>
> If you are using PBS, the PBS runner contains some logic for
> delegating to PBS for doing this kind of thing - I have never tried
> it.
>
> https://bitbucket.org/galaxy/galaxy-central/src/default/lib/galaxy/jobs/runners/pbs.py#cl-245
>
> It may be possible to use a specially configured handler and the
> Galaxy object store to stage files to a particular mount before
> running jobs - not sure it makes sense in this case. It might be worth
> looking into this (having the object store stage your files, instead
> of solving it at the job runner level).
>
> My recommendation however would be to investigate the LWR job runner.
> There are a bunch of fairly recent developments to enable something
> like what you are describing. For specificity, let's say you are using
> DRMAA to talk to some HPC cluster and Galaxy's file data is stored in
> /galaxy/data on the galaxy web server but not on the HPC and there is
> some scratch space (/scratch) that is mounted on both the Galaxy web
> server and your HPC cluster.
>
> I would stand up an LWR (http://lwr.readthedocs.org/en/latest/) server
> right beside Galaxy on your web server. The LWR has a concept of
> managers that sort of mirrors the concept of runners in Galaxy - see
> the sample config for guidance on how to get it to talk with your
> cluster. It could use DRMAA, Torque command-line tools, or Condor at
> this time (I could add new methods, e.g. a PBS library, if that would
> help).
> https://bitbucket.org/jmchilton/lwr/src/default/job_managers.ini.sample?at=default
>
> On the Galaxy side, I would then create a job_conf.xml file telling
> certain HPC tools to be sent to the LWR. Be sure to enable the LWR
> runner at the top (see advanced example config) and then add at least
> one LWR destination.
>
>  <destinations>
>     ....
>     <destination id="lwr" runner="lwr">
>       <param id="url">http://localhost:8913/</param>
>       <!-- Leave Galaxy directory and data indices alone, assumes they
> are mounted in both places. -->
>       <param id="default_file_action">none</param>
>       <!-- Do stage everything in /galaxy/data though -->
>       <param id="file_action_config">file_actions.json</param>
>     </destination>
>  </destinations>
>
> Then create a file_actions.json file in the Galaxy root directory
> (structure of this file is subject to change, current json layout
> doesn't feel very Galaxy-ish).
>
> {"paths": [
> {"path": "/galaxy/data", "action": "copy"}
> ] }
>
> More details on the structure of this file_actions.json file can be
> found in the following changeset:
> https://bitbucket.org/galaxy/galaxy-central/commits/b0b83be30136e2939a4a4f5d80dda8f8c853c0a2
>
> I am really eager to see the LWR gain adoption and tackle tricky cases
> like this, so if there is anything I can do to help, please let me
> know. Contributions in terms of development or documentation would be
> greatly appreciated as well.
>
> Hope this helps,
> -John
>
> On Tue, Nov 5, 2013 at 8:23 AM, Paniagua, Eric <epani...@cshl.edu> wrote:
>> Dear Galaxy Developers,
>>
>> I administer a Galaxy instance at Cold Spring Harbor Laboratory, which
>> serves around 200 laboratory members.  While our initial hardware purchase
>> has scaled well for the last 3 years, we are finding that we can't quite
>> keep up with the rising demand for compute-intensive jobs, such as mapping.
>> We are hesitant to consider buying more hardware to support the load, since
>> we can't expect that solution to scale.
>>
>> Rather, we are attempting to set up Galaxy to queue jobs (especially 
>> mappers) out to the lab's HPCC to accommodate the increasing load.  While 
>> there is a good number of technical challenges involved in this strategy, I 
>> am only writing to ask about one: data locality.
>>
>> Normally, all Galaxy datasets are stored directly on the private server
>> hosting our Galaxy instance.  The HPCC cannot mount our Galaxy server's
>> storage (i.e., for the purpose of running jobs reading/writing datasets) for
>> security reasons.  However, we can mount a small portion of the HPCC file
>> system to our Galaxy server.  Storage on the HPCC is at a premium, so we
>> can't afford to just let newly created (or copied) datasets sit there.
>> It follows that we need a mechanism for maintaining temporary storage in the
>> (restricted) HPCC space which allows for transfer of input datasets to the
>> HPCC (so they will be visible to jobs running there) and transfer of output
>> datasets back to persistent storage on our server.
>>
>> I am in the process of analyzing when/where/how exact path names are 
>> substituted into tool command lines, looking for potential hooks to 
>> facilitate the staging/unstaging of data before/after job execution on the 
>> HPCC.  I have found a few places where I might try to insert logic for 
>> handling this case.
>>
>> Before modifying too much of Galaxy's core code, I would like to know if 
>> there is a recommended method for handling this situation and whether other 
>> members of the Galaxy community have implemented fixes or workarounds for 
>> this or similar data locality issues.  If you can offer either type of 
>> information, I shall be most grateful.  Of course, if the answer were that 
>> there were no recommended or known technique, then that would be valuable 
>> information too.
>>
>> Thank you in advance,
>> Eric Paniagua
>>

___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/
