Re: [galaxy-dev] Text file busy and conda

2018-05-16 Thread John Letaw
Hi Marius,

Yes, you were right, I was using the conda_auto_install option.  Since I first 
brought this up, we have been able to get most of our problems solved, and I 
appreciate your help.

I am again seeing the bad interpreter/text file busy message for a perl script 
I am trying to run.  I have tried messing with all the different combinations 
of use_cached_dependencies, copy_conda_dependencies, conda_auto_install, etc.  
Any other ideas?

Thanks,
John

From: Marius van den Beek <m.vandenb...@gmail.com>
Date: Tuesday, January 30, 2018 at 12:16 AM
To: John Letaw <le...@ohsu.edu>
Cc: galaxy-dev <galaxy-dev@lists.galaxyproject.org>
Subject: Re: [galaxy-dev] Text file busy and conda

Hi John,

When you say tools autoinstalled via conda, do you mean that the dependencies
were automatically installed during the repository installation process,
or that you have `conda_auto_install = True` in your galaxy.ini?

The latter does not work reliably while the dependencies are still being 
installed.
To avoid this, make sure that the dependency is correctly installed before 
running the tool
(the dependency in the admin panel -> Manage tool dependencies is green and no 
conda process is still active).

Best,
Marius
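
For context on the error itself: exit code 126 with "Text file busy" is how the shell reports ETXTBSY, which the kernel returns when it is asked to execute a file that is still open for writing (as it would be while conda is still writing the environment). Below is a minimal Python sketch of retrying until the binary is no longer busy. The function name `run_when_not_busy` and its defaults are made up for illustration; this is not what Galaxy does internally.

```python
import errno
import subprocess
import time


def run_when_not_busy(argv, attempts=5, delay=1.0):
    """Run argv, retrying while the kernel reports ETXTBSY ("Text file busy")."""
    for attempt in range(attempts):
        try:
            return subprocess.run(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        except OSError as exc:
            # ETXTBSY means the executable is still open for writing somewhere.
            # Any other error, or running out of attempts, is re-raised as-is.
            if exc.errno != errno.ETXTBSY or attempt == attempts - 1:
                raise
            time.sleep(delay)
```

A retry like this only hides the symptom. Waiting until the dependency shows as installed, as described above, addresses the cause.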

On 30 January 2018 at 03:15, John Letaw <le...@ohsu.edu<mailto:le...@ohsu.edu>> 
wrote:
Hi all,

I’m receiving the following error trying to access tools autoinstalled via 
conda:

Fatal error: Exit code 126 ()
~/galaxy/database/jobs_directory/000/38/tool_script.sh: line 41: 
~/galaxy/database/dependencies/_conda/envs/mulled-v1-bb83f93efd111e042823e53ddfa74c32d81ba74cceca9445dfddfc9e940ff738/bin/samtools:
 Text file busy

So, something is happening too fast I guess, meaning a process is attempting 
access while this is being created.  Any ideas on how I can diagnose this?  Not 
seeing any other errors floating around the logs.  Maybe we are dealing with a 
lustre file locking issue?

Thanks,
John

___
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  https://lists.galaxyproject.org/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/


[galaxy-dev] API Access and User Impersonation

2018-03-12 Thread John Letaw
Hi all,

We have a system in place that sends jobs to a compute cluster based on the 
real user name, as opposed to something like ‘galaxyuser’.  Galaxy workflows 
are created and invoked using bioblend code, so the users don’t have to go in 
and manually set workflow inputs.  However, this means each user needs admin 
access in order to create these workflows via API.

Additionally, we have one or two actual admin users that are charged with 
fixing the occasional workflow problem that pops up.  The ability to 
impersonate users is super helpful in this situation, as you might imagine.  
So, I’m stuck in this situation where I’d rather not have MOST users with 
impersonate access.  I don’t know of any way to do this, do you?  With our 
setup, can anyone recommend an alternate configuration that would close this 
security hole?  How hard is it to feed a list of email addresses to the user 
impersonation config variable?

Thanks,
John
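
To make the question concrete, here is a minimal Python sketch of what a per-user impersonation allowlist could look like. It is purely hypothetical: the `impersonators` value stands in for a setting that, as far as this thread shows, Galaxy does not provide, and the function names are invented for illustration. The comma-separated format is only an assumption about how such a list might be written.

```python
def parse_email_list(raw):
    """Split a comma-separated config value into a set of lower-cased emails."""
    return {item.strip().lower() for item in raw.split(",") if item.strip()}


def may_impersonate(user_email, admin_users, impersonators):
    """Allow impersonation only for admins who are also on the allowlist."""
    email = user_email.strip().lower()
    return email in parse_email_list(admin_users) and email in parse_email_list(impersonators)
```

With this shape, every API user could stay in the admin list while only the one or two real administrators appear in the (hypothetical) impersonation list.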

[galaxy-dev] Text file busy and conda

2018-01-29 Thread John Letaw
Hi all,

I’m receiving the following error trying to access tools autoinstalled via 
conda:

Fatal error: Exit code 126 ()
~/galaxy/database/jobs_directory/000/38/tool_script.sh: line 41: 
~/galaxy/database/dependencies/_conda/envs/mulled-v1-bb83f93efd111e042823e53ddfa74c32d81ba74cceca9445dfddfc9e940ff738/bin/samtools:
 Text file busy

So, something is happening too fast I guess, meaning a process is attempting 
access while this is being created.  Any ideas on how I can diagnose this?  Not 
seeing any other errors floating around the logs.  Maybe we are dealing with a 
lustre file locking issue?

Thanks,
John
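
One way to start diagnosing this is to check whether any process on the same host still has the binary open. The sketch below is an assumption-laden illustration, not a Galaxy tool: it relies on the Linux /proc filesystem, it only sees processes on the local node (so a writer on another Lustre client would not show up), and it may miss processes owned by other users. The function name `pids_with_file_open` is invented for this example.

```python
import os


def pids_with_file_open(path):
    """Return PIDs on this host that have `path` open, using Linux /proc."""
    target = os.path.realpath(path)
    pids = []
    if not os.path.isdir("/proc"):
        return pids  # no /proc here, so there is nothing to inspect
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # skip non-process entries
        fd_dir = os.path.join("/proc", entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or its descriptors are not readable
        for fd in fds:
            try:
                if os.readlink(os.path.join(fd_dir, fd)) == target:
                    pids.append(int(entry))
                    break
            except OSError:
                continue  # descriptor closed while we were looking
    return sorted(pids)
```

If a conda process shows up in the result for the samtools path, the environment is most likely still being written.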

Re: [galaxy-dev] Cluster Install Path Errors

2018-01-29 Thread John Letaw
Hi Nate,

Thanks for the quick response, it gave me the lightbulb I needed.  I did the 
following:

On the cluster side, I custom installed python libraries based on the 
instructions in “Framework dependencies.”  This was placed in the Galaxy 
directory, but there would’ve been access to OS specific libraries.  Then, I 
created a vm-specific directory, and also installed libraries there via 
virtualenv.  In job_conf, I added a line to force use of the Galaxy directory 
venv.  Before starting Galaxy with the --no-create-venv and --skip-wheels flags, I 
set GALAXY_VIRTUAL_ENV explicitly to look in the vm-specific venv directory 
(export GALAXY_VIRTUAL_ENV="/galaxy/venv" && sh run.sh).  This did the trick.  
A job was successfully returned from the cluster and metadata properly set.  My 
only question would be, are any of these steps not necessary?

Thanks for your help!

From: Nate Coraor <n...@bx.psu.edu>
Date: Monday, January 29, 2018 at 7:33 AM
To: John Letaw <le...@ohsu.edu>
Cc: galaxy-dev <galaxy-dev@lists.galaxyproject.org>
Subject: Re: [galaxy-dev] Cluster Install Path Errors

Hi John,

Can you verify that your virtualenv created on the cluster is the one being 
used by jobs? It should be possible to view using a test job's script file, the 
path to which is logged at run time.

Thanks,
--nate
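
A small check that could be dropped into a test job to answer this is sketched below. It reports which interpreter and prefix are in use and, optionally, whether the prefix matches the virtualenv you expected. The function name `interpreter_report` and the dictionary keys are illustrative assumptions; `VIRTUAL_ENV` is the variable a virtualenv's activate script sets, and `GALAXY_VIRTUAL_ENV` is the one mentioned elsewhere in this thread.

```python
import os
import sys


def interpreter_report(expected_venv=None):
    """Summarise which Python environment the current process is running under."""
    prefix = os.path.realpath(sys.prefix)
    report = {
        "executable": sys.executable,
        "prefix": prefix,
        "virtual_env": os.environ.get("VIRTUAL_ENV"),
        "galaxy_virtual_env": os.environ.get("GALAXY_VIRTUAL_ENV"),
    }
    if expected_venv is not None:
        # Compare resolved paths so symlinked mount points do not cause false alarms.
        report["matches_expected"] = prefix == os.path.realpath(expected_venv)
    return report


if __name__ == "__main__":
    for key, value in interpreter_report().items():
        print(key, "=", value)
```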

On Mon, Jan 29, 2018 at 1:00 AM, John Letaw 
<le...@ohsu.edu<mailto:le...@ohsu.edu>> wrote:
Hi Nate,

I tried to get this to work, but it is not getting rid of the error.  I was 
able to symlink the platform dependent lib-dynload libraries to a recognized 
directory, which removed the error telling me _hashlib was not found.  Now, I’m 
just left with:

galaxy.jobs.output_checker DEBUG 2018-01-28 21:50:36,883 Tool produced standard 
error failing job - [Could not find platform dependent libraries <exec_prefix>
Consider setting $PYTHONHOME to <prefix>[:<exec_prefix>]
]

Maybe, as Cao suggested, I should just be installing a common version in the 
shared directory between the two filesystems.  The job is correctly being 
scheduled and completes, metadata is just not written.  I’m certainly open to 
any additional ideas.

Thanks,
John

From: Nate Coraor <n...@bx.psu.edu<mailto:n...@bx.psu.edu>>
Date: Thursday, January 18, 2018 at 1:12 PM
To: Cao Tang <charlietan...@gmail.com<mailto:charlietan...@gmail.com>>
Cc: John Letaw <le...@ohsu.edu<mailto:le...@ohsu.edu>>, galaxy-dev 
<galaxy-dev@lists.galaxyproject.org<mailto:galaxy-dev@lists.galaxyproject.org>>
Subject: Re: [galaxy-dev] Cluster Install Path Errors

I think this error most commonly occurs when a virtualenv is used with a 
different python than the one it was created with. Is the cluster also running 
Ubuntu 14.04? If not, you can create a separate virtualenv for running tools 
using the instructions at:

https://docs.galaxyproject.org/en/master/admin/framework_dependencies.html#managing-dependencies-manually

Once created, your tools can use it by setting 
`<env id="GALAXY_VIRTUAL_ENV">/path/to/venv</env>` on the destination in 
job_conf.xml

--nate

On Thu, Jan 18, 2018 at 1:50 PM, Cao Tang 
<charlietan...@gmail.com<mailto:charlietan...@gmail.com>> wrote:
You can try to install:

https://pypi.python.org/pypi/hashlib

On Thu, Jan 18, 2018 at 1:22 PM, John Letaw 
<le...@ohsu.edu<mailto:le...@ohsu.edu>> wrote:
Thanks for the response Peter.  I currently have this instance installed on a 
lustre fs, that is visible on an Ubuntu 14.04 vm.  So, there very well may be 
mismatches between directories on the VM, and those on the lustre cluster.  
This would mean the python installation on the VM needs to exactly match that 
which is on the cluster?  What else will need to match to ensure success?

Thanks,
John

On 1/18/18, 1:50 AM, "Peter Cock" 
<p.j.a.c...@googlemail.com<mailto:p.j.a.c...@googlemail.com>> wrote:

I *think* this is a problem with your copy of Python 2.7 and a
standard library (hashlib) normally present. Do you know how this
Python was installed? If it was compiled from source, then it may have
been missing a few dependencies, and thus you have ended up with
missing a few normally present Python modules.

Peter

On Wed, Jan 17, 2018 at 11:15 PM, John Letaw <le...@ohsu.edu> wrote:
> Hi all,
>
> I have been trying to finish up a production cluster Galaxy installation,
> and am having trouble with the below error.  In the past, when seeing
> something along these lines, I usually can adjust environmental variables
> either in startup scripts, or by including a script for Galaxy to source
> before it sends out a job.  I have tried all of these different methods, but
> I can’t seem to get rid of this error message in any tool invocation.  I
> currently have “embed_metadata_in_job” set to False in my job_conf.xml file.
> This removes a “No mod

Re: [galaxy-dev] Cluster Install Path Errors

2018-01-28 Thread John Letaw
Hi Nate,

I tried to get this to work, but it is not getting rid of the error.  I was 
able to symlink the platform dependent lib-dynload libraries to a recognized 
directory, which removed the error telling me _hashlib was not found.  Now, I’m 
just left with:

galaxy.jobs.output_checker DEBUG 2018-01-28 21:50:36,883 Tool produced standard 
error failing job - [Could not find platform dependent libraries <exec_prefix>
Consider setting $PYTHONHOME to <prefix>[:<exec_prefix>]
]

Maybe, as Cao suggested, I should just be installing a common version in the 
shared directory between the two filesystems.  The job is correctly being 
scheduled and completes, metadata is just not written.  I’m certainly open to 
any additional ideas.

Thanks,
John

From: Nate Coraor <n...@bx.psu.edu>
Date: Thursday, January 18, 2018 at 1:12 PM
To: Cao Tang <charlietan...@gmail.com>
Cc: John Letaw <le...@ohsu.edu>, galaxy-dev <galaxy-dev@lists.galaxyproject.org>
Subject: Re: [galaxy-dev] Cluster Install Path Errors

I think this error most commonly occurs when a virtualenv is used with a 
different python than the one it was created with. Is the cluster also running 
Ubuntu 14.04? If not, you can create a separate virtualenv for running tools 
using the instructions at:

https://docs.galaxyproject.org/en/master/admin/framework_dependencies.html#managing-dependencies-manually

Once created, your tools can use it by setting 
`<env id="GALAXY_VIRTUAL_ENV">/path/to/venv</env>` on the destination in 
job_conf.xml

--nate

On Thu, Jan 18, 2018 at 1:50 PM, Cao Tang 
<charlietan...@gmail.com<mailto:charlietan...@gmail.com>> wrote:
You can try to install:

https://pypi.python.org/pypi/hashlib

On Thu, Jan 18, 2018 at 1:22 PM, John Letaw 
<le...@ohsu.edu<mailto:le...@ohsu.edu>> wrote:
Thanks for the response Peter.  I currently have this instance installed on a 
lustre fs, that is visible on an Ubuntu 14.04 vm.  So, there very well may be 
mismatches between directories on the VM, and those on the lustre cluster.  
This would mean the python installation on the VM needs to exactly match that 
which is on the cluster?  What else will need to match to ensure success?

Thanks,
John

On 1/18/18, 1:50 AM, "Peter Cock" 
<p.j.a.c...@googlemail.com<mailto:p.j.a.c...@googlemail.com>> wrote:

I *think* this is a problem with your copy of Python 2.7 and a
standard library (hashlib) normally present. Do you know how this
Python was installed? If it was compiled from source, then it may have
been missing a few dependencies, and thus you have ended up with
missing a few normally present Python modules.

Peter

On Wed, Jan 17, 2018 at 11:15 PM, John Letaw <le...@ohsu.edu> wrote:
> Hi all,
>
> I have been trying to finish up a production cluster Galaxy installation,
> and am having trouble with the below error.  In the past, when seeing
> something along these lines, I usually can adjust environmental variables
> either in startup scripts, or by including a script for Galaxy to source
> before it sends out a job.  I have tried all of these different methods, but
> I can’t seem to get rid of this error message in any tool invocation.  I
> currently have “embed_metadata_in_job” set to False in my job_conf.xml file.
> This removes a “No module named galaxy_ext.metadata.set_metadata” error, but
> this hashlib error remains.  If I could understand a little more about the
> steps that are taken when sending out a job, perhaps I could better diagnose
> this?
>
> “””
> Could not find platform dependent libraries <exec_prefix>
> Consider setting $PYTHONHOME to <prefix>[:<exec_prefix>]
> Traceback (most recent call last):
>   File "~/galaxydev/galaxy/tools/data_source/upload.py", line 14, in <module>
>     import tempfile
>   File "/usr/lib64/python2.7/tempfile.py", line 35, in <module>
>     from random import Random as _Random
>   File "/usr/lib64/python2.7/random.py", line 49, in <module>
>     import hashlib as _hashlib
>   File "/usr/lib64/python2.7/hashlib.py", line 116, in <module>
>     import _hashlib
> ImportError: No module named _hashlib
> “””
>
> Thanks,
> John



Re: [galaxy-dev] Cluster Install Path Errors

2018-01-18 Thread John Letaw
Hi Nate,

Ah, I like this answer, let me see if I can make it work.  The way I currently 
have the instance set up is inside an Ubuntu 14.04 VM, that has been configured 
to accept lustre mounts.  The galaxy code is installed in a lustre directory, 
which I mainly did to avoid any issues with the cluster seeing important files. 
 I figure the installation may need to be moved to a local VM directory, with 
the Galaxy files visible in a lustre directory.  Anyways, the cluster itself is 
based on CentOS 7.  If you have any other suggestions for this type of setup, 
please let me know before I go too deep in the rabbit hole!

Cao – I can see hashlib in all of the correct places, but I will try to mess 
with the installation if the virtualenv method doesn’t help me.

Thanks so much!

From: Nate Coraor <n...@bx.psu.edu>
Date: Thursday, January 18, 2018 at 1:12 PM
To: Cao Tang <charlietan...@gmail.com>
Cc: John Letaw <le...@ohsu.edu>, galaxy-dev <galaxy-dev@lists.galaxyproject.org>
Subject: Re: [galaxy-dev] Cluster Install Path Errors

I think this error most commonly occurs when a virtualenv is used with a 
different python than the one it was created with. Is the cluster also running 
Ubuntu 14.04? If not, you can create a separate virtualenv for running tools 
using the instructions at:

https://docs.galaxyproject.org/en/master/admin/framework_dependencies.html#managing-dependencies-manually

Once created, your tools can use it by setting 
`<env id="GALAXY_VIRTUAL_ENV">/path/to/venv</env>` on the destination in 
job_conf.xml

--nate

On Thu, Jan 18, 2018 at 1:50 PM, Cao Tang 
<charlietan...@gmail.com<mailto:charlietan...@gmail.com>> wrote:
You can try to install:

https://pypi.python.org/pypi/hashlib

On Thu, Jan 18, 2018 at 1:22 PM, John Letaw 
<le...@ohsu.edu<mailto:le...@ohsu.edu>> wrote:
Thanks for the response Peter.  I currently have this instance installed on a 
lustre fs, that is visible on an Ubuntu 14.04 vm.  So, there very well may be 
mismatches between directories on the VM, and those on the lustre cluster.  
This would mean the python installation on the VM needs to exactly match that 
which is on the cluster?  What else will need to match to ensure success?

Thanks,
John

On 1/18/18, 1:50 AM, "Peter Cock" 
<p.j.a.c...@googlemail.com<mailto:p.j.a.c...@googlemail.com>> wrote:

I *think* this is a problem with your copy of Python 2.7 and a
standard library (hashlib) normally present. Do you know how this
Python was installed? If it was compiled from source, then it may have
been missing a few dependencies, and thus you have ended up with
missing a few normally present Python modules.

Peter

On Wed, Jan 17, 2018 at 11:15 PM, John Letaw <le...@ohsu.edu> wrote:
> Hi all,
>
> I have been trying to finish up a production cluster Galaxy installation,
> and am having trouble with the below error.  In the past, when seeing
> something along these lines, I usually can adjust environmental variables
> either in startup scripts, or by including a script for Galaxy to source
> before it sends out a job.  I have tried all of these different methods, but
> I can’t seem to get rid of this error message in any tool invocation.  I
> currently have “embed_metadata_in_job” set to False in my job_conf.xml file.
> This removes a “No module named galaxy_ext.metadata.set_metadata” error, but
> this hashlib error remains.  If I could understand a little more about the
> steps that are taken when sending out a job, perhaps I could better diagnose
> this?
>
> “””
> Could not find platform dependent libraries <exec_prefix>
> Consider setting $PYTHONHOME to <prefix>[:<exec_prefix>]
> Traceback (most recent call last):
>   File "~/galaxydev/galaxy/tools/data_source/upload.py", line 14, in <module>
>     import tempfile
>   File "/usr/lib64/python2.7/tempfile.py", line 35, in <module>
>     from random import Random as _Random
>   File "/usr/lib64/python2.7/random.py", line 49, in <module>
>     import hashlib as _hashlib
>   File "/usr/lib64/python2.7/hashlib.py", line 116, in <module>
>     import _hashlib
> ImportError: No module named _hashlib
> “””
>
> Thanks,
> John

Re: [galaxy-dev] Cluster Install Path Errors

2018-01-18 Thread John Letaw
Thanks for the response Peter.  I currently have this instance installed on a 
lustre fs, that is visible on an Ubuntu 14.04 vm.  So, there very well may be 
mismatches between directories on the VM, and those on the lustre cluster.  
This would mean the python installation on the VM needs to exactly match that 
which is on the cluster?  What else will need to match to ensure success?

Thanks,
John

On 1/18/18, 1:50 AM, "Peter Cock" <p.j.a.c...@googlemail.com> wrote:

I *think* this is a problem with your copy of Python 2.7 and a
standard library (hashlib) normally present. Do you know how this
Python was installed? If it was compiled from source, then it may have
been missing a few dependencies, and thus you have ended up with
missing a few normally present Python modules.

Peter

On Wed, Jan 17, 2018 at 11:15 PM, John Letaw <le...@ohsu.edu> wrote:
> Hi all,
>
> I have been trying to finish up a production cluster Galaxy installation,
> and am having trouble with the below error.  In the past, when seeing
> something along these lines, I usually can adjust environmental variables
> either in startup scripts, or by including a script for Galaxy to source
> before it sends out a job.  I have tried all of these different methods, but
> I can’t seem to get rid of this error message in any tool invocation.  I
> currently have “embed_metadata_in_job” set to False in my job_conf.xml file.
> This removes a “No module named galaxy_ext.metadata.set_metadata” error, but
> this hashlib error remains.  If I could understand a little more about the
> steps that are taken when sending out a job, perhaps I could better diagnose
> this?
>
> “””
> Could not find platform dependent libraries <exec_prefix>
> Consider setting $PYTHONHOME to <prefix>[:<exec_prefix>]
> Traceback (most recent call last):
>   File "~/galaxydev/galaxy/tools/data_source/upload.py", line 14, in <module>
>     import tempfile
>   File "/usr/lib64/python2.7/tempfile.py", line 35, in <module>
>     from random import Random as _Random
>   File "/usr/lib64/python2.7/random.py", line 49, in <module>
>     import hashlib as _hashlib
>   File "/usr/lib64/python2.7/hashlib.py", line 116, in <module>
>     import _hashlib
> ImportError: No module named _hashlib
> “””
>
> Thanks,
> John



[galaxy-dev] Cluster Install Path Errors

2018-01-17 Thread John Letaw
Hi all,

I have been trying to finish up a production cluster Galaxy installation, and 
am having trouble with the below error.  In the past, when seeing something 
along these lines, I usually can adjust environmental variables either in 
startup scripts, or by including a script for Galaxy to source before it sends 
out a job.  I have tried all of these different methods, but I can’t seem to 
get rid of this error message in any tool invocation.  I currently have 
“embed_metadata_in_job” set to False in my job_conf.xml file.  This removes a 
“No module named galaxy_ext.metadata.set_metadata” error, but this hashlib 
error remains.  If I could understand a little more about the steps that are 
taken when sending out a job, perhaps I could better diagnose this?

“””
Could not find platform dependent libraries <exec_prefix>
Consider setting $PYTHONHOME to <prefix>[:<exec_prefix>]
Traceback (most recent call last):
  File "~/galaxydev/galaxy/tools/data_source/upload.py", line 14, in <module>
    import tempfile
  File "/usr/lib64/python2.7/tempfile.py", line 35, in <module>
    from random import Random as _Random
  File "/usr/lib64/python2.7/random.py", line 49, in <module>
    import hashlib as _hashlib
  File "/usr/lib64/python2.7/hashlib.py", line 116, in <module>
    import _hashlib
ImportError: No module named _hashlib
“””

Thanks,
John
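
The traceback suggests that the interpreter found the pure-Python standard library but not its compiled extension modules (the lib-dynload directory). A quick way to confirm that on a cluster node is sketched below. This is an illustrative check, not part of Galaxy. It assumes Python 3 for `importlib.util.find_spec` (on the Python 2.7 shown in the traceback, a plain `import _hashlib` inside try/except would serve the same purpose), and the helper name `check_modules` is made up.

```python
import importlib.util
import sys
import sysconfig


def check_modules(names):
    """Map each top-level module name to True if this interpreter can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    print("prefix:     ", sys.prefix)
    print("exec_prefix:", sys.exec_prefix)
    # DESTSHARED is where lib-dynload lives on typical Unix builds (may be None).
    print("lib-dynload:", sysconfig.get_config_var("DESTSHARED"))
    print(check_modules(["_hashlib", "hashlib", "tempfile"]))
```

Running the same script on the VM and on a compute node and comparing the output should show whether the two are resolving different prefixes.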

Re: [galaxy-dev] Fwd: Job Script Integrity with GalaxyKickStart (galaxy-dev Digest, Vol 137, Issue 5)

2017-11-09 Thread John Letaw
Hi again,

Ok.  Here’s the top of our slurm.conf file.  One thing I notice when comparing 
to the configure_slurm template is that the GKS tries to set the SlurmUser to 
‘galaxyuser’, while we have it set to ‘slurm’.

ControlMachine=exahead1
#BackupController=exahead2
AuthType=auth/munge
CacheGroups=0
#CheckpointType=checkpoint/none
CryptoType=crypto/munge
DisableRootJobs=YES
#EnforcePartLimits=NO
#Epilog=
#EpilogSlurmctld=
#FirstJobId=1
#MaxJobId=99
GresTypes=gpu
GroupUpdateForce=1
GroupUpdateTime=300
#JobCheckpointDir=/var/slurm/checkpoint
#JobCredentialPrivateKey=
#JobCredentialPublicCertificate=
#JobFileAppend=0
#JobRequeue=1
#JobSubmitPlugins=1
#KillOnBadExit=0
#LaunchType=launch/slurm
#Licenses=foo*4,bar
#MailProg=/bin/mail
#MaxJobCount=5000
#MaxStepCount=4
#MaxTasksPerNode=128
MpiDefault=pmi2
#MpiParams=ports=#-#
#PluginDir=/root/sw/slurm/14.11.7/lib/slurm
#PlugStackConfig=
#PrivateData=jobs
ProctrackType=proctrack/linuxproc
#Prolog=
#PrologFlags=
#PrologSlurmctld=
#PropagatePrioProcess=0
#PropagateResourceLimits=
#PropagateResourceLimitsExcept=
RebootProgram=/sbin/reboot
ReturnToService=2
#SallocDefaultCommand=
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmctldPort=6817
SlurmdPidFile=/var/run/slurmd.pid
SlurmdPort=6818
SlurmdSpoolDir=/var/spool/slurmd
SlurmUser=slurm
#SlurmdUser=root
#SrunEpilog=
#SrunProlog=
StateSaveLocation=/var/spool/slurm
SwitchType=switch/none
#TaskEpilog=
TaskPlugin=task/none
#TaskPluginParam=
#TaskProlog=
#TopologyPlugin=topology/tree
#TmpFS=/tmp
#TrackWCKey=no
#TreeWidth=
#UnkillableStepProgram=
#UsePAM=0

Thanks!
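
For comparing a listing like the one above against the values a playbook template would write, a small parser for the uncommented lines can help. The sketch below is a simplification under stated assumptions: it handles one Key=Value per line, as in this excerpt, and would not correctly split lines that carry several pairs (such as node definitions). The function names are invented for illustration.

```python
def active_settings(conf_text):
    """Return uncommented Key=Value pairs from slurm.conf-style text."""
    settings = {}
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def differing_keys(ours, theirs):
    """Keys present in both mappings whose values differ."""
    return sorted(k for k in ours.keys() & theirs.keys() if ours[k] != theirs[k])
```

Applied to the two configurations discussed here, `differing_keys` would report `SlurmUser`, since one side uses ‘slurm’ and the other ‘galaxyuser’.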

From: galaxy-dev <galaxy-dev-boun...@lists.galaxyproject.org> on behalf of John 
Letaw <le...@ohsu.edu>
Date: Wednesday, November 8, 2017 at 11:46 AM
To: Christophe Antoniewski <droso...@gmail.com>
Cc: galaxy-dev <galaxy-dev@lists.galaxyproject.org>
Subject: Re: [galaxy-dev] Fwd: Job Script Integrity with GalaxyKickStart 
(galaxy-dev Digest, Vol 137, Issue 5)

Hi Chris,

I am changing my mind; I no longer think it is a slurm config problem.  I can 
submit jobs from this vm with the ‘galaxyuser’ user and there is no problem.  In 
the logs, I can see a line that says the script is being submitted, then another 
line that echoes the native specification.  After that, I don’t see anything 
else unless I stop the job.  If I do that, it will spit back a message saying 
it can’t find the job in the scheduler, since it never actually made it there.  
So, there must be a problem with Galaxy communicating with the scheduler.  From 
the ansible playbook code, I can see there is a step that links the slurm.conf 
and munge.key files to the galaxy path.  This is something I am currently doing 
manually, since I am not trying to configure a new cluster but instead use an 
existing one.  Maybe there is some other simple step I am overlooking that 
would cause this behavior?

Thanks,
John

From: Christophe Antoniewski <droso...@gmail.com>
Date: Wednesday, November 8, 2017 at 12:34 AM
To: John Letaw <le...@ohsu.edu>
Cc: Marius van den Beek <m.vandenb...@gmail.com>, galaxy-dev 
<galaxy-dev@lists.galaxyproject.org>
Subject: Re: [galaxy-dev] Fwd: Job Script Integrity with GalaxyKickStart 
(galaxy-dev Digest, Vol 137, Issue 5)

Hi John and Marius,

So, I am assuming I have some problem with my slurm configuration, does that 
sound accurate?

Maybe it would help to see that. I have had a couple of complicated experiences 
with slurm config, but so far only on Ubuntu 16.04 Xenial.

Best - Chris


Christophe Antoniewski

ARTbio<http://artbio.fr/> - Tel +33 1 44 27 70 05
Drosophila Genetics & Epigenetics<http://drosophile.org/> - Tel +33 1 44 27 34 
39
Mobile +33 6 68 60 51 50
https://twitter.com/ARTbio_IBPS
https://twitter.com/drosofff

2017-11-07 21:09 GMT+01:00 John Letaw <le...@ohsu.edu<mailto:le...@ohsu.edu>>:
Hi Marius.

Ok, this was pretty much how I read the code as well.  My first instinct was to 
do exactly as you suggested, and add that declaration in group_vars/all.  This 
does stop the error, but then I just get stuck with jobs that never run.  So, I 
am assuming I have some problem with my slurm configuration, does that sound 
accurate?

Thanks,
John

From: galaxy-dev 
<galaxy-dev-boun...@lists.galaxyproject.org<mailto:galaxy-dev-boun...@lists.galaxyproject.org>>
 on behalf of Marius van den Beek 
<m.vandenb...@gmail.com<mailto:m.vandenb...@gmail.com>>
Date: Tuesday, November 7, 2017 at 10:06 AM
To: Christophe Antoniewski <droso...@gmail.com<mailto:droso...@gmail.com>>
Cc: galaxy-dev 
<galaxy-dev@lists.galaxyproject.org<mailto:galaxy-dev@lists.galaxyproject.org>>
Subject: Re: [galaxy-dev] Fwd: Job Script Integrity with GalaxyKickStart 
(galaxy-dev Digest, Vol 137, Issue 5)

Hi John and Christophe,

The job script integrity check verifies that the script is ready to be 
executed,
by setting the environment variable `ABC_TEST

Re: [galaxy-dev] Fwd: Job Script Integrity with GalaxyKickStart (galaxy-dev Digest, Vol 137, Issue 5)

2017-11-08 Thread John Letaw
Hi Chris,

I am changing my mind; I no longer think it is a slurm config problem.  I can 
submit jobs from this vm with the ‘galaxyuser’ user and there is no problem.  In 
the logs, I can see a line that says the script is being submitted, then another 
line that echoes the native specification.  After that, I don’t see anything 
else unless I stop the job.  If I do that, it will spit back a message saying 
it can’t find the job in the scheduler, since it never actually made it there.  
So, there must be a problem with Galaxy communicating with the scheduler.  From 
the ansible playbook code, I can see there is a step that links the slurm.conf 
and munge.key files to the galaxy path.  This is something I am currently doing 
manually, since I am not trying to configure a new cluster but instead use an 
existing one.  Maybe there is some other simple step I am overlooking that 
would cause this behavior?

Thanks,
John

From: Christophe Antoniewski <droso...@gmail.com>
Date: Wednesday, November 8, 2017 at 12:34 AM
To: John Letaw <le...@ohsu.edu>
Cc: Marius van den Beek <m.vandenb...@gmail.com>, galaxy-dev 
<galaxy-dev@lists.galaxyproject.org>
Subject: Re: [galaxy-dev] Fwd: Job Script Integrity with GalaxyKickStart 
(galaxy-dev Digest, Vol 137, Issue 5)

Hi John and Marius,

So, I am assuming I have some problem with my slurm configuration, does that 
sound accurate?

Maybe it would help to see that. I have had a couple of complicated experiences 
with slurm config, but so far only on Ubuntu 16.04 Xenial.

Best - Chris


Christophe Antoniewski

ARTbio<http://artbio.fr/> - Tel +33 1 44 27 70 05
Drosophila Genetics & Epigenetics<http://drosophile.org/> - Tel +33 1 44 27 34 
39
Mobile +33 6 68 60 51 50
https://twitter.com/ARTbio_IBPS
https://twitter.com/drosofff

2017-11-07 21:09 GMT+01:00 John Letaw <le...@ohsu.edu<mailto:le...@ohsu.edu>>:
Hi Marius.

Ok, this was pretty much how I read the code as well.  My first instinct was to 
do exactly as you suggested, and add that declaration in group_vars/all.  This 
does stop the error, but then I just get stuck with jobs that never run.  So, I 
am assuming I have some problem with my slurm configuration, does that sound 
accurate?

Thanks,
John

From: galaxy-dev 
<galaxy-dev-boun...@lists.galaxyproject.org<mailto:galaxy-dev-boun...@lists.galaxyproject.org>>
 on behalf of Marius van den Beek 
<m.vandenb...@gmail.com<mailto:m.vandenb...@gmail.com>>
Date: Tuesday, November 7, 2017 at 10:06 AM
To: Christophe Antoniewski <droso...@gmail.com<mailto:droso...@gmail.com>>
Cc: galaxy-dev 
<galaxy-dev@lists.galaxyproject.org<mailto:galaxy-dev@lists.galaxyproject.org>>
Subject: Re: [galaxy-dev] Fwd: Job Script Integrity with GalaxyKickStart 
(galaxy-dev Digest, Vol 137, Issue 5)

Hi John and Christophe,

The job script integrity check verifies that the script is ready to be 
executed,
by setting the environment variable `ABC_TEST_JOB_SCRIPT_INTEGRITY_XYZ` to 1
and then executing the tool_script.sh script, which contains the following check:

```
# When the integrity variable is set, exit with the sentinel code instead of running the job.
if [ -n "$ABC_TEST_JOB_SCRIPT_INTEGRITY_XYZ" ]; then
    exit 42
fi
```

So if the script is ready to execute, it exits with code 42.
Over NFS it can take a few seconds for that to be the case (I guess that'd be 
true for lustre as well).
The check is run up to 35 times, with a sleep of 0.25 seconds between attempts.

Unfortunately there was a bug in galaxy that would skip the sleep,
so the job integrity check would fail frequently. We fixed this in
https://github.com/galaxyproject/galaxy/pull/4720 and this has been backported
up to galaxy release 16.07, so if you just get to the latest galaxy commit on 
your branch
it *may* work again.

Now this has been broken for a long time, and it has never worked for me
on our current cluster. Should an update to galaxy not be enough,
you can disable this check with `check_job_script_integrity = False` in the 
galaxy.ini or by adding
`-e GALAXY_CONFIG_CHECK_JOB_SCRIPT_INTEGRITY=False` if you're running kickstart 
in docker.
I have not seen any drawback of disabling the integrity check on our cluster.

Good luck,
Marius

On 7 November 2017 at 18:25, Christophe Antoniewski <droso...@gmail.com> wrote:
Hi John,

Can you also raise an issue in https://github.com/ARTbio/GalaxyKickStart/issues 
?

In order to help, I will need to know the configuration of your GalaxyKickStart 
(the variables you modified in the playbook, group_vars and inventory_files).

Did you use the cloud_setup role ? In that case Enis Afgan 
https://github.com/afgane may help.

Best regards

Chris


Christophe Antoniewski

Institut de Biologie Paris Seine<http://www.ibps.upmc.fr/en>
9, Quai St Bernard, Boîte courrier 24
75252 Paris Cedex 05
ARTbio<http://artbio.fr/> Bâtiment B, 7e étage, porte 725

Tel +33 1 44 27 70 05
Mobile +3

Re: [galaxy-dev] Fwd: Job Script Integrity with GalaxyKickStart (galaxy-dev Digest, Vol 137, Issue 5)

2017-11-07 Thread John Letaw
Hi Marius.

Ok, this was pretty much how I read the code as well.  My first instinct was to 
do exactly as you suggested, and add that declaration in group_vars/all.  This 
does stop the error, but then I just get stuck with jobs that never run.  So I 
assume I have some problem with my slurm configuration.  Does that sound 
accurate?

Thanks,
John

From: galaxy-dev <galaxy-dev-boun...@lists.galaxyproject.org> on behalf of 
Marius van den Beek <m.vandenb...@gmail.com>
Date: Tuesday, November 7, 2017 at 10:06 AM
To: Christophe Antoniewski <droso...@gmail.com>
Cc: galaxy-dev <galaxy-dev@lists.galaxyproject.org>
Subject: Re: [galaxy-dev] Fwd: Job Script Integrity with GalaxyKickStart 
(galaxy-dev Digest, Vol 137, Issue 5)

Hi John and Christophe,

The job script integrity check verifies that the script is ready to be executed
by setting the environment variable `ABC_TEST_JOB_SCRIPT_INTEGRITY_XYZ` to 1
and then executing the tool_script.sh script, which contains the following check:

```
if [ -n "$ABC_TEST_JOB_SCRIPT_INTEGRITY_XYZ" ]; then
exit 42
fi
```

So if the script is ready to execute, it exits with code 42.
Over NFS it can take a few seconds for the script to become ready (I guess 
that'd be true for lustre as well),
so the check is attempted up to 35 times with a sleep of 0.25 seconds between attempts.
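
As a rough illustration of the retry loop described above, here is a minimal 
Python sketch. It is not Galaxy's actual implementation: the function names 
`verify_script_integrity` and `write_demo_script` are made up for this example, 
and only the environment variable, the exit code 42, and the 35 x 0.25 second 
retry values come from the description above.

```python
import os
import subprocess
import tempfile
import time

INTEGRITY_ENV_VAR = "ABC_TEST_JOB_SCRIPT_INTEGRITY_XYZ"
INTEGRITY_EXIT_CODE = 42


def verify_script_integrity(path, attempts=35, sleep_seconds=0.25):
    """Return True once the script at `path` can be executed and exits with 42.

    Hypothetical helper: retries on any OSError (for example "Text file busy"
    or a file that is not yet visible on a network filesystem).
    """
    env = dict(os.environ)
    env[INTEGRITY_ENV_VAR] = "1"
    for _ in range(attempts):
        try:
            if subprocess.call([path], env=env) == INTEGRITY_EXIT_CODE:
                return True
        except OSError:
            # The script is not executable yet; fall through to the sleep.
            pass
        time.sleep(sleep_seconds)
    return False


def write_demo_script(directory):
    """Write a stand-in tool_script.sh containing the integrity check."""
    path = os.path.join(directory, "tool_script.sh")
    with open(path, "w") as handle:
        handle.write(
            "#!/bin/sh\n"
            'if [ -n "$ABC_TEST_JOB_SCRIPT_INTEGRITY_XYZ" ]; then\n'
            "    exit 42\n"
            "fi\n"
            "echo 'the real tool command would run here'\n"
        )
    os.chmod(path, 0o755)
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        script = write_demo_script(tmp)
        print("integrity verified:", verify_script_integrity(script))
```

Running the same loop by hand against a job's tool_script.sh on the shared 
filesystem is one way to see whether the script ever becomes executable from 
the Galaxy host, or whether it fails for some other reason such as permissions.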

Unfortunately there was a bug in galaxy that would skip the sleep,
so the job integrity check would fail frequently. We fixed this in
https://github.com/galaxyproject/galaxy/pull/4720 and the fix has been backported
as far back as galaxy release 16.07, so if you just update to the latest galaxy 
commit on your branch
it *may* work again.

Now this has been broken for a long time, and it has never worked for me
on our current cluster. Should an update to galaxy not be enough,
you can disable this check with `check_job_script_integrity = False` in the 
galaxy.ini or by adding
`-e GALAXY_CONFIG_CHECK_JOB_SCRIPT_INTEGRITY=False` if you're running kickstart 
in docker.
I have not seen any drawback of disabling the integrity check on our cluster.
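
To double-check that the option is really set in the config file, something 
like the following Python sketch could be used. It assumes the option lives in 
the `[app:main]` section of galaxy.ini and that the check counts as enabled 
when the option is absent; `integrity_check_enabled` is a hypothetical helper, 
not part of Galaxy.

```python
import configparser

# Example config text; a real galaxy.ini contains many more options.
GALAXY_INI_SNIPPET = """
[app:main]
check_job_script_integrity = False
"""


def integrity_check_enabled(ini_text, section="app:main"):
    """Report whether check_job_script_integrity is enabled in the ini text."""
    # Interpolation is disabled so literal '%' characters elsewhere in the
    # file do not raise errors while parsing.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    # Assumption: a missing option or section means the check is enabled.
    return parser.getboolean(section, "check_job_script_integrity", fallback=True)


if __name__ == "__main__":
    print("integrity check enabled:", integrity_check_enabled(GALAXY_INI_SNIPPET))
```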

Good luck,
Marius

On 7 November 2017 at 18:25, Christophe Antoniewski <droso...@gmail.com> wrote:
Hi John,

Can you also raise an issue in https://github.com/ARTbio/GalaxyKickStart/issues 
?

In order to help, I will need to know the configuration of your GalaxyKickStart 
(the variables you modified in the playbook, group_vars and inventory_files).

Did you use the cloud_setup role ? In that case Enis Afgan 
https://github.com/afgane may help.

Best regards

Chris


Christophe Antoniewski

Institut de Biologie Paris Seine<http://www.ibps.upmc.fr/en>
9, Quai St Bernard, Boîte courrier 24
75252 Paris Cedex 05
ARTbio<http://artbio.fr/> Bâtiment B, 7e étage, porte 725

Tel +33 1 44 27 70 05
Mobile +33 6 68 60 51 50<tel:06%2068%2060%2051%2050>

Pour accéder à la Plateforme
Bâtiment B, 7e étage, Porte 
725<https://www.google.com/maps/d/u/0/edit?mid=zmZz-3Vin5D0.kjRSV6vitXE8>

https://twitter.com/ARTbio_IBPS

2017-11-07 18:00 GMT+01:00 
<galaxy-dev-requ...@lists.galaxyproject.org<mailto:galaxy-dev-requ...@lists.galaxyproject.org>>:
Send galaxy-dev mailing list submissions to

galaxy-dev@lists.galaxyproject.org<mailto:galaxy-dev@lists.galaxyproject.org>

To subscribe or unsubscribe via the World Wide Web, visit
https://lists.galaxyproject.org/listinfo/galaxy-dev
or, via email, send a message with subject or body 'help' to

galaxy-dev-requ...@lists.galaxyproject.org<mailto:galaxy-dev-requ...@lists.galaxyproject.org>

You can reach the person managing the list at

galaxy-dev-ow...@lists.galaxyproject.org<mailto:galaxy-dev-ow...@lists.galaxyproject.org>

When replying, please edit your Subject line so it is more specific
than "Re: Contents of galaxy-dev digest..."


HEY!  This is important!  If you reply to a thread in a digest, please
1. Change the subject of your response from "Galaxy-dev Digest Vol ..." to the 
original subject for the thread.
2. Strip out everything else in the digest that is not part of the thread you 
are responding to.

Why?
1. This will keep the subject meaningful.  People will have some idea from the 
subject line if they should read it or not.
2. Not doing this greatly increases the number of emails that match search 
queries, but that aren't actually informative.

Today's Topics:

   1. Job Script Integrity (John Letaw)


------

Message: 1
Date: Tue, 7 Nov 2017 03:20:49 +
From: "John Letaw" <le...@ohsu.edu>
To: "galaxy-dev@lists.galaxyproject.org" <galaxy-dev@lists.galaxyproject.org>
Subject: [galaxy-dev] Job Script Integrity
Message-ID: 
&l

Re: [galaxy-dev] Fwd: Job Script Integrity with GalaxyKickStart (galaxy-dev Digest, Vol 137, Issue 5)

2017-11-07 Thread John Letaw
Hi Chris,

Thanks for looking at this.  We have spun up an Ubuntu 14.04 VM for this 
purpose, and have a lustre filesystem mounted for persistent data.  Our SLURM 
cluster is already in place, so I have removed from the playbook most of what 
occurs in the galaxy-extras slurm task.  I believe the only things left are a 
couple of steps at the bottom of the task that deal with creating a 
job_conf.xml file.  I have also changed things like the galaxy_user_name, gid, 
uid, galaxy_loc (on lustre), etc.  The playbook does run to completion, and 
without error.  We have confirmed that the ‘galaxyuser’ user can submit jobs to 
the scheduler, though we haven’t had any luck getting this to work from the 
Galaxy instance.

I’m happy to give you any details you need, I feel like I am so close to 
getting this working…

Thanks,
John

From: galaxy-dev <galaxy-dev-boun...@lists.galaxyproject.org> on behalf of 
Christophe Antoniewski <droso...@gmail.com>
Date: Tuesday, November 7, 2017 at 9:25 AM
To: "galaxy-dev@lists.galaxyproject.org" <galaxy-dev@lists.galaxyproject.org>
Subject: [galaxy-dev] Fwd: Job Script Integrity with GalaxyKickStart 
(galaxy-dev Digest, Vol 137, Issue 5)

Hi John,

Can you also raise an issue in https://github.com/ARTbio/GalaxyKickStart/issues 
?

In order to help, I will need to know the configuration of your GalaxyKickStart 
(the variables you modified in the playbook, group_vars and inventory_files).

Did you use the cloud_setup role ? In that case Enis Afgan 
https://github.com/afgane may help.

Best regards

Chris


Christophe Antoniewski

Institut de Biologie Paris Seine<http://www.ibps.upmc.fr/en>
9, Quai St Bernard, Boîte courrier 24
75252 Paris Cedex 05
ARTbio<http://artbio.fr/> Bâtiment B, 7e étage, porte 725

Tel +33 1 44 27 70 05
Mobile +33 6 68 60 51 50<tel:06%2068%2060%2051%2050>

Pour accéder à la Plateforme
Bâtiment B, 7e étage, Porte 
725<https://www.google.com/maps/d/u/0/edit?mid=zmZz-3Vin5D0.kjRSV6vitXE8>

https://twitter.com/ARTbio_IBPS

2017-11-07 18:00 GMT+01:00 
<galaxy-dev-requ...@lists.galaxyproject.org<mailto:galaxy-dev-requ...@lists.galaxyproject.org>>:

Today's Topics:

   1. Job Script Integrity (John Letaw)


------

Message: 1
Date: Tue, 7 Nov 2017 03:20:49 +
From: "John Letaw" <le...@ohsu.edu>
To: "galaxy-dev@lists.galaxyproject.org" <galaxy-dev@lists.galaxyproject.org>
Subject: [galaxy-dev] Job Script Integrity
Message-ID: <fbf795c3-8f01-47ef-8033-f14dd8694...@ohsu.edu>
Content-Type: text/plain; charset="utf-8"

Hi all,

I’m installing via GalaxyKickStart…

I’m getting the following error:

galaxy.jobs.runners ERROR 2017-11-06 19:14:05,263 (19) Failure preparing job
Traceback (most recent call last):
  File "/home/exacloud/lustre1/galaxydev/galaxyuser/lib/galaxy/jobs/runners/__init__.py", line 175, in prepare_job
    modify_command_for_container=modify_command_for_container
  File "/home/exacloud/lustre1/galaxydev/galaxyuser/lib/galaxy/jobs/runners/__init__.py", line 209, in build_command_line
    container=container
  File "/home/exacloud/lustre1/galaxydev/galaxyuser/lib/galaxy/jobs/command_factory.py", line 84, in build_command
    externalized_commands = __externalize_commands(job_wrapper, external_command_shell, commands_builder, remote_command_params)
  File 
"/home/exacloud/lustre1/galaxydev/galax

[galaxy-dev] Job Script Integrity

2017-11-06 Thread John Letaw
Hi all,

I’m installing via GalaxyKickStart…

I’m getting the following error:

galaxy.jobs.runners ERROR 2017-11-06 19:14:05,263 (19) Failure preparing job
Traceback (most recent call last):
  File "/home/exacloud/lustre1/galaxydev/galaxyuser/lib/galaxy/jobs/runners/__init__.py", line 175, in prepare_job
    modify_command_for_container=modify_command_for_container
  File "/home/exacloud/lustre1/galaxydev/galaxyuser/lib/galaxy/jobs/runners/__init__.py", line 209, in build_command_line
    container=container
  File "/home/exacloud/lustre1/galaxydev/galaxyuser/lib/galaxy/jobs/command_factory.py", line 84, in build_command
    externalized_commands = __externalize_commands(job_wrapper, external_command_shell, commands_builder, remote_command_params)
  File "/home/exacloud/lustre1/galaxydev/galaxyuser/lib/galaxy/jobs/command_factory.py", line 143, in __externalize_commands
    write_script(local_container_script, script_contents, config)
  File "/home/exacloud/lustre1/galaxydev/galaxyuser/lib/galaxy/jobs/runners/util/job_script/__init__.py", line 112, in write_script
    _handle_script_integrity(path, config)
  File "/home/exacloud/lustre1/galaxydev/galaxyuser/lib/galaxy/jobs/runners/util/job_script/__init__.py", line 147, in _handle_script_integrity
    raise Exception("Failed to write job script, could not verify job script integrity.")
Exception: Failed to write job script, could not verify job script integrity.
galaxy.model.metadata DEBUG 2017-11-06 19:14:05,541 Cleaning up external metadata files
galaxy.model.metadata DEBUG 2017-11-06 19:14:05,576 Failed to cleanup MetadataTempFile temp files from /home/exacloud/lustre1/galaxydev/galaxyuser/database/jobs/000/19/metadata_out_HistoryDatasetAssociation_16_I8bhLX: No JSON object could be decoded

I would like to further understand what it means to not verify integrity of a 
job script.  Does this just mean there is a permissions error?  Ownership 
doesn’t match up?

Thanks,
John

___
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  https://lists.galaxyproject.org/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/