Re: [galaxy-dev] Passing the number of mpi processors to SGE from xml file

2011-09-13 Thread Glen Beane
I think with the current state of Galaxy the number of processors would need to 
be fixed: you would have to hard-code it in the XML wrapper and in the 
corresponding job runner entry for that tool.
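For concreteness, a per-tool runner entry of the sort I mean could look roughly 
like this in universe_wsgi.ini (the tool id "pbwa", the parallel environment 
name "mpi", and the slot count are all invented examples; the slot count would 
have to match whatever the XML wrapper hard-codes on its mpirun command line):

```
[galaxy:tool_runners]
# hypothetical: submit this tool requesting 8 SGE slots through an "mpi" PE
pbwa = drmaa://-V -pe mpi 8/
```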



From: galaxy-dev-boun...@lists.bx.psu.edu [galaxy-dev-boun...@lists.bx.psu.edu] 
on behalf of Chorny, Ilya [icho...@illumina.com]
Sent: Tuesday, September 13, 2011 7:58 PM
To: galaxy-dev@lists.bx.psu.edu
Subject: [galaxy-dev] Passing the number of mpi processors to SGE from xml file

Any thoughts on how to pass the number of processors I want to use from my XML 
wrapper to the job template? I am trying to wrap pBWA and use MPI.

Thanks,

Ilya


Ilya Chorny Ph.D.
Bioinformatics Scientist I
Illumina, Inc.
9885 Towne Centre Drive
San Diego, CA 92121
Work: 858.202.4582
Email: icho...@illumina.com
Website: www.illumina.com


___
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/

[galaxy-dev] Passing the number of mpi processors to SGE from xml file

2011-09-13 Thread Chorny, Ilya
Any thoughts on how to pass the number of processors I want to use from my XML 
wrapper to the job template? I am trying to wrap pBWA and use MPI.

Thanks,

Ilya


Ilya Chorny Ph.D.
Bioinformatics Scientist I
Illumina, Inc.
9885 Towne Centre Drive
San Diego, CA 92121
Work: 858.202.4582
Email: icho...@illumina.com
Website: www.illumina.com



Re: [galaxy-dev] passing environment to pbs runner

2011-09-13 Thread Glen Beane

On Sep 13, 2011, at 6:01 PM, "Oleksandr Moskalenko" <o...@hpc.ufl.edu> wrote:


I need to get Galaxy to use tools that are not located in the system path. At 
the command line I can export the PATH either manually or with environment 
modules 'module load foo'. However, there doesn't seem to be a clear way to 
pass the $PATH and other environmental variables to the pbs runner. For 
instance, I use Torque with a dedicated "galaxy" queue. So, I configured the 
"default_cluster_job_runner = pbs:///galaxy/" in universe_wsgi.ini. The problem 
is that the pbs runner does not pass the environment to the scheduler. I sort 
of circumvented this by sourcing an environment setup file from 'pbs_template' 
in pbs.py, but I keep wondering about a better way to do it.

I currently use tools set up with environment modules, so I added the following 
to pbs_template:

. /etc/profile.d/modules.sh
module load bio


I load modules in the .bashrc for the galaxy user on our cluster

[galaxy-dev] passing environment to pbs runner

2011-09-13 Thread Oleksandr Moskalenko

I need to get Galaxy to use tools that are not located in the system path. At 
the command line I can export the PATH either manually or with environment 
modules 'module load foo'. However, there doesn't seem to be a clear way to 
pass the $PATH and other environmental variables to the pbs runner. For 
instance, I use Torque with a dedicated "galaxy" queue. So, I configured the 
"default_cluster_job_runner = pbs:///galaxy/" in universe_wsgi.ini. The problem 
is that the pbs runner does not pass the environment to the scheduler. I sort 
of circumvented this by sourcing an environment setup file from 'pbs_template' 
in pbs.py, but I keep wondering about a better way to do it. 

I currently use tools set up with environment modules, so I added the following 
to pbs_template:

. /etc/profile.d/modules.sh
module load bio


Thanks,

Alex

Re: [galaxy-dev] ProFTPd on Ubuntu system.

2011-09-13 Thread Luobin Yang
Hi, Enis,

Thanks for sharing your method! I did some searching last night and figured
out why the default proftpd package for Ubuntu 10.04 doesn't work; it
doesn't come with a module that's needed for Galaxy: the mod_sql_passwd
module. So I reinstalled proftpd from source and now it works fine :)

Thanks again,
Luobin

On Tue, Sep 13, 2011 at 2:14 AM, Enis Afgan  wrote:

> I forgot to mention earlier that you'll still need to do the database GRANT
> etc. as described on the wiki page you pointed to. The mentioned method does
> not do that.
>
> Enis
>
>
> On Tue, Sep 13, 2011 at 8:10 AM, Enis Afgan  wrote:
>
>> You can try extracting the ProFTP install method from mi-deployment and
>> running only it via Fabric? We use this method to install ProFTP on the
>> cloud and galaxy VM, both of which are based on Ubuntu 10.04.
>>
>> The method is available here:
>> https://bitbucket.org/afgane/mi-deployment/src/d894af37a83b/mi_fabfile.py#cl-442
>> (also take a look at the _required_packages method and _setup_users method
>> because some additional configuration is done there).
>>
>> Hope this helps,
>> Enis
>>
>>  On Mon, Sep 12, 2011 at 8:05 PM, Luobin Yang  wrote:
>>
>>>  Hi,
>>>
>>> Has anyone set up ProFTPd successfully on Ubuntu 10.04 to enable FTP
>>> upload on Galaxy? I followed the instructions on this link (
>>> http://wiki.g2.bx.psu.edu/Admin/Config/Upload%20via%20FTP) but it
>>> doesn't work.
>>>
>>> Thanks,
>>> Luobin
>>>
>>
>>
>

Re: [galaxy-dev] cluster path question

2011-09-13 Thread Nate Coraor
Ann Black wrote:
> Thanks Nate! I did not know about sge_request - and that helps.  I played 
> around with using -V and -v in sge_request, and it works similarly to 
> specifying -V (or -v) in my universe_wsgi.ini file.  I am new to galaxy and I 
> am working on getting it instantiated locally.  Is there an advantage to 
> setting my native DRMAA options in my galaxy user's  local .sge_request file 
> vs. directly in the universe_wsgi.ini configuration file?  Would an advantage 
> be that I can set more universal DRMAA native options that I would like to be 
> common for all tools there (sge_request for the galaxy user) vs. duplicating 
> them in each per-tool configuration in the universe_wsgi file?

Hi Ann,

It probably doesn't matter whether you use .sge_request or the runner
URL in the Galaxy config.  The difference may simply be that
.sge_request would be a little less messy.
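
For example, a minimal ~/.sge_request for the galaxy user could contain nothing 
more than the following (the queue name is an invented example; the file simply 
holds default qsub options, applied to every submission by that user):

```
# ~/.sge_request -- default options for every qsub/DRMAA submission
-V
-q galaxy.q
```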

--nate

> 
> Thanks again,
> 
> Ann
> 
> On Sep 12, 2011, at 12:50 PM, Nate Coraor wrote:
> 
> > Ann Black wrote:
> >> I figured out a solution.  The sun grid engine will strip back the env of 
> >> what gets passed along with the job submission.  I added a native drmaa 
> >> option, -V, which caused the env vars found on the shell that submits the 
> >> job to be passed along.  Therefore all the environment setup I did in my 
> >> galaxy user's .bash_profile and thus configured in my local shell running 
> >> galaxy now gets propagated with my job submissions.  This does not allow 
> >> changes to .bash_profile to be picked up dynamically, however, since the 
> >> .bash_profile is not sourced on each compute node.  I.e., changes made to the 
> >> galaxy user's env need to be re-sourced in the shell that runs galaxy and 
> >> dispatches the jobs.
> >> 
> >> Thanks - hope this helps others,
> > 
> > Hi Ann,
> > 
> > For SGE, you can also use ~/.sge_request to set up the environment on
> > the execution host.
> > 
> > --nate
> > 
> >> 
> >> Ann
> >> 
> >> 
> >> On Sep 12, 2011, at 10:09 AM, Ann Black wrote:
> >> 
> >>> Hello everyone -
> >>> 
> >>> I am also running into this issue trying to get galaxy integrated with 
> >>> our sun grid engine.  My galaxy user's .bash_profile does not appear to 
> >>> get sourced when the jobs run. I augmented the sample sam_filter.py 
> >>> tutorial such that it output path and user info so I could see how the 
> >>> jobs were being run:
> >>> 
> >>> import os, shutil, socket, sys
> >>> 
> >>> out = open( sys.argv[2], "w" )
> >>> out2 = open("/data/galaxy-dist/ann.out", "w")
> >>> out2.write(socket.gethostname())
> >>> out2.write("\n")
> >>> out2.write(os.environ['PATH'])
> >>> out2.write("\n")
> >>> drmaa = os.environ.get('DRMAA_LIBRARY_PATH')
> >>> if drmaa is None:
> >>>  out2.write("None")
> >>> else:
> >>>  out2.write(os.environ.get('DRMAA_LIBRARY_PATH'))
> >>> out2.write("\n")
> >>> out2.write(str(os.geteuid()))
> >>> out2.write("\n")
> >>> out2.write(str(os.getegid()))
> >>> shutil.copytree("/data/galaxy-dist/database/pbs","/data/galaxy-dist/ann")
> >>> 
> >>> the job is being dispatched as my galaxy user; however, my additions to 
> >>> PATH and the additional env vars that I have exported in our galaxy user's 
> >>> .bash_profile are not present when the script runs (i.e., .bash_profile is 
> >>> not sourced).  When I use qsub to manually run the galaxy script that 
> >>> gets generated under database/pbs, the output to ann.out reflects my PATH 
> >>> and exported env vars.
> >>> 
> >>> Was there any other solution to this issue besides the drmaa.py script 
> >>> augment?
> >>> 
> >>> Thanks much for your help,
> >>> 
> >>> Ann
> >> 
> >> 
> 
> 


Re: [galaxy-dev] (Composite) Dataset Upload not Setting Metadata

2011-09-13 Thread Paniagua, Eric
Hi all,

Can anyone tell me why JobWrapper.finish() moves the primary dataset file 
dataset_path.false_path to dataset_path.real_path (contingent on 
config.outputs_to_working_directory == True) but does not move the "extra 
files"?  (lib/galaxy/jobs/__init__.py:540-553)  It seems to me that if you want 
to move a dataset, you want to move the whole dataset, and that this logic 
should perhaps be factored out, e.g. into the galaxy.util module.

Why does class DatasetPath only account for the path to the primary file and 
not the path to the "extra files"?  It could be used to account for the "extra 
files" by path splitting as in my previous suggested bug fix, but only if that 
fix is correct.  It doesn't seem to be used for that purpose in the Galaxy code.

I look forward to an informative response.

Thanks,
Eric Paniagua


From: galaxy-dev-boun...@lists.bx.psu.edu [galaxy-dev-boun...@lists.bx.psu.edu] 
on behalf of Paniagua, Eric [epani...@cshl.edu]
Sent: Monday, September 12, 2011 7:37 PM
To: galaxy-dev@lists.bx.psu.edu
Subject: Re: [galaxy-dev] (Composite) Dataset Upload not Setting Metadata

Hello again,

It looks like the config.outputs_to_working_directory variable is intended to 
do something closely related, but setting it to either True or False does not 
in fact fix the problem.

The output path for files in a composite dataset upload (dataset.files_path) 
that is used in the tools/data_source/upload.xml tool is set to a path under 
the job working directory by lib/galaxy/tools/__init__.py:1519.  The preceding 
code (lines 1507-1516) selects the path for the primary file contingent on 
config.outputs_to_working_directory.

Why is the path set in line 1519 not contingent on 
config.outputs_to_working_directory?  Indeed, the following small change fixes 
the bug I'm observing:

diff -r 949e4f5fa03a lib/galaxy/tools/__init__.py
--- a/lib/galaxy/tools/__init__.py  Mon Aug 29 14:42:04 2011 -0400
+++ b/lib/galaxy/tools/__init__.py  Mon Sep 12 19:32:26 2011 -0400
@@ -1516,7 +1516,9 @@
             param_dict[name] = DatasetFilenameWrapper( hda )
             # Provide access to a path to store additional files
             # TODO: path munging for cluster/dataset server relocatability
-            param_dict[name].files_path = os.path.abspath(os.path.join( job_working_directory, "dataset_%s_files" % (hda.dataset.id) ))
+            #param_dict[name].files_path = os.path.abspath(os.path.join( job_working_directory, "dataset_%s_files" % (hda.dataset.id) ))
+            # This version should make it always follow the primary file
+            param_dict[name].files_path = os.path.abspath( os.path.join( os.path.split( param_dict[name].file_name )[0], "dataset_%s_files" % (hda.dataset.id) ))
             for child in hda.children:
                 param_dict[ "_CHILD___%s___%s" % ( name, child.designation ) ] = DatasetFilenameWrapper( child )
         for out_name, output in self.outputs.iteritems():

Would this break anything?
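
The intent of the change can be sketched independently of Galaxy: the extra-files 
directory is derived from whichever directory the primary file was actually 
assigned, so it follows the primary file wherever that ends up. The helper name 
and paths below are invented purely for illustration:

```python
import os.path

def files_path_for( primary_file_path, dataset_id ):
    # Hypothetical helper (not a real Galaxy function): place the
    # "dataset_<id>_files" directory next to the primary file, instead of
    # hard-wiring it under the job working directory.
    base = os.path.split( primary_file_path )[0]
    return os.path.abspath( os.path.join( base, "dataset_%s_files" % dataset_id ) )

# With outputs_to_working_directory set, the primary file lives under the
# job working directory, so the extra files land beside it there:
print( files_path_for( "/jobs/42/galaxy_dataset_7.dat", 7 ) )  # -> /jobs/42/dataset_7_files
```

Either way the two paths stay together, which appears to be what the finish/move 
logic assumes when it later relocates the outputs.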

If that cannot be changed, would the best solution be to modify the upload tool 
so that it took care of this on its own?  That seems readily doable, but starts 
to decentralize control of data flow policy.

Please advise.

Thanks,
Eric Paniagua

From: galaxy-dev-boun...@lists.bx.psu.edu [galaxy-dev-boun...@lists.bx.psu.edu] 
on behalf of Paniagua, Eric [epani...@cshl.edu]
Sent: Monday, September 12, 2011 1:45 PM
To: galaxy-dev@lists.bx.psu.edu
Subject: [galaxy-dev] (Composite) Dataset Upload not Setting Metadata

Hi everyone,

I've been getting my feet wet with Galaxy development working to get some of 
the rexpression tools online, and I've run into a snag that I've traced back to 
a set_meta datatype method not being able to find a file from which it wants to 
extract metadata.  After reading the code, I believe this would also be a 
problem for non-composite datatypes.

The specific test case I've been looking at is uploading an affybatch file (and 
associated pheno file) using Galaxy's built-in upload tool and selecting the 
File Format manually (ie choosing "affybatch" in the dropdown).  I am using 
unmodified datatype definitions provided in lib/galaxy/datatypes/genetics.py 
and unmodified core Galaxy upload code as of 5955:949e4f5fa03a.  (I am also 
testing with modified versions, but I am able to reproduce and track this bug 
in the specified clean version).

The crux of the cause of error is that in JobWrapper.finish(), 
dataset.set_meta() is called (lib/galaxy/jobs/__init__.py:607) before the 
composite dataset uploaded files are moved (in a call to a Tool method 
"self.tool.collect_associated_files(out_data, self.working_directory)" on line 
670) from the job working directory to the final destination under 
config.file_path (which defaults to "database/files").

In my test case, "dataset.set_meta( overwrite = False )" eventually calls 
lib/galaxy/datatypes/genetics.py:Rexp.

Re: [galaxy-dev] cloud instance missing /opt/sge/default/common directory

2011-09-13 Thread Enis Afgan
Hi Joe,
If you look in /mnt/cm/paster.log on the instance, are there any indications
as to what went wrong? It should be toward the top of the log after the
server gets started.
SGE gets installed each time an instance is rebooted so simply rebooting it
again may do the trick. You can also choose to manually remove/clean SGE 
before rebooting. To do so, you can follow the basic approach captured in
this method:
https://bitbucket.org/galaxy/cloudman/src/862d1087080f/cm/services/apps/sge.py#cl-26

Enis

On Sun, Sep 11, 2011 at 12:05 AM, Joseph Hargitai <
joseph.hargi...@einstein.yu.edu> wrote:

>  Hi,
>
> Upon restarting a saved cloud instance I am missing:
>
> -bash: /opt/sge/default/common/settings.sh: No such file or directory
> -bash: /opt/sge/default/common/settings.sh: No such file or directory
>
> all the other mounts are there and well preserved. Is this pulled from a
> special place i may have not saved?
>
> The instance now does not boot beyond this point. Have login and admin
> console access.
>
>
> joe
>
>

[galaxy-dev] Upgrading postgres version

2011-09-13 Thread James Vincent
Hello,

We'd like to migrate from postgres 8.4, installed system-wide through the
Ubuntu package manager, to postgres version 9 installed from source. Is
there a smooth and easy route for this, or is it the manual database
dump/migrate pain that I think it will be?

Thanks,
Jim


Re: [galaxy-dev] ProFTPd on Ubuntu system.

2011-09-13 Thread Enis Afgan
I forgot to mention earlier that you'll still need to do the database GRANT
etc. as described on the wiki page you pointed to. The mentioned method does
not do that.

Enis

On Tue, Sep 13, 2011 at 8:10 AM, Enis Afgan  wrote:

> You can try extracting the ProFTP install method from mi-deployment and
> running only it via Fabric? We use this method to install ProFTP on the
> cloud and galaxy VM, both of which are based on Ubuntu 10.04.
>
> The method is available here:
> https://bitbucket.org/afgane/mi-deployment/src/d894af37a83b/mi_fabfile.py#cl-442
> (also take a look at the _required_packages method and _setup_users method
> because some additional configuration is done there).
>
> Hope this helps,
> Enis
>
>  On Mon, Sep 12, 2011 at 8:05 PM, Luobin Yang  wrote:
>
>>  Hi,
>>
>> Has anyone set up ProFTPd successfully on Ubuntu 10.04 to enable FTP
>> upload on Galaxy? I followed the instructions on this link (
>> http://wiki.g2.bx.psu.edu/Admin/Config/Upload%20via%20FTP) but it doesn't
>> work.
>>
>> Thanks,
>> Luobin
>>
>
>

Re: [galaxy-dev] ProFTPd on Ubuntu system.

2011-09-13 Thread Enis Afgan
You can try extracting the ProFTP install method from mi-deployment and
running only it via Fabric? We use this method to install ProFTP on the
cloud and galaxy VM, both of which are based on Ubuntu 10.04.

The method is available here:
https://bitbucket.org/afgane/mi-deployment/src/d894af37a83b/mi_fabfile.py#cl-442
(also take a look at the _required_packages method and _setup_users method
because some additional configuration is done there).

Hope this helps,
Enis

On Mon, Sep 12, 2011 at 8:05 PM, Luobin Yang  wrote:

> Hi,
>
> Has anyone set up ProFTPd successfully on Ubuntu 10.04 to enable FTP upload
> on Galaxy? I followed the instructions on this link (
> http://wiki.g2.bx.psu.edu/Admin/Config/Upload%20via%20FTP) but it doesn't
> work.
>
> Thanks,
> Luobin
>