Re: [galaxy-dev] Job output not returned from cluster
On Fri, Jul 29, 2011 at 1:01 AM, Ka Ming Nip km...@bcgsc.ca wrote:
> My jobs have this problem when the command for the tool is wrapped by the stderr wrapper script.
> Ka Ming

Which stderr wrapper script? I think there is more than one...

I've also had this error message (I'm currently working out how to connect our Galaxy to our cluster), and in at least one case it was caused by a file permission problem - the tool appeared to run but could not write the output files.

If Galaxy could give more diagnostics rather than just "Job output not returned from cluster" it would help. For instance, as we use SGE, perhaps the captured stdout/stderr files might be available.

Peter
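[Editor's note: the stderr wrapper scripts in circulation generally follow the same pattern - run the real command and only pass stderr through when the exit code is non-zero, so that Galaxy (which at this time treats any stderr output as a failure) does not flag jobs that merely print warnings. A minimal sketch of that pattern in Python; this is illustrative, not the specific script from the wiki:]

    #!/usr/bin/env python
    # Minimal stderr-wrapper sketch (illustrative; not the wiki's exact script).
    # Usage in a tool's command tag: stderr_wrapper.py real_tool arg1 arg2 ...
    import subprocess
    import sys

    def main():
        # Everything after the wrapper's own name is the real command line.
        proc = subprocess.Popen(sys.argv[1:], stderr=subprocess.PIPE)
        _, stderr = proc.communicate()
        if proc.returncode != 0:
            # Only surface stderr when the tool actually failed, so harmless
            # warnings don't make Galaxy mark the job as an error.
            sys.stderr.write(stderr)
        sys.exit(proc.returncode)

    if __name__ == "__main__":
        main()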
[galaxy-dev] problem with job runner
Hi,

In my Galaxy instance, whatever job I submit goes into the queued state. If I restart the server, the previously submitted jobs change state to running, but newly submitted jobs again go to queued. I am at a loss to understand this behaviour of Galaxy and unable to debug it. The job submission uses a customized runner.

How does a job actually go into the queued state automatically when all the worker threads are free? Does the galaxy_session table's is_valid attribute affect the job state? And where, apart from the state attribute of the jobs table, is the queued state stored? That column is the only place I can see it.

The server log points to an error here:

    galaxy.jobs ERROR 2011-07-29 11:01:28,098 failure running job 2243
    Traceback (most recent call last):
      File "/home/gwadmin/galaxy-central/lib/galaxy/jobs/__init__.py", line 202, in __monitor_step
        self.dispatcher.put( JobWrapper( job, self ) )
      File "/home/gwadmin/galaxy-central/lib/galaxy/jobs/__init__.py", line 856, in put
        self.job_runners[runner_name].put( job_wrapper )
      File "/home/gwadmin/galaxy-central/lib/galaxy/jobs/runners/gw.py", line 375, in put
        job_wrapper.change_state( model.Job.states.QUEUED )
      File "/home/gwadmin/galaxy-central/lib/galaxy/jobs/__init__.py", line 437, in change_state
        self.sa_session.flush()

Regards,
shashi
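[Editor's note: the traceback ends inside change_state(), so the QUEUED state is being written to the state column of the job table (galaxy_session.is_valid is unrelated to job state). A self-contained sketch of the queue/worker pattern Galaxy runners use, showing how a job can be persisted as queued yet never picked up - e.g. if a custom runner records the state but its worker threads never drain the queue. The names here are illustrative, not Galaxy's actual classes:]

    import threading
    import time
    from Queue import Queue  # Python 2 stdlib, as used by Galaxy at this time

    class MiniRunner(object):
        """Illustrative stand-in for a Galaxy job runner."""
        def __init__(self, nworkers=2):
            self.work_queue = Queue()
            for _ in range(nworkers):
                t = threading.Thread(target=self._worker)
                t.daemon = True
                t.start()  # without running workers, put() jobs wait forever

        def _worker(self):
            while True:
                job = self.work_queue.get()  # blocks until a job arrives
                job()                        # actually run the job

        def put(self, job):
            # Analogous to job_wrapper.change_state(model.Job.states.QUEUED):
            # persist the state first...
            print "state -> queued"
            # ...then hand the job to a worker.  If this step is missing or
            # the workers are dead, the job stays 'queued' until a restart
            # re-dispatches it from the database.
            self.work_queue.put(job)

    runner = MiniRunner()
    runner.put(lambda: None)
    time.sleep(1)  # give a daemon worker a moment to run the job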
[galaxy-dev] Catch problems preparing cluster job scripts
Hi all,

I've run into some file permission problems as part of using the same mapped directory on both the Galaxy server and our cluster. In the process I wrote the following patch, which fixes a bug where Galaxy seems to leave the job in the pending state:

    galaxy.jobs INFO 2011-07-29 14:06:46,170 job 30 dispatched
    galaxy.jobs.runners.drmaa ERROR 2011-07-29 14:06:46,582 Uncaught exception queueing job
    Traceback (most recent call last):
      File "/data/galaxy/galaxy-central/lib/galaxy/jobs/runners/drmaa.py", line 114, in run_next
        self.queue_job( obj )
      File "/data/galaxy/galaxy-central/lib/galaxy/jobs/runners/drmaa.py", line 164, in queue_job
        os.chmod( jt.remoteCommand, 0750 )
    OSError: [Errno 1] Operation not permitted: '/data/galaxy/galaxy-central/database/pbs/galaxy_30.sh'

The job was left stuck in the grey pending state. It looks like this exception should have been caught and the job put into an error state, as in this patch:

https://bitbucket.org/peterjc/galaxy-central/changeset/c5fa48633c0b

This is currently the one and only change on this branch:

https://bitbucket.org/peterjc/galaxy-central/src/job_scripts

Please could this be reviewed and applied to the trunk.

Thanks,
Peter
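[Editor's note: the shape of the fix, as a self-contained sketch with illustrative names rather than the patch's exact code - wrap the job-script preparation so an OSError marks the job failed instead of escaping:]

    import logging
    import os

    log = logging.getLogger(__name__)

    def prepare_job_script(path, contents, fail):
        """Write the cluster job script and make it executable.  On error,
        fail the job explicitly instead of letting the exception escape and
        leave the job grey/pending.  `fail` stands in for job_wrapper.fail()."""
        try:
            fh = open(path, "w")
            fh.write(contents)
            fh.close()
            os.chmod(path, 0o750)  # the call that raised OSError above
            return True
        except Exception:
            log.exception("failure preparing job script %s", path)
            fail("failure preparing job script")
            return False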
[galaxy-dev] Problems with Galaxy on a mapped drive
Hi all,

In my recent email I mentioned problems with our setup and mapped drives. I am running a test Galaxy on a server under a CIFS mapped drive. If I map the drive with noperms then things seem to work for submitting jobs to the cluster etc., but that doesn't seem secure at all. Mounting with strict permissions, though, seems to cause various network-latency-related problems in Galaxy.

Specifically, while loading the converters and the history export tool, Galaxy creates a temporary XML file which it then tries to parse. I was able to resolve this by switching from tempfile.TemporaryFile to tempfile.mkstemp and adding a 1 s sleep, but it isn't very elegant. Couldn't a StringIO handle be used instead?

Later during start-up there were two errors with a similar cause - Galaxy creates a temp folder and then immediately tries to write a tar ball or zip file into it. Again, adding a one second sleep after creating the directory and before using it seems to work. See lib/galaxy/web/controllers/dataset.py

After that Galaxy started, but it still gives problems - like the issue reported here, which Galaxy handled badly (see patch):

http://lists.bx.psu.edu/pipermail/galaxy-dev/2011-July/006213.html

Here again, inserting a one second sleep between writing the cluster script file and setting its permissions made it work.

If those are the only issues, they can be dealt with. But are there likely to be lots more problems of this nature later on? That is my worry. How are most people setting up mapped drives for Galaxy with a cluster?

Thanks,
Peter
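[Editor's note: a slightly more robust alternative to the fixed one-second sleeps, under the assumption that the delay is variable CIFS latency - poll until the path is actually usable, with a timeout. An illustrative helper, not current Galaxy code:]

    import time

    def wait_until_readable(path, max_wait=5.0, poll=0.1):
        """Poll until `path` can be opened, instead of sleeping a fixed 1 s.
        Returns True once readable, False if max_wait elapses first."""
        deadline = time.time() + max_wait
        while time.time() < deadline:
            try:
                open(path, "rb").close()
                return True
            except (IOError, OSError):
                time.sleep(poll)
        return False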
[galaxy-dev] Could anyone tell me why the result file in directory /galaxy-dist/database/files/000 does not return to the web page
Dear Sir,

We have a program written in Perl. It runs normally in a Linux environment and saves its result to a file. After adding it to our Galaxy system, we found that it produces the correct result file in the directory /var/we/galaxy-dist/database/files/000, but the result file is never returned to the web page, where the state is always "Job is currently running". Could anyone tell me what is wrong, and why it does not work? The bagging-SVM.xml and bagging-SVM.pl are attached.

Thank you very much! Best Wishes!

Yan-Hui Li

[Attachment: bagging-SVM.pl (binary data)]

    <tool id="bagging-SVM_1" name="bagging-SVM">
      <description>Integration and co-expression network from the two features have the same reference gene prediction and functional genes</description>
      <command interpreter="perl">bagging-SVM.pl $infile_ref $infile_avedist1k $infile_corrgsea1k $outfile_all_positive</command>
      <inputs>
        <param format="text" name="infile_ref" type="data" label="../GO_angiogenesis_expressed20101024smr0.5_fppi.txt" />
        <param format="text" name="infile_avedist1k" type="data" label="../expressed_genes_20101024.hprd8_biogrid-2.0.58.angiogenesis_avedist1k.txt" />
        <param format="text" name="infile_corrgsea1k" type="data" label="../expressed_genes_20101024.hprd8_biogrid-2.0.58.angiogenesis_corrgsea1k.txt" />
      </inputs>
      <outputs>
        <data format="tabular" name="outfile_all_positive" label="outfile_all_positive"/>
      </outputs>
      <tests>
        <test>
          <param name="input" value="1"/>
          <param name="input" value="../GEO_dataset/expressed_genes_u133a.6k_u133p2.8k_20101024.txt"/>
          <param name="input" value="../ppi/hprd8_biogrid-2.0.58.ppi.txt"/>
          <output name="out_file1" file="expressed_genes_20101024.hprd8_biogrid-2.0.58.ppi.txt"/>
        </test>
      </tests>
      <help>Integration and co-expression network from the two features have the same reference gene prediction and functional genes</help>
    </tool>
[galaxy-dev] [API] Apache error?
Hello everyone,

I'm working on a script that uploads files and launches workflows on them, but I keep getting errors that appear more or less randomly when the display() and submit() functions are called. In a nutshell, there is about a 1/3 chance the calls fail this way. Surprisingly, the actions are nevertheless properly triggered in Galaxy. Here is an example: when I launch a workflow, I get the following traceback even though the workflow is properly executed:

    http://localhost/galaxy-dev/api/workflows?key=273c7b4e3aaffd3884ef715aaf780d9a
      File "automated_preprocessing.py", line 61, in expandFile
        result = submit( api_key, api_url + 'workflows', wf_data, return_formatted=False )
      File "common.py", line 100, in submit
        r = post( api_key, url, data )
      File "common.py", line 44, in post
        return simplejson.loads( urllib2.urlopen( req ).read() )
      File "/g/steinmetz/collaboration/software/CentOS5/opt/Python-2.6.5/lib/python2.6/socket.py", line 329, in read
        data = self._sock.recv(rbufsize)
      File "/g/steinmetz/collaboration/software/CentOS5/opt/Python-2.6.5/lib/python2.6/httplib.py", line 518, in read
        return self._read_chunked(amt)
      File "/g/steinmetz/collaboration/software/CentOS5/opt/Python-2.6.5/lib/python2.6/httplib.py", line 561, in _read_chunked
        raise IncompleteRead(''.join(value))
    Failed to expand file A:
    Type: <class 'httplib.IncompleteRead'>
    IncompleteRead(118 bytes read)

Therefore, I cannot get the results within the script. Any idea?

Best,
L-A
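[Editor's note: one workaround sketch, given that the workflow is actually triggered and only the response read fails. httplib's IncompleteRead keeps the bytes it did receive in its .partial attribute, so if only the chunked transfer framing was cut short (by Apache, a proxy, or a timeout) the JSON body may still parse. Simply retrying submit() would be wrong here, as it would launch the workflow a second time. A hedged sketch against the common.py helpers, not a fix for the underlying truncation:]

    import urllib2
    import simplejson
    from httplib import IncompleteRead  # Python 2, matching the traceback

    def read_json(url_or_req):
        """Read a JSON API response, salvaging the partial body if the
        chunked response is cut short."""
        try:
            body = urllib2.urlopen(url_or_req).read()
        except IncompleteRead as exc:
            # exc.partial holds whatever bytes arrived before the cut-off;
            # the parse below fails loudly if the JSON itself is truncated.
            body = exc.partial
        return simplejson.loads(body)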
Re: [galaxy-dev] manage library permission interface glitch
Hello Glen,

Thanks very much for finding this issue. It's been corrected in changeset 5842:bb51baa20151, which should be available in the distribution within the next few weeks. It is currently available in our development repo at https://bitbucket.org/galaxy/galaxy-central/ if you need it sooner.

Greg Von Kuster

On Jul 27, 2011, at 2:08 PM, Glen Beane wrote:

> I should preface this by saying we are running a galaxy-dist that is a couple of months old now (we are planning on updating soon), but here is an annoying issue we are seeing when managing data library permissions:
>
> The "roles not associated" lists are filtered so that they show only roles associated with the "access library" permission - this is good, except the list is filtered this way for the "access library" permission itself too. That means once you add roles to the "access library" permission and save, you can't add any other roles to it, since its "roles not associated" list now only includes roles already associated with "access library". To add any new roles we first need to remove every role from this permission, making the library public (which causes all roles to be listed as not associated), and then we can associate all of the roles we want.
>
> The correct behavior would be to filter every "roles not associated" list for every permission except "access library" - that one should always list all roles not associated.
>
> I will open a new issue for this on Bitbucket unless a developer can confirm that it has been fixed since my last galaxy-dist update a couple of months ago.
>
> --
> Glen L. Beane
> Senior Software Engineer
> The Jackson Laboratory
> (207) 288-6153

Greg Von Kuster
Galaxy Development Team
g...@bx.psu.edu
[galaxy-dev] BWA sampe runs 'forever'
Hello,

A question for the Galaxy maintainers: have you encountered situations where BWA jobs run 'forever' (for days)?

A little digging shows that it's the "bwa sampe" step, and SEQanswers threads mention it's somewhat common:

http://seqanswers.com/forums/showthread.php?t=11652
http://seqanswers.com/forums/showthread.php?t=6340

One suggested work-around is to add "-A" to "bwa sampe", but there's no way to do that in the current BWA wrapper. (On the command line with the same input files it does make things much faster.)

I'm wondering if this is common enough to justify adding "-A" to the wrapper, or whether it's rare enough to be ignored.

-gordon
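[Editor's note: exposing this as an opt-in flag would be a small change to the wrapper - something like the following sketch, where the param name and help text are made up, and "$disable_isize" would also need to be added to the sampe call in the wrapper's command line:]

    <param name="disable_isize" type="boolean" truevalue="-A" falsevalue="" checked="false"
           label="Disable insert-size estimate (bwa sampe -A)"
           help="Work-around for sampe runs that appear to hang for days on some
                 paired-end inputs; see the SEQanswers threads linked above." />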
[galaxy-dev] upload.py fails with files having commas in file name
The FTP upload module fails when a file has a comma in the file name. For example, test.bam works, but when it is copied to test,test.bam that file fails.

Cheers,
Ilya

Ilya Chorny Ph.D.
Bioinformatics - Intern
icho...@illumina.com
858-202-4582
Re: [galaxy-dev] Job output not returned from cluster
It was the one on the wiki page.

Ka Ming

From: Peter Cock [p.j.a.c...@googlemail.com]
Sent: July 29, 2011 2:42 AM
To: Ka Ming Nip
Cc: Edward Kirton; galaxy-dev@lists.bx.psu.edu
Subject: Re: [galaxy-dev] Job output not returned from cluster

> On Fri, Jul 29, 2011 at 1:01 AM, Ka Ming Nip km...@bcgsc.ca wrote:
>> My jobs have this problem when the command for the tool is wrapped by the stderr wrapper script.
>> Ka Ming
>
> Which stderr wrapper script? I think there is more than one...
>
> I've also had this error message (I'm currently working out how to connect our Galaxy to our cluster), and in at least one case it was caused by a file permission problem - the tool appeared to run but could not write the output files.
>
> If Galaxy could give more diagnostics rather than just "Job output not returned from cluster" it would help. For instance, as we use SGE, perhaps the captured stdout/stderr files might be available.
>
> Peter
Re: [galaxy-dev] BWA sampe runs 'forever'
Chris Fields wrote, on 07/29/2011 12:35 PM:
> On Jul 29, 2011, at 11:00 AM, Assaf Gordon wrote:
>> Question for galaxy maintainers: have you encountered situations where BWA jobs run 'forever' (for days)?
>> ...
>> I'm wondering if this is common enough to justify adding -A to the wrapper, or is it rare enough and can be ignored.
>
> Can it be added as an optional argument? So that if it is selected, the setting is passed to the tool, otherwise not.

Obviously, it can be added to the XML wrapper. I'm asking whether other people have encountered this situation with paired-end mapping, and whether it's rare or not.

From a user's perspective, this kind of parameter makes no sense: it basically tells the user "run BWA without this parameter, and if after 24 hours the job is still running, kill it and re-run with this parameter." That's not the way to build user-friendly tools :(

-gordon
Re: [galaxy-dev] BWA sampe runs 'forever'
On Jul 29, 2011, at 12:44 PM, Assaf Gordon wrote:
> Chris Fields wrote, on 07/29/2011 12:35 PM:
>> Can it be added as an optional argument? So that if it is selected, the setting is passed to the tool, otherwise not.
>
> Obviously, it can be added to the XML wrapper. I'm asking whether other people have encountered this situation with paired-end mapping, and whether it's rare or not.

If there are cases where this does occur (as you have pointed out), then the XML wrapper should have the option, and maybe even a bit of documentation pointing this out (even if it is a rare event). I wouldn't want this running on a cluster where users are charged, for instance.

> From a user's perspective, this kind of parameter makes no sense: it basically tells the user "run BWA without this parameter, and if after 24 hours the job is still running, kill it and re-run with this parameter." That's not the way to build user-friendly tools :(
>
> -gordon

It's definitely not intuitive, and it violates the principle of least surprise. If I explicitly set any parameter, I don't expect the application to possibly (and silently) assume I am wrong and change it anyway.

chris
Re: [galaxy-dev] Job output not returned from cluster
Thanks for your comments, fellas. Permissions would certainly cause this problem, but that's not the cause for me.

Most wrappers just serve to redirect stderr, so I don't think it's the wrapper script itself, but the stdout/stderr files are part of the problem. The error message is thrown in the finish_job method when it can't open the source/destination stdout/stderr files for reading/writing. I split the try statement to add finer-grained error messages, but I have already verified the files do exist, so it seems to be a file system issue.

I suspect it's because the storage I'm using as a staging area has flash drives between the RAM and the spinning disks, so upon close, the file buffers may get flushed out of RAM to the SSDs but not be immediately available from the SCSI drives. Or maybe the (inode) metadata table hasn't finished updating yet. If so, it's not that the cluster is heavily utilized, but that the filesystem is - and this disk is expressly for staging cluster jobs.

I'll see if adding a short sleep and a single retry upon error solves this problem... but I won't know immediately, as the problem is intermittent. That's the problem with fancy toys; they often come with fancy problems!

On Fri, Jul 29, 2011 at 2:42 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
> I've also had this error message (I'm currently working out how to connect our Galaxy to our cluster), and in at least one case it was caused by a file permission problem - the tool appeared to run but could not write the output files.
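[Editor's note: a sketch of that sleep-and-retry idea for finish_job, assuming the files eventually appear once the staging filesystem catches up. An illustrative helper, not the actual Galaxy code:]

    import time

    def open_with_retry(path, mode="r", retries=3, delay=1.0):
        """Try to open the job's stdout/stderr file, riding out short
        staging-filesystem lag before giving up."""
        for attempt in range(retries):
            try:
                return open(path, mode)
            except (IOError, OSError):
                if attempt == retries - 1:
                    raise  # still missing: let finish_job fail the job
                time.sleep(delay)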
[galaxy-dev] job status when SGE kills/aborts job
We are using an SGE cluster with our Galaxy install. We have specified resource and run-time limits for certain tools using tool-specific drmaa URL configuration, e.g.:

- run-time (h_rt, s_rt)
- memory (vf, h_vmem)

This helps the scheduler submit jobs to an appropriate node and also prevents a node from crashing because of excessive memory consumption. However, sometimes a job needs more resources and/or run-time than specified in the drmaa URL configuration. In such cases SGE kills the job and we get an email notification with the appropriate job summary. However, the Galaxy web interface doesn't show any error for such failures, and the job table doesn't contain any related state/info either. The jobs are shown in green boxes, meaning they completed without any failure - when in reality they were killed/aborted by the scheduler. This is really confusing, as the job status indicated by Galaxy is inconsistent with SGE/drmaa.

Has anyone else experienced and/or addressed this issue? Any comments or suggestions will be really helpful.

Thanks,
Shantanu.
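[Editor's note: the information needed to catch this does exist at the DRMAA level - the JobInfo returned by wait() in the drmaa Python library reports whether a job was aborted or killed by a signal, which is how SGE enforces h_rt/h_vmem. A sketch, assuming the caller can still wait on the job id:]

    import drmaa

    def job_outcome(session, job_id):
        """Classify how a job ended using drmaa-python's JobInfo fields."""
        info = session.wait(job_id, drmaa.Session.TIMEOUT_WAIT_FOREVER)
        if info.wasAborted:
            return "aborted before it ran"
        if info.hasSignal:
            # h_rt/h_vmem enforcement typically shows up here as a kill signal
            return "killed by signal %s" % info.terminatedSignal
        if info.hasExited and info.exitStatus != 0:
            return "exited with status %d" % info.exitStatus
        return "finished normally"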
Re: [galaxy-dev] job status when SGE kills/aborts job
Hi Shantanu,

I am also using an SGE cluster and the DRMAA runner for my Galaxy install, and I am having the same issue for jobs that were killed.

How did you define the run-time and memory configurations in your DRMAA URLs? I had to add "-w n" to the DRMAA URLs in order for my jobs to be dispatched to the cluster at all. However, someone said (on another thread) that doing so might hide errors. I am not sure if this is the cause, since my jobs won't be dispatched at all if "-w n" is not in the DRMAA URLs.

Ka Ming

From: galaxy-dev-boun...@lists.bx.psu.edu [galaxy-dev-boun...@lists.bx.psu.edu] On Behalf Of Shantanu Pavgi [pa...@uab.edu]
Sent: July 29, 2011 1:56 PM
To: galaxydev psu
Subject: [galaxy-dev] job status when SGE kills/aborts job

> We are using an SGE cluster with our Galaxy install. We have specified resource and run-time limits for certain tools using tool-specific drmaa URL configuration [...]
>
> Has anyone else experienced and/or addressed this issue? Any comments or suggestions will be really helpful.
>
> Thanks,
> Shantanu.
[galaxy-dev] collect_associated_files and umask
When I run Galaxy as the actual user, using the code I committed to my fork, I run into a problem with dataset_*.dat files that have associated data: the associated data files are copied from the job_working_directory into the files directory. That directory is owned by the actual user and not by the galaxy user. I don't have a problem with the ownership, but the permissions of the directory and the associated files get changed to 777. Any thoughts on why the permissions get changed? The permissions are 755 and 644 for directories and files respectively while in the working directory, so why do they change when the directory is moved?

Any help would be greatly appreciated.

Best,
Ilya

Ilya Chorny Ph.D.
Bioinformatics - Intern
icho...@illumina.com
858-202-4582
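[Editor's note: one hypothesis worth checking - an assumption, not a confirmed diagnosis. If the copy step re-applies permissions from the process umask (Galaxy's util module has a umask_fix_perms helper along these lines), then a umask of 0 in the environment of the copying process would produce exactly the 777 observed:]

    # Illustration only: a perms-from-umask calculation of the form
    # 0777 & ~umask yields 777 when the umask is 000 (e.g. inherited from
    # the job's environment), even though the files were 644/755 while
    # still in the working directory.
    umask = 0o000
    print oct(0o777 & ~umask)  # -> 0777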
Re: [galaxy-dev] job status when SGE kills/aborts job
On Jul 29, 2011, at 4:13 PM, Ka Ming Nip wrote:
> Hi Shantanu,
>
> I am also using an SGE cluster and the DRMAA runner for my Galaxy install, and I am having the same issue for jobs that were killed.
>
> How did you define the run-time and memory configurations in your DRMAA URLs? I had to add "-w n" to the DRMAA URLs in order for my jobs to be dispatched to the cluster at all. However, someone said (on another thread) that doing so might hide errors. I am not sure if this is the cause, since my jobs won't be dispatched at all if "-w n" is not in the DRMAA URLs.
>
> Ka Ming

The drmaa/SGE URL in our configuration looks something like this:

{{{
drmaa://-V -m be -M <notification email address> -l vf=<memory>,h_rt=<hard run-time>,s_rt=<soft run-time>,h_vmem=<memory>/
}}}

We don't use the "-w n" option in our configuration. "-w n" turns off validation of your job script; refer to the qsub manual for details. The -l options (complex configuration options) are documented here: http://linux.die.net/man/5/sge_complex

Hope this helps.

--
Shantanu.
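[Editor's note: a filled-in version of that template, with illustrative values - the email address and limits here are examples, not the poster's real settings:]

{{{
drmaa://-V -m be -M admin@example.org -l vf=4G,h_rt=24:00:00,s_rt=23:30:00,h_vmem=8G/
}}}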
Re: [galaxy-dev] job status when SGE kills/aborts job
On Jul 29, 2011, at 8:03 PM, ambarish biswas wrote:
> Hi, have you tested the option drmaa://-q galaxy -V/ yet? As it suggests, -q galaxy submits to the queue named galaxy; I'm not sure what -V stands for, but it could be verbose.
>
> With Regards,
> Ambarish Biswas
> University of Otago
> Department of Biochemistry, Dunedin, New Zealand
> Tel: +64(22)0855647 Fax: +64(0)3 479 7866

The -V option is not for verbose mode but for exporting your shell environment. Refer to the qsub manual for details: "Specifies that all environment variables active within the qsub utility be exported to the context of the job." We are already using it in our configuration as needed.

I think the problem is with Galaxy (or the drmaa Python library) not parsing the drmaa/SGE messages correctly, rather than with the drmaa URL configuration. Thoughts?

--
Shantanu.

On Sat, Jul 30, 2011 at 9:13 AM, Ka Ming Nip km...@bcgsc.ca wrote:
> Hi Shantanu,
>
> I am also using an SGE cluster and the DRMAA runner for my Galaxy install, and I am having the same issue for jobs that were killed. [...]
[galaxy-dev] exporting environment variables to SGE in galaxy
Hi,

I am setting up SGE in our Galaxy mirror. One problem I have is that I cannot export the environment variables of the specific user running the Galaxy service. On the command line, I would do this with "qsub -V script.sh", or by adding a "#$ -V" line to script.sh.

I tried to change lib/galaxy/jobs/runners/sge.py and lib/galaxy/jobs/runners/drmaa.py by adding a "#$ -V" line under:

    sge_template = """#!/bin/sh
    #$ -S /bin/sh
    """

but this did not help. Any idea is very much appreciated!

Chaolin
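[Editor's note: one possible explanation for the template edit having no effect - hedged, and worth verifying against your runner version: under DRMAA submission the "#$" directives embedded in the script are not necessarily parsed; the submit options come from the job template's nativeSpecification, which is what Galaxy builds from the options in the drmaa:// cluster URL. So "-V" can instead be passed via the URL, e.g. drmaa://-V/. The drmaa-python equivalent of what that does:]

    import drmaa

    s = drmaa.Session()
    s.initialize()
    jt = s.createJobTemplate()
    jt.remoteCommand = "/path/to/galaxy_job_script.sh"  # illustrative path
    jt.nativeSpecification = "-V"  # export the submitting process's environment
    job_id = s.runJob(jt)
    print "submitted", job_id
    s.deleteJobTemplate(jt)
    s.exit()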
[galaxy-dev] Does Galaxy have a tool to run R scripts?
Dear all,

I want to build a Galaxy tool to run an R script. Do you know if there is already such a tool, or something with similar functionality? If you can share it with me, I would very much appreciate your help.

Thank you,
Bo Liu
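[Editor's note: if no existing tool turns up, wrapping a specific R script is straightforward - the command tag's interpreter attribute can name any executable on the PATH, including Rscript. A minimal sketch, where the tool id, file names, and formats are made up:]

    <tool id="run_r_script" name="Run R script">
      <description>Apply my_script.R to a dataset</description>
      <command interpreter="Rscript">my_script.R $input $output</command>
      <inputs>
        <param format="tabular" name="input" type="data" label="Input table"/>
      </inputs>
      <outputs>
        <data format="tabular" name="output"/>
      </outputs>
      <help>
        my_script.R should read its input and output paths from
        commandArgs(trailingOnly=TRUE)[1] and [2] respectively.
      </help>
    </tool>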