Re: [galaxy-dev] PBS_Python Unable to submit jobs
Hi Carrie,

I've had the same problem. I wanted to get Galaxy to submit to a cluster running Torque 4.x, and Torque clients need to be 4.x to work with that version of the server. I spent a bit of time looking into this and determined that the pbs_python used by Galaxy is not compatible with Torque 4.x; a new version would need to be built.

At that stage I investigated using the DRMAA runner to talk to the Torque 4.x server. That did work, provided I built the Torque clients with the server name hard-coded (--with-default-server). What the DRMAA runner didn't do was data staging, which the PBS runner does, so I started working on some code for that. I'm now looking at giving up on data staging by moving the Galaxy instance onto the cluster. Sorry I couldn't be of more help.

I would be interested in comments from the Galaxy developers about whether the PBS runner will be supported in the future and, hence, whether Torque 4.x will be supported. I'm also interested in whether the DRMAA runner will support data staging, or whether Galaxy instances really need to share file systems with a cluster.

Regards,
Steve McMahon
Solutions architect / senior systems administrator
ASC Cluster Services
Information Management Technology (IMT)
CSIRO
Phone: +61-2-62142968 | Mobile: +61-4-00779318
steve.mcma...@csiro.au | www.csiro.au
PO Box 225, DICKSON ACT 2602
1 Wilf Crane Crescent, Yarralumla ACT 2600

From: galaxy-dev-boun...@lists.bx.psu.edu [mailto:galaxy-dev-boun...@lists.bx.psu.edu] On Behalf Of Ganote, Carrie L
Sent: Friday, 5 April 2013 4:52 AM
To: galaxy-...@bx.psu.edu
Subject: [galaxy-dev] PBS_Python Unable to submit jobs

Hi Galaxy dev,

My setup is a bit non-standard, but I'm getting the following error:

galaxy.jobs.runners.pbs WARNING 2013-04-04 13:24:00,590 (75) pbs_submit failed (try 1/5), PBS error 15044: Resources temporarily unavailable

Here is my setup: Torque 3 is installed in /usr/local/bin and I can use it to connect to the (default) server1. Torque 4 is installed in /N/soft/ and I can use it to connect to server2. I'm running trqauthd, so Torque 4 should work. I can submit jobs to both servers from the command line. For server2, I specify the path to qsub and the server name (-q batch@server2). In Galaxy, I used torquelib_dir=/N/soft to scramble pbs_python. My path points at /N/soft first, so 'which qsub' returns the Torque 4 binary.

If I just use pbs:///, it will submit a job to server1 (which shouldn't work, because /N/soft/qsub doesn't work from the command line, since the default server1 is running Torque 3). If I use pbs://-l vmem=100mb,walltime=00:30:00/, it won't work (the server string in pbs.py becomes "-l vmem=100mb,walltime=00:30:00" instead of server1). If I use pbs://server2/, I get the "Resources temporarily unavailable" error above. The server string is server2, and I put the following in pbs.py:

whichq = os.popen("which qsub").read()
stats = os.popen("qstat @server2").read()

These return the correct values for server2 using the correct Torque version (4). I'm stumped as to why this is not making the connection. It's probably something about the Python implementation I'm overlooking.

Thanks for any advice,
Carrie Ganote
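As a side note on the compatibility question above, a quick way to check whether a particular pbs_python build can reach a given Torque server at all, independently of Galaxy's runner code, is a short standalone script against the pbs module. The sketch below is illustrative only and rests on assumptions: the same scrambled pbs_python egg that Galaxy loads must be importable as pbs, and "server2" is a placeholder for the real server name.

# Minimal pbs_python connectivity check, run outside of Galaxy.
# Assumptions: the pbs_python egg Galaxy loads is importable as `pbs`,
# and "server2" is a placeholder for the real Torque server name.
import pbs

server = "server2"   # or pbs.pbs_default() for the locally configured default

conn = pbs.pbs_connect(server)
if conn <= 0:
    # pbs.error() returns the library's last (errno, message) pair
    errno, text = pbs.error()
    print("pbs_connect to %s failed: %s (%s)" % (server, errno, text))
else:
    print("Connected to %s (connection id %s)" % (server, conn))
    pbs.pbs_disconnect(conn)

If pbs_connect already fails here against a Torque 4.x server, the incompatibility is in the library itself rather than in Galaxy's PBS runner.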
Re: [galaxy-dev] Tool unit tests using composite datatypes
On Thu, Apr 4, 2013 at 7:19 PM, Daniel Blankenberg d...@bx.psu.edu wrote:
> Hi Peter,
>
> What is the test error given when you do have a value defined for name in output? Can you try using 'empty_file.dat'? e.g.
>
> <output name="out_file" file="empty_file.dat" />
>
> or
>
> <output name="out_file" file="empty_file.dat" compare="contains" />
>
> etc.

Hi Daniel,

That seems to help (plus fixing a typo in one of my child file extensions). However, there is something else amiss, though my Galaxy is a little out of date:

begin captured logging
galaxy.web.framework: DEBUG: Error: this request returned None from get_history(): http://localhost:9486/
galaxy.web.framework: DEBUG: Error: this request returned None from get_history(): http://localhost:9486/
galaxy.web.framework: DEBUG: Error: this request returned None from get_history(): http://localhost:9486/user/logout
galaxy.web.framework: DEBUG: Error: this request returned None from get_history(): http://localhost:9486/
galaxy.tools.actions.upload_common: INFO: tool upload1 created job id 1
galaxy.jobs.manager: DEBUG: (1) Job assigned to handler 'main'
galaxy.jobs: DEBUG: (1) Working directory for job is: /mnt/galaxy/galaxy-central/database/job_working_directory/000/1
galaxy.jobs.handler: DEBUG: dispatching job 1 to local runner
galaxy.jobs.handler: INFO: (1) Job dispatched
galaxy.jobs.runners.local: DEBUG: Local runner: starting job 1
galaxy.jobs.runners.local: DEBUG: executing: python /mnt/galaxy/galaxy-central/tools/data_source/upload.py /mnt/galaxy/galaxy-central /tmp/tmpOBsw3s/database/tmp/tmpshAqc4 /tmp/tmpOBsw3s/database/tmp/tmpjPyydZ 1:/mnt/galaxy/galaxy-central/database/job_working_directory/000/1/dataset_1_files:/tmp/tmpOBsw3s/database/files/000/dataset_1.dat
galaxy.jobs.runners.local: DEBUG: execution finished: python /mnt/galaxy/galaxy-central/tools/data_source/upload.py /mnt/galaxy/galaxy-central /tmp/tmpOBsw3s/database/tmp/tmpshAqc4 /tmp/tmpOBsw3s/database/tmp/tmpjPyydZ 1:/mnt/galaxy/galaxy-central/database/job_working_directory/000/1/dataset_1_files:/tmp/tmpOBsw3s/database/files/000/dataset_1.dat
galaxy.jobs: DEBUG: Tool did not define exit code or stdio handling; checking stderr for success
galaxy.jobs: DEBUG: job 1 ended
galaxy.jobs.manager: DEBUG: (2) Job assigned to handler 'main'
galaxy.jobs: DEBUG: (2) Working directory for job is: /mnt/galaxy/galaxy-central/database/job_working_directory/000/2
galaxy.jobs.handler: DEBUG: dispatching job 2 to local runner
galaxy.jobs.handler: INFO: (2) Job dispatched
galaxy.jobs.runners.local: DEBUG: Local runner: starting job 2
galaxy.jobs.runners.local: DEBUG: executing: makeblastdb -version /tmp/tmpOBsw3s/database/tmp/GALAXY_VERSION_STRING_2; makeblastdb -out /tmp/tmpOBsw3s/database/files/000/dataset_2_files/blastdb -in /tmp/tmpOBsw3s/database/files/000/dataset_1.dat /tmp/tmpOBsw3s/database/files/000/dataset_1.dat -title Just 4 human proteins -dbtype prot
galaxy.jobs.runners.local: DEBUG: execution finished: makeblastdb -version /tmp/tmpOBsw3s/database/tmp/GALAXY_VERSION_STRING_2; makeblastdb -out /tmp/tmpOBsw3s/database/files/000/dataset_2_files/blastdb -in /tmp/tmpOBsw3s/database/files/000/dataset_1.dat /tmp/tmpOBsw3s/database/files/000/dataset_1.dat -title Just 4 human proteins -dbtype prot
galaxy.tools: DEBUG: Error opening galaxy.json file: [Errno 2] No such file or directory: '/mnt/galaxy/galaxy-central/database/job_working_directory/000/2/galaxy.json'
galaxy.jobs: DEBUG: job 2 ended
base.twilltestcase: INFO: ## files diff /mnt/galaxy/galaxy-central/test-data/four_human_proteins.fasta.phd (51 bytes, 4 lines) and /tmp/tmpOBsw3s/database/tmp/tmpPaxnAGblastdb.phd (33 bytes, 1 lines)
base.twilltestcase: INFO: ## file /mnt/galaxy/galaxy-central/test-data/four_human_proteins.fasta.phd line 1 is '718449\x022\n'
base.twilltestcase: INFO: ## file /mnt/galaxy/galaxy-central/test-data/four_human_proteins.fasta.phd line 2 is '2924903341\x020\n'
base.twilltestcase: INFO: ## file /mnt/galaxy/galaxy-central/test-data/four_human_proteins.fasta.phd line 3 is '3666588750\x021\n'
base.twilltestcase: INFO: ## file /mnt/galaxy/galaxy-central/test-data/four_human_proteins.fasta.phd line 4 is '539247318\x023\n'
base.twilltestcase: INFO: ## file /tmp/tmpOBsw3s/database/tmp/tmpPaxnAGblastdb.phd line 1 is 'This is a BLAST protein database.'
base.twilltestcase: INFO: ## sibling files to /tmp/tmpOBsw3s/database/tmp/tmpPaxnAGblastdb.phd in same directory:
base.twilltestcase: INFO: ## sibling file: tmpUMWHqD
base.twilltestcase: INFO: ## sibling file: twilltestcase-MDnhZ1.html
base.twilltestcase: INFO: ## sibling file: twilltestcase-_vAVc9.html
base.twilltestcase: INFO: ## sibling file: tmpyiy0Kvempty_file.dat
base.twilltestcase: INFO: ## sibling file: twilltestcase-MHWm9w.html
base.twilltestcase: INFO: ## sibling file: tmpPaxnAGblastdb.phd
base.twilltestcase: INFO: ## sibling file: twilltestcase-dVKyqw.html
base.twilltestcase: INFO: ## sibling file: tmpshAqc4
[galaxy-dev] core dump after trying to view data
Hello Galaxy Buddies,

I am having a problem with run.sh core dumping. It happens when I try to view data (click on the eye) for a large text file of results. It doesn't happen when viewing data of a smaller size, and it executes jobs fine. Here is the server's output:

Starting server in PID 30831.
serving on http://127.0.0.1:8000
127.0.0.1 - - [05/Apr/2013:15:08:26 -0400] GET / HTTP/1.1 200 - - Mozilla/5.0 (X11; Linux x86_64; rv:10.0.11) Gecko/20121116 Firefox/10.0.11
127.0.0.1 - - [05/Apr/2013:15:08:26 -0400] GET /root/tool_menu HTTP/1.1 200 - http://localhost:8000/ Mozilla/5.0 (X11; Linux x86_64; rv:10.0.11) Gecko/20121116 Firefox/10.0.11
127.0.0.1 - - [05/Apr/2013:15:08:26 -0400] GET /history HTTP/1.1 200 - http://localhost:8000/ Mozilla/5.0 (X11; Linux x86_64; rv:10.0.11) Gecko/20121116 Firefox/10.0.11
127.0.0.1 - - [05/Apr/2013:15:08:27 -0400] GET /history/get_display_application_links HTTP/1.1 200 - http://localhost:8000/history Mozilla/5.0 (X11; Linux x86_64; rv:10.0.11) Gecko/20121116 Firefox/10.0.11
127.0.0.1 - - [05/Apr/2013:15:08:30 -0400] GET /datasets/1e8ab44153008be8/display/?preview=True HTTP/1.1 200 - http://localhost:8000/history Mozilla/5.0 (X11; Linux x86_64; rv:10.0.11) Gecko/20121116 Firefox/10.0.11
run.sh: line 78: 30831 Segmentation fault (core dumped) python ./scripts/paster.py serve universe_wsgi.ini $@
[jje16@rgs06 galaxy-dist]$

I am running Galaxy on Red Hat Linux under the most recent stable build. If it matters, it submits jobs to a cluster using the drmaa job runner.

[jje16@rgs06 galaxy-dist]$ hg summary
parent: 9232:9264cf7148c0 Added tag release_2013.04.01 for changeset 75f09617abaa
branch: stable
commit: 85 unknown (clean)
update: (current)
[jje16@rgs06 galaxy-dist]$

Any ideas on where to look?

Thanks much,
Jason

Jason Evans
jason.j.ev...@gmail.com
Re: [galaxy-dev] GalaxyAdmins Group: Future Directions?
Hello,

Based on my observations and interactions during the past few meet-ups, and as an administrator of Galaxy, the current goals of the group [1] remain relevant and are being met through the meet-ups. An additional benefit that has evolved out of these meet-ups is an update from a member of the Galaxy team on upcoming features and other developments.

With regard to the future of the group, my suggestion is to keep it going and work towards building a full-fledged user group with additional opportunities for sharing, learning and collaborating. This will strengthen the community, possibly drive increased adoption, and further serve to guide the development of Galaxy. Appointing one or more leaders (one each from the Galaxy team and user community) to identify speakers, drive the agenda, set a cadence for the meet-ups and drive the formation of a user group would be helpful, IMHO.

Thanks!

Notes:
[1] Build a community, learn from each other

From: Dave Clements cleme...@galaxyproject.org
Date: Monday, April 1, 2013 11:07 AM
To: Galaxy Dev List galaxy-...@bx.psu.edu
Subject: [galaxy-dev] GalaxyAdmins Group: Future Directions?

Hello all,

The GalaxyAdmins group is coming up on its one-year anniversary (coinciding with GCC2013), and this is a good opportunity to discuss what the future of the group should be. Some starting topics for discussion are on the GalaxyAdmins Future Directions page (http://wiki.galaxyproject.org/Community/GalaxyAdmins/Future). These include:

* What should the group's goals and activities be?
* What type of leadership structure should the group have, and how should it be selected?

The discussion, however, is wide open to any topic relevant to the group. If you have any opinions or suggestions, please reply to the group. Anyone with an interest in the group is encouraged to post. Once the discussion settles, I will summarize it on the wiki page and suggest an action plan for making those things happen.

Thanks,
Dave C.

--
http://galaxyproject.org/GCC2013
http://galaxyproject.org/
http://getgalaxy.org/
http://usegalaxy.org/
http://wiki.galaxyproject.org/
Re: [galaxy-dev] PBS_Python Unable to submit jobs
Hi Steve,

Apologies, I didn't check the Galaxy list before sending you an email. I came to mostly the same conclusion. I installed the Torque 4.x client on the submit node and I can submit jobs that way through the command line without issue. I can't get pbs_submit to work from pbs_python, however. It seems like it has to be something in the way SWIG is translating the C code, or the Python library is somehow not working with trqauthd over localhost:15005, or some other mysterious error.

DRMAA was my first choice, but our server was configured without --enable-drmaa, so I haven't been able to submit to it that way either. We've used PBS before, so I thought it was a pretty safe backup plan! Luckily, I don't have to do staging; we mounted our shared filesystem onto the VM running Galaxy. You might look into Lustre if you have any ability to control that. I highly recommend bribing your sysadmins with beer.

I do hope there will be continued work done to address this issue, not because I have anything against DRMAA, but because I suspect the error stems from false assumptions somewhere in the code that would do well to be fixed.

Thanks for your help!

Carrie Ganote
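Since the failure above is in pbs_submit itself rather than in qsub, a bare-bones submission through pbs_python outside of Galaxy may help isolate it. This is only a rough sketch under assumptions: pbs is the same scrambled pbs_python egg that Galaxy imports, and "server2", /tmp/pbs_test.sh and the output paths are placeholders for site-specific values; the attribute handling loosely mirrors what Galaxy's pbs.py runner does.

# Bare-bones pbs_submit test through pbs_python, outside of Galaxy.
# "server2", /tmp/pbs_test.sh and the output paths are placeholders;
# substitute values that match your site before running.
import pbs

server = "server2"

conn = pbs.pbs_connect(server)
if conn <= 0:
    errno, text = pbs.error()
    raise SystemExit("pbs_connect failed: %s (%s)" % (errno, text))

# Two attributes: where stdout and stderr should land.
attrs = pbs.new_attropl(2)
attrs[0].name = pbs.ATTR_o
attrs[0].value = "/tmp/pbs_test.out"
attrs[1].name = pbs.ATTR_e
attrs[1].value = "/tmp/pbs_test.err"

# /tmp/pbs_test.sh is a placeholder job script (e.g. a one-line "hostname").
job_id = pbs.pbs_submit(conn, attrs, "/tmp/pbs_test.sh", None, None)
if not job_id:
    errno, text = pbs.error()
    print("pbs_submit failed with PBS error %s: %s" % (errno, text))
else:
    print("Submitted job %s" % job_id)

pbs.pbs_disconnect(conn)

If this reproduces PBS error 15044 on its own, the problem lies in the pbs_python/Torque 4.x combination rather than in Galaxy's runner code.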
[galaxy-dev] Running two Galaxy instances on the same Torque cluster
Hello,

I have a question about running two Galaxy instances on separate hosts on the same Torque cluster. For various reasons, including some recent changes and/or removal of certain features (I am told BLAST was affected) in the newer versions of Galaxy, I would like to keep our current older version of Galaxy running while creating a separate Torque submit host to run the latest version of Galaxy on it. I do not think that will pose any issues for Torque, since it will just see the new host as another submit host for jobs, but I would like to know if this would cause any unforeseen issues for either of the Galaxy instances.

They will both mount and store their data on the same network filesystem, but I will naturally have to create two separate directory trees for their /basedir/database/ files, pbs, job_working_directory, etc. paths. I am planning on making the local user the same on both submit nodes ('galaxy' - we are not using LDAP on that cluster, although we may in the future). Will that cause any strange issues, such as jobs being reported back to the wrong Galaxy instance? Will IP address or DNS name be a factor? Additionally, I hope there will not be an issue with the two instances both pointing to the same FTP upload directory.

The idea seems sound in my head, but I want to make sure I'm not overlooking any critical considerations. Any suggestions or insights would be appreciated.

Thanks,
Josh Nielsen
HudsonAlpha Institute for Biotechnology