Re: [galaxy-dev] [galaxy-user] Amazon EC2: An error occurred running this job: Job output not returned from cluster
I am unfortunately not super sure how I fixed this issue, as I was using some pretty bad troubleshooting techniques and changing a ton of things at once. There are two things I may be able to suggest.

When you create the Amazon instance, there is an option to use optimized EBS storage. It is possible that this option, combined with the retry option (retry_job_output_collection=30) from my first email, solves the issue. If you haven't already, it would be worth trying to add just that extra line, commit the changes, and restart the service afterwards. I noticed that the first two times I tried modifying the ini file, the changes were not committed, so that may have contributed to it not working.

Good luck,
Brian

--
Brian Lin
cont...@brian-lin.com
brian@tufts.edu

On Fri, May 3, 2013 at 4:33 PM, Dave Lin d...@verdematics.com wrote:

Dear Galaxy-Dev,

I was hoping to see if anybody had any suggestions to resolve this error. To summarize, I'm using CloudMan/Amazon EC2. I typically batch analyze 20-100 data sets against a workflow (launched serially using a bioblend script).

When I launch a large number of samples, I consistently see the following error:

An error occurred running this job: Job output not returned from cluster

If I analyze the same data sets/workflow but launch 5 at a time, the analysis proceeds smoothly.

Any pointers would be appreciated.

Thanks,
Dave

On Thu, May 2, 2013 at 1:51 PM, Dave Lin d...@verdematics.com wrote:

I am getting errors similar to the ones Brian reported back in March. (Note: we appear to have the same last name, but we are not related.)

An error occurred with this dataset: *Job output not returned from cluster*

- Running on CloudMan with 5-6 nodes (xlarge).
- The error seems to occur consistently when I launch multiple workflows in batch (using bioblend).
- Probably not relevant, but it is failing on a BWA step.
- I am able to successfully run the same workflow against one of the datasets that failed in batch.
- Change-set is from Feb 8, 2013: 8794:1c7174911392, stable branch.
Prior to that, I was running different Galaxy instances using changesets from last year and never ran into this problem.
- I'm seeing errors like:
  galaxy.jobs.runners.drmaa WARNING 2013-05-02 17:07:51,991 Job output not returned from cluster: [Errno 2] No such file or directory: '/mnt/galaxyData/tmp/job_working_directory/002/2066/2066.drmec'
- In this example, neither the /mnt/galaxyData/tmp/job_working_directory/002/2066 folder nor the /mnt/galaxyData/tmp/job_working_directory/002/2066/2066.drmec file exists.

Any suggestions? This seems like it might be some type of resource contention issue, but I'm not sure where to investigate next.

Thanks in advance,
Dave

On Mon, Mar 11, 2013 at 9:04 AM, Brian Lin brian@tufts.edu wrote:

Hi guys,

I'm running a Galaxy CloudMan instance and running the usual tophat-cufflinks-cuffdiff workflow on RNAseq data. I am using an m2.4xlarge as the master node, autoscaling from 0-4 workers of the m2.xlarge type.

I have gotten the error:

An error occurred running this job: *Job output not returned from cluster*

when running fasta groomer, tophat, and now cufflinks. Following up on troubleshooting from other people on the mailing list, I have added a new line to universe_wsgi.ini:

retry_job_output_collection=30

Unfortunately, this does not seem to have fixed the problem. The stdout is blank, and stderr gives:

Job output not returned from cluster

Under manage jobs in the admin panel, 4 of the 6 jobs are listed as currently running. What is confusing is that, of the 4 running, one has already returned the error in the user dataset panel and yet is still listed as running.
From the SGE log, I see these errors:

03/11/2013 14:32:52|worker|ip-10-159-47-223|W|job 42.1 failed on host ip-10-30-130-84.ec2.internal before writing exit_status because: shepherd exited with exit status 19: before writing exit_status
03/11/2013 14:32:52|worker|ip-10-159-47-223|W|job 43.1 failed on host ip-10-30-130-84.ec2.internal before writing exit_status because: shepherd exited with exit status 19: before writing exit_status
03/11/2013 14:39:41|worker|ip-10-159-47-223|E|adminhost ip-10-30-130-84.ec2.internal already exists
03/11/2013 14:40:07|worker|ip-10-159-47-223|E|exechost ip-10-30-130-84.ec2.internal already exists
03/11/2013 14:50:54|worker|ip-10-159-47-223|E|adminhost ip-10-30-130-84.ec2.internal already exists
03/11/2013 14:50:55|worker|ip-10-159-47-223|E|exechost ip-10-30-130-84.ec2.internal already exists

Does anyone have any idea how to solve this error? It has completely removed my ability to use workflows, and I still have not been able to run a single analysis to completion because of it.

Thanks for any insight anyone can provide!
Brian

--
Brian Lin
cont...@brian-lin.com
brian@tufts.edu

___
The Galaxy User list should be used for the discussion of Galaxy
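Since Brian mentions that his first edits to the ini file were not actually saved, it may help to confirm that the setting is really present before restarting Galaxy. The following is only a minimal sketch: it assumes the option lives in the [app:main] section of universe_wsgi.ini, and the sample file content and the helper name read_retry_setting are made up for illustration. Only the option name and the value 30 come from the thread.

```python
import configparser

# Hypothetical excerpt of universe_wsgi.ini. Only the option name and the
# value 30 come from the thread; the section name is an assumption.
SAMPLE_INI = """\
[app:main]
retry_job_output_collection = 30
"""


def read_retry_setting(ini_text, section="app:main"):
    """Return retry_job_output_collection as an int, or None if it is absent."""
    # interpolation=None stops values containing '%' from raising errors.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_section(section):
        return None
    value = parser.get(section, "retry_job_output_collection", fallback=None)
    return int(value) if value is not None else None


if __name__ == "__main__":
    # Prints the configured retry count, or None if the line was never saved.
    print(read_retry_setting(SAMPLE_INI))
```

To check a real installation, the file contents could be read from disk and passed in place of SAMPLE_INI. A result of None would suggest the edit was never written to the file.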
Re: [galaxy-dev] Running Galaxy through Apache
Jeff,

Did you ever get this to work?

-Adam

--
Adam Brenner
Computer Science, Undergraduate Student
Donald Bren School of Information and Computer Sciences

Research Computing Support
Office of Information Technology
http://www.oit.uci.edu/rcs/

University of California, Irvine
www.ics.uci.edu/~aebrenne/
aebre...@uci.edu

On Wed, May 1, 2013 at 1:52 PM, Jeffrey Long jlo...@ualberta.ca wrote:

In the Apache access.log, a single access link results in a great many output lines; the first two look like this:

128.233.109.35 - - [01/May/2013:14:31:29 -0600] GET /galaxy HTTP/1.1 200 5058 - Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/536.29.13 (KHTML, like Gecko) Version/6.0.4 Safari/536.29.13
128.233.109.35 - - [01/May/2013:14:31:29 -0600] GET /galaxy/static/style/base.css?v=1367440234 HTTP/1.1 404 495 http://trove.usask.ca/galaxy; Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/536.29.13 (KHTML, like Gecko) Version/6.0.4 Safari/536.29.13

If I'm reading this right, the second one involves a 404 error. Indeed, I don't see any static/style directory in my galaxy location... should there be one?

I don't get anything new in Apache's error.log as a result of a single access attempt.

-Jeff

On Wed, May 1, 2013 at 6:16 AM, Nate Coraor n...@bx.psu.edu wrote:

On Apr 30, 2013, at 5:15 PM, Jeffrey Long wrote:

1) Any idea which proxy settings in the universe_wsgi file need to be set other than the ones on the website? Those are the only ones I changed. I scrolled down to the 'Advanced Proxy' settings but didn't see anything that looked relevant to me. Probably I should have sent this earlier, but this is the traceback I get; it looks like there's a URL somewhere that's not getting passed along or re-assembled properly.
Exception happened during processing of request from ('127.0.0.1', 47424)
Traceback (most recent call last):
  File "/mnt2/birl/MAVEN/bin/galaxy-dist/eggs/Paste-1.6-py2.7.egg/paste/httpserver.py", line 1053, in process_request_in_thread
    self.finish_request(request, client_address)
  File "/usr/lib/python2.7/SocketServer.py", line 323, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/lib/python2.7/SocketServer.py", line 638, in __init__
    self.handle()
  File "/mnt2/birl/MAVEN/bin/galaxy-dist/eggs/Paste-1.6-py2.7.egg/paste/httpserver.py", line 432, in handle
    BaseHTTPRequestHandler.handle(self)
  File "/usr/lib/python2.7/BaseHTTPServer.py", line 340, in handle
    self.handle_one_request()
  File "/mnt2/birl/MAVEN/bin/galaxy-dist/eggs/Paste-1.6-py2.7.egg/paste/httpserver.py", line 427, in handle_one_request
    self.wsgi_execute()
  File "/mnt2/birl/MAVEN/bin/galaxy-dist/eggs/Paste-1.6-py2.7.egg/paste/httpserver.py", line 287, in wsgi_execute
    self.wsgi_start_response)
  File "/mnt2/birl/MAVEN/bin/galaxy-dist/eggs/PasteDeploy-1.3.3-py2.7.egg/paste/deploy/config.py", line 285, in __call__
    return self.app(environ, start_response)
  File "/mnt2/birl/MAVEN/bin/galaxy-dist/eggs/Paste-1.6-py2.7.egg/paste/urlmap.py", line 193, in __call__
    path_info = self.normalize_url(path_info, False)[1]
  File "/mnt2/birl/MAVEN/bin/galaxy-dist/eggs/Paste-1.6-py2.7.egg/paste/urlmap.py", line 117, in normalize_url
    or self.domain_url_re.search(url)), "URL fragments must start with / or http:// (you gave %r)" % url
AssertionError: URL fragments must start with / or http:// (you gave '.')

Hi Jeff,

Does the Apache log give any indication of what path is being accessed? You may also want to enable mod_rewrite's debug logging to make sure it's behaving correctly.

2) Are the static subdirectories strictly necessary, or can I worry about one problem at a time? I've tried both commenting out the proxy lines (in apache2.conf) and leaving them in, and I get the same behaviour.
You can exclude the static page rewrites for the purposes of getting it working. Galaxy's internal HTTP server will serve the static pages itself.

--nate

Thanks, I'll try to remember to 'reply-all' here!

-Jeff

On Mon, Apr 29, 2013 at 1:23 PM, Adam Brenner aebre...@uci.edu wrote:

Hi Jeff,

I think this is related to the proxy settings in Galaxy (universe_wsgi.ini). What is happening is that, if you look at the source code for your Galaxy website, it is trying to load CSS/images from:

http://trove.usask.ca/galaxy/static/style/base.css?v=1367254127

But if you visit that link, it returns a 404 Not Found. Make sure the proxy settings in your universe_wsgi.ini are set correctly. If they are, it could be that you need to set up proxies for the subdirectories as well. If you look at the nginx guide here:

http://wiki.galaxyproject.org/Admin/Config/Performance/nginx%20Proxy

you will notice that we provide an alias for /static/*. In terms of doing this in Apache, you may need to add more proxy pass lines that correspond to the subdirectories in
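When a single page load produces many access.log lines, as Jeff describes, it can be quicker to filter for the failing requests than to read them by eye. The sketch below is one possible way to list the paths answered with a 404. It assumes Apache's usual Combined Log Format with the request wrapped in double quotes (the archived mail appears to have lost those quotes), and the name find_not_found is invented for this example.

```python
import re

# Leading fields of Apache's Combined Log Format. The double quotes around
# the request are an assumption; the archived mail shows the lines without them.
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)

# The two requests from the thread, with quotes restored and the referer
# and user-agent fields left off for brevity.
SAMPLE_LINES = [
    '128.233.109.35 - - [01/May/2013:14:31:29 -0600] '
    '"GET /galaxy HTTP/1.1" 200 5058',
    '128.233.109.35 - - [01/May/2013:14:31:29 -0600] '
    '"GET /galaxy/static/style/base.css?v=1367440234 HTTP/1.1" 404 495',
]


def find_not_found(lines):
    """Return the request paths that were answered with HTTP 404."""
    missing = []
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match and match.group("status") == "404":
            missing.append(match.group("path"))
    return missing


if __name__ == "__main__":
    for path in find_not_found(SAMPLE_LINES):
        print(path)
```

Lines that do not fit the pattern are skipped rather than treated as errors, so the same function can be pointed at a real log file by passing the open file object as lines.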
Re: [galaxy-dev] Running galaxy using Apache as proxy
Mike,

Looking at the configuration, you are not proxying anything. Take a look at my previous post here and see if that helps:

http://dev.list.galaxyproject.org/Running-Galaxy-through-Apache-td4659452.html

Let us know how it goes! (I am planning on submitting a rewrite of the Apache Proxy page, as it's missing information. Feedback is appreciated.)

-Adam

--
Adam Brenner
Computer Science, Undergraduate Student
Donald Bren School of Information and Computer Sciences

Research Computing Support
Office of Information Technology
http://www.oit.uci.edu/rcs/

University of California, Irvine
www.ics.uci.edu/~aebrenne/
aebre...@uci.edu

On Fri, May 3, 2013 at 11:25 AM, Michael Place mpl...@wisc.edu wrote:

Hello,

I am trying to run Galaxy using Apache as the proxy server. This is intended to be internal to our lab. I have Galaxy running and working. The problem is that the pages load slowly and then seem to time out after 10 minutes or so. If I log in and click get data, go away for a few minutes, and then come back and click a tool, I get:

The connection to the server was reset while the page was loading.

If I retry, the page loads. This makes the site very clunky.

httpd.conf changes:

<IfModule mod_proxy.c>
ProxyRequests On
<Proxy *>
    Order deny,allow
    Deny from all
    Allow from *
</Proxy>
<VirtualHost *:8080>
    ServerName 192.168.0.240
    RewriteEngine on
    RewriteRule ^/galaxy$ /galaxy/ [R]
    RewriteRule ^/galaxy/static/style/(.*) /opt/galaxy-dist/static/june_2007_style/blue/$1 [L]
    RewriteRule ^/galaxy/static/scripts/(.*) /opt/galaxy-dist/static/scripts/packed/$1 [L]
    RewriteRule ^/galaxy/static/(.*) /opt/galaxy-dist/galaxy-dist/static/$1 [L]
    RewriteRule ^/galaxy/favicon.ico /opt/galaxy-dist/static/favicon.ico [L]
    RewriteRule ^/galaxy/robots.txt /opt/galaxy-dist/static/robots.txt [L]
    RewriteRule ^/galaxy(.*) http://localhost:8080$1 [P]
</VirtualHost>

I have attached the universe_wsgi.ini and my httpd.conf. I am running CentOS.
Thank you,
Mike

___
Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
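One way to reason about the RewriteRule list in Mike's httpd.conf is to trace where a few request paths would end up. The sketch below is a deliberately simplified model, not mod_rewrite's real semantics: it assumes processing stops at the first matching rule whatever the flag, and it only handles the $1 back-reference. The rules are copied from the excerpt in the thread; the function name resolve is made up for illustration.

```python
import re

# Rules copied from the httpd.conf excerpt in the thread:
# (pattern, substitution, flag).
RULES = [
    (r"^/galaxy$", "/galaxy/", "R"),
    (r"^/galaxy/static/style/(.*)",
     "/opt/galaxy-dist/static/june_2007_style/blue/$1", "L"),
    (r"^/galaxy/static/scripts/(.*)",
     "/opt/galaxy-dist/static/scripts/packed/$1", "L"),
    (r"^/galaxy/static/(.*)", "/opt/galaxy-dist/galaxy-dist/static/$1", "L"),
    (r"^/galaxy/favicon.ico", "/opt/galaxy-dist/static/favicon.ico", "L"),
    (r"^/galaxy/robots.txt", "/opt/galaxy-dist/static/robots.txt", "L"),
    (r"^/galaxy(.*)", "http://localhost:8080$1", "P"),
]


def resolve(path):
    """Return (target, flag) for the first matching rule, else (path, None).

    Simplification: every flag is treated as final, which is an assumption
    of this sketch and not how Apache evaluates [R] in general.
    """
    for pattern, substitution, flag in RULES:
        match = re.match(pattern, path)
        if match:
            groups = match.groups()
            # Only the first capture group ($1) is supported in this sketch.
            target = substitution.replace("$1", groups[0]) if groups else substitution
            return target, flag
    return path, None


if __name__ == "__main__":
    for request in ("/galaxy", "/galaxy/static/style/base.css", "/galaxy/root"):
        print(request, "->", resolve(request))
```

Tracing paths this way makes two details of the excerpt easy to spot, and both may be worth double-checking: the generic /galaxy/static/ rule points at /opt/galaxy-dist/galaxy-dist/static/ while the other rules use /opt/galaxy-dist/static/, and the proxy target http://localhost:8080 uses the same port as the VirtualHost itself.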