[galaxy-dev] Manual modifications to integrated_tool_panel.xml?
Hi, Galaxy Developers,

I have what I believe to be a basic question about making modifications to integrated_tool_panel.xml (and/or tool_conf.xml). Our Galaxy deployment has some custom additions made to these files (I did not make these modifications, so I am not 100% certain at this point whether the modifications to integrated_tool_panel.xml were manual or propagated from elsewhere), and everything seems to work alright, although I do have a problem that I am having a difficult time resolving.

Before I ask my question, I wanted to confirm my understanding of the wiki page http://wiki.galaxyproject.org/GalaxyToolPanel. I was hoping that somebody could validate the following statements:

1) The only thing that a human being should really ever change in integrated_tool_panel.xml is the ordering of the groupings (I believe I read on the developer mailing list that it is acceptable to delete sections from here as well, although I am not attempting to validate this at this time).

2) That, for all intents and purposes, integrated_tool_panel.xml should be generated automatically from the files specified in the tool_config_file configuration directive in universe_wsgi.ini.

3) That it is perfectly acceptable to make manual modifications to the XML files supplied in the tool_config_file configuration directive in universe_wsgi.ini, or to add any number of additional XML files to this list (presumably I wouldn't want to make too many manual changes to the shed_tool_conf.xml file; I'm referring to the other XML files, not this one).

The reason I am asking this question is a fairly simple one: every now and then, it appears that new tools are released when new versions of Galaxy are released (i.e. a slightly different default toolset is offered, and thus tool panel and/or tool configuration files are generated when the new version of Galaxy is started, at least if I do a fresh install).

It appears to me, based on some very basic testing, that once a manual modification to tool_conf.xml is made, these files remain as-is the next time a pull for an update is run and Galaxy is restarted. Basically, what appears to be happening is that we are running a new version of Galaxy with what seems to be an out-of-date integrated_tool_panel.xml, at least in terms of the tools that are configured as part of the default install. This, in a nutshell, is the fundamental problem I am currently dealing with.

Here is what I'm thinking is the correct way to solve my problem:

1) Since we are manually modifying tool_conf.xml, it seems to me that we are shooting ourselves in the foot in terms of our ability to delete this file and have Galaxy update it when we do an upgrade. Based on what I know, I feel that what we should do is take all of our custom tool modifications and put them in a separate file called uofc_custom_tools.xml, and then add that file to tool_config_file to separate our custom tool configurations from the default tool_conf.xml.

2) When we upgrade Galaxy, delete tool_conf.xml, and when Galaxy starts, let it replace this file with the default from the current changeset.

3) As long as we don't need to do any manual reordering of elements in integrated_tool_panel.xml, just delete integrated_tool_panel.xml and allow it to be automatically regenerated from tool_conf.xml and uofc_custom_tools.xml.

Does this solution sound reasonable? I would be very grateful for any insight anybody could provide on the best way to address this problem.

I wish you a wonderful day.

Dan Sullivan
___
Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
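One way to check whether an integrated_tool_panel.xml has fallen behind the tool configuration files, as described in the message above, is to compare the section ids each file declares. The following is only a rough sketch: the three XML documents are invented samples (the tool paths and the uofc_custom section are hypothetical), and it assumes the toolbox/section/tool layout with an id attribute on each section. Real files should be read from disk instead of inline strings.

```python
import xml.etree.ElementTree as ET


def section_ids(xml_text):
    """Return the set of <section id="..."> values found in a toolbox document."""
    root = ET.fromstring(xml_text)
    return {s.get("id") for s in root.iter("section") if s.get("id")}


def missing_sections(conf_texts, panel_text):
    """List section ids declared in any tool conf but absent from the panel file."""
    wanted = set()
    for text in conf_texts:
        wanted |= section_ids(text)
    return sorted(wanted - section_ids(panel_text))


# Invented sample standing in for the default tool_conf.xml.
DEFAULT_CONF = """<toolbox>
  <section id="getext" name="Get Data">
    <tool file="data_source/upload.xml"/>
  </section>
  <section id="textutil" name="Text Manipulation">
    <tool file="filters/cutWrapper.xml"/>
  </section>
</toolbox>"""

# Invented sample standing in for a separate uofc_custom_tools.xml.
CUSTOM_CONF = """<toolbox>
  <section id="uofc_custom" name="U of C Custom Tools">
    <tool file="uofc/example_tool.xml"/>
  </section>
</toolbox>"""

# Invented sample of an out-of-date integrated_tool_panel.xml.
STALE_PANEL = """<toolbox>
  <section id="getext" name="Get Data" version="">
    <tool id="upload1"/>
  </section>
</toolbox>"""

print(missing_sections([DEFAULT_CONF, CUSTOM_CONF], STALE_PANEL))
```

Keeping the custom sections in their own file means a comparison like this can be run against the default file and the custom file independently after an upgrade.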
[galaxy-dev] Galaxy Server Processes Dying?
n...@bx.psu.edu
date: Wed May 01 09:50:31 2013 -0400
summary: Use Galaxy's ErrorMiddleware since Paste's doesn't return start_response. Fixes downloading tarballs from the Tool Shed when use_debug = false.

I appreciate the time you took in reading my email, and any expertise you could provide in helping me troubleshoot this issue.

Dan Sullivan
[galaxy-dev] A Galaxy environment that can support up to 35 users?
Hi, Galaxy Developers,

Is anybody out there managing a Galaxy environment that was designed for, or has been tested to support, 35 concurrent users? The reason I am asking is that we [the U of C] have a training session coming up this Thursday, and the environment we have deployed needs to support this number of users. We have put the server under as much stress as possible with 6 users, and Galaxy has performed fine; however, it has proven somewhat challenging to do load testing for all 35 concurrent users prior to the workshop. I can't help but feel we are rolling the dice a little bit, as we've never put the server under anything close to this load level, so I figured I would try to dot my i's by sending an email to this list.

Here are the configuration changes that are currently implemented (in terms of trying to performance tune and web scale our Galaxy server):

1) Enabled proxy load balancing with six web front-ends (the number six pulled from the Galaxy wiki) (Apache):

<Proxy balancer://galaxy/>
    BalancerMember http://127.0.0.1:8080
    BalancerMember http://127.0.0.1:8081
    BalancerMember http://127.0.0.1:8082
    BalancerMember http://127.0.0.1:8083
    BalancerMember http://127.0.0.1:8084
    BalancerMember http://127.0.0.1:8085
</Proxy>

2) Rewrite static URLs for static content (Apache):

RewriteRule ^/static/style/(.*) /group/galaxy/galaxy-dist/static/uchicago_cri_august_2012_style/blue/$1 [L]
RewriteRule ^/static/scripts/(.*) /group/galaxy/galaxy-dist/static/scripts/packed/$1 [L]
RewriteRule ^/static/(.*) /group/galaxy/galaxy-dist/static/$1 [L]
RewriteRule ^/robots.txt /group/galaxy/galaxy-dist/static/robots.txt [L]
RewriteRule ^(.*) balancer://galaxy$1 [P]

3) Enabled compression and caching (Apache):

<Location "/">
    SetOutputFilter DEFLATE
    SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png)$ no-gzip dont-vary
    SetEnvIfNoCase Request_URI \.(?:t?gz|zip|bz2)$ no-gzip dont-vary
</Location>
<Location "/static">
    ExpiresActive On
    ExpiresDefault "access plus 6 hours"
</Location>

4) Configured web scaling (universe_wsgi.ini):
a) six web server processes (threadpool_workers = 7)
b) a single job manager (threadpool_workers = 5)
c) two job handlers (threadpool_workers = 5)

5) Configured a pbs_mom external job runner (our cluster), and commented out the default tool runners so that they use PBS as well (we are not using the other tools for the workshop):

#ucsc_table_direct1 = local:///
#ucsc_table_direct_archaea1 = local:///
#ucsc_table_direct_test1 = local:///
#upload1 = local:///

6) Changed the following database parameters (universe_wsgi.ini):

database_engine_option_pool_size = 10
database_engine_option_max_overflow = 20

7) Disabled the developer settings (universe_wsgi.ini):

debug = False
use_interactive = False
#filter-with = gzip

The server I have is a VM with the following resources:

2GB of RAM
4 CPU cores

I feel that it is also worthwhile to mention that users will not be downloading datasets during the workshop, so as of now, the implementation of XSendFile as specified in the Apache Proxy documentation is not of immediate concern.

Does anybody see any glaring mistakes where this configuration might fall short with respect to capacity planning for an environment of 35 concurrent users, or additional tuning that could help ensure the availability of the server during the workshop?

Thank-you so much for your opinion(s), and please wish us luck this Thursday :-)

Dan Sullivan
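As a rough aid for the database settings listed in the message above, the worst-case number of simultaneous database connections can be worked out from the process counts. This is a back-of-the-envelope sketch, not a measurement: it assumes each Galaxy process keeps its own SQLAlchemy connection pool, and it relies on the fact that such a pool opens at most pool_size + max_overflow connections. The function name is made up for illustration.

```python
def max_db_connections(processes, pool_size, max_overflow):
    """Upper bound on open connections if every process fills its own pool."""
    return processes * (pool_size + max_overflow)


# Process counts from the message: 6 web servers + 1 job manager + 2 job handlers.
processes = 6 + 1 + 2

# Settings from the message: pool_size = 10, max_overflow = 20.
print(max_db_connections(processes, pool_size=10, max_overflow=20))
```

Whatever connection limit the database server enforces has to be weighed against this ceiling, since raising the pool settings raises it proportionally.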
Re: [galaxy-dev] A Galaxy environment that can support up to 35 users?
Hi, Assaf,

Thank-you for your very detailed, thorough, and thoughtful reply. I have responses to some of what you said; my comments are in-line.

On Sep 17, 2012, at 1:11 PM, Assaf Gordon gor...@cshl.edu wrote:

> Hello Dan,
>
> Couple of lessons we learned from setting up similar workshop-galaxies:
>
> Dan Sullivan wrote, On 09/17/2012 01:04 PM:
>> Hi, Galaxy Developers,
>>
>> Is anybody out there managing a Galaxy environment that was designed and or has been tested to support 35 concurrent users? The reason why I am asking this is because we [the U of C] have a training session coming up this Thursday, and the environment we have deployed needs to support this number of users. We have put the server under as much stress as possible with 6 users, and Galaxy has performed fine, however it has proven somewhat challenging to do load testing for all 35 concurrent users prior to the workshop. I can't help but feel we are rolling the dice a little bit as we've never put the server under anything close to this load level, so I figured I would try to dot my i's by sending an email to this list.
>>
>> Here are the configuration changes that are currently implemented (in terms of trying to performance tune and web scale our galaxy server):
>>
>> 1) Enabled proxy load balancing with six web front-ends (the number six pulled from Galaxy wiki) (Apache):
>
> When configured correctly, 3 or 4 web-front-ends seemed sufficient. (When configured incorrectly, it doesn't matter how many you have, performance will suffer :) ). Given that you only have 4 CPUs/cores for your machine, having six front-ends seems too much.

Since we are not running on bare metal hardware, I can definitely increase memory and CPU count on the Galaxy VM. I am going to increase these to 8 cores with 8GB of RAM for the purpose of the workshop, based on the rough numbers you provided.
>> 2) Rewrite static URLs for static content (Apache):
>> 3) Enabled compression and caching (Apache):
>
> This might sound obvious, but test that it actually works (e.g. check in the apache logs that the static files were served by apache, not by galaxy). Typos and other minor incompatibilities can cause the URLs to be served by galaxy, which will waste resources.

This is a very good idea. Thank-you.

>> 4) Configured web scaling (universe_wsgi.ini) :
>> a) six web server processes (threadpool_workers = 7)
>> b) a single job manager (threadpool_workers = 5)
>> c) two job handlers (threadpool_workers = 5)
>
> Again, with a system of only 4 CPUs, you might overload your server.

As I said, I am going to increase the CPU core count to 8 based on your recommendations.

>> 5) Configured a pbs_mom external job runner (our cluster), and commented out the default tool runners (to use pbs) (we are not using the other tools for the workshop).
>>
>> #ucsc_table_direct1 = local:///
>> #ucsc_table_direct_archaea1 = local:///
>> #ucsc_table_direct_test1 = local:///
>> #upload1 = local:///
>
> Unless your workshop is *tightly* scripted, you can't really tell which tool users will use. If this is an introduction to galaxy, users will experiment with some tools (even if you don't tell them to). (also, I'm not sure if those data import tools can run on your cluster node).

Based on some limited testing, these data import tools can run on our cluster node. We have NAT configured with outbound HTTP from the cluster. I think we're alright on this one, although I will report back if I learn any new meaningful lessons using this configuration.
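The suggestion above to confirm that static files are served by Apache rather than by Galaxy could be checked with a small script along these lines. It is a sketch under stated assumptions: it presumes the Galaxy web processes write request lines in a common-log-style format with the request in double quotes, and the sample log lines are invented for illustration. Any /static/ or /robots.txt hit in a Galaxy-side log would suggest the rewrite rules did not catch that request.

```python
import re

# Matches the quoted request line of a common-log-style entry for static paths.
STATIC_REQUEST = re.compile(
    r'"(?:GET|HEAD) (/static/\S+|/robots\.txt) HTTP/[\d.]+"'
)


def static_requests(log_lines):
    """Return the static paths requested in the given log lines."""
    hits = []
    for line in log_lines:
        match = STATIC_REQUEST.search(line)
        if match:
            hits.append(match.group(1))
    return hits


# Invented sample lines standing in for one Galaxy web process log.
SAMPLE_GALAXY_LOG = [
    '127.0.0.1 - - [17/Sep/2012:13:30:01 -0500] "GET /root/index HTTP/1.1" 200 - "-" "Mozilla/5.0"',
    '127.0.0.1 - - [17/Sep/2012:13:30:02 -0500] "GET /static/style/base.css HTTP/1.1" 200 - "-" "Mozilla/5.0"',
]

print(static_requests(SAMPLE_GALAXY_LOG))
```

An empty result for every Galaxy process log after browsing a few pages would be consistent with Apache handling the static content itself.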
>> 6) Changed the following database parameters (universe_wsgi.ini):
>>
>> database_engine_option_pool_size = 10
>> database_engine_option_max_overflow = 20
>
> Assuming you're using PostgreSQL (and you shouldn't use anything else, in practice), add the following:
>
> database_engine_option_server_side_cursors = True
>
> And I would set pool_size to 50 and max_overflow to 100 - seems excessive, but under the load of 20 users hammering at galaxy at the same time in a short time window, I got the database connection pool size errors within 10 minutes.

This is good information from your experience. Thank-you for sharing this. I will implement this as you suggested.

>> The server I have is a VM with the following resources:
>>
>> 2GB of RAM
>> 4 CPU Cores
>
> IMHO, that's too little memory and CPUs. Some ballpark figures for our servers:
>
> memory-wise: each web-front-end python process takes ~300MB (and you plan for 6 of them). and you also have 3 more python processes (1 job manager + 2 job handlers).
>
> CPU-wise: In addition to 9 python processes, you will have several PostgreSQL processes, a few apache threads, and some other system processes running. Even when each python process doesn't run at full capacity (i.e. 100% CPU), your system already sounds overloaded. When jobs are running (at least on our system) the job-handlers consume some CPU time by just monitoring
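The memory estimate in the reply above can be worked through as simple arithmetic. This is only a sketch: the ~300MB figure is the one quoted for web front-end processes, and applying the same figure to the job manager and job handlers is an assumption made here purely for illustration.

```python
def python_memory_mb(web_processes, other_processes, mb_per_process=300):
    """Rough resident memory for the Galaxy python processes, in MB."""
    return (web_processes + other_processes) * mb_per_process


VM_RAM_MB = 2 * 1024  # the 2GB VM described in the original message

web_only = python_memory_mb(web_processes=6, other_processes=0)
# Assumption: the 3 job processes use about as much memory as a web process.
all_python = python_memory_mb(web_processes=6, other_processes=3)

print(web_only, all_python, VM_RAM_MB)
```

Even the web front-ends alone come close to the VM's total memory under these figures, before the database, Apache, and the operating system are counted.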
[galaxy-dev] Programmatically deleting data sets from data libraries?
are calculated? Is it some sort of hash of composite or primary keys from the back-end tables? The reason I am asking is that I did a full dump of the database and searched (using grep) for the ldda_id (i.e. 62e564808c5368d4), and it didn't exist anywhere in the database (I was surprised by this).

If anybody out there has programmatically deleted a data set from a Galaxy data library (via the API or otherwise), or could shed some light on how to solve my problem, I'd love to hear from you.

Thank-you so much for your time, and again, I apologize for my lengthy e-mail.

Dan Sullivan
Re: [galaxy-dev] missing slash when installing from toolshed?
Hi, Galaxy Developers,

This is just a follow-up to my previous response. I was able to get things working, and have pasted our apache configuration below for anybody else who searches the mailing list with similar problems. Our implementation authenticates against UNIX LDAP and uses a group for authorization. So far we have not had any problems with the following configuration (I realize this is already similar to what is posted on the wiki). Thank-you again for your help, Greg.

RewriteEngine On
RewriteCond %{HTTPS} off
RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI}

<VirtualHost *:443>
    LoadModule ssl_module modules/mod_ssl.so
    ServerName crigalaxy.uchicago.edu
    SSLEngine On
    SSLCertificateFile /group/galaxy/certs/crigalaxy.cer
    SSLCertificateKeyFile /group/galaxy/certs/crigalaxy.key
    SSLCertificateChainFile /group/galaxy/certs/crigalaxy_interm.cer

    <Location "/">
        AuthType Basic
        AuthName Galaxy
        AuthBasicProvider ldap
        AuthLDAPURL ldap://someldapserver.whateverdomain.edu:389/dc=uchicago,dc=edu?uid?sub?(objectClass=inetOrgPerson) TLS
        AuthzLDAPAuthoritative off
        Require ldap-group cn=uc:org:cri:galaxy:cri-galaxy_web_users,ou=groups,dc=uchicago,dc=edu
        RequestHeader set REMOTE_USER %{AUTHENTICATE_uid}e
    </Location>

    RewriteEngine on
    <Proxy balancer://galaxy/>
        BalancerMember http://127.0.0.1:8080
        BalancerMember http://127.0.0.1:8081
    </Proxy>
    RewriteRule ^/static/style/(.*) /group/galaxy/galaxy-dist/static/uchicago_cri_august_2012_style/blue/$1 [L]
    RewriteRule ^/static/scripts/(.*) /group/galaxy/galaxy-dist/static/scripts/packed/$1 [L]
    RewriteRule ^/static/(.*) /group/galaxy/galaxy-dist/static/$1 [L]
    RewriteRule ^/robots.txt /group/galaxy/galaxy-dist/static/robots.txt [L]
    RewriteRule ^(.*) balancer://galaxy$1 [P]
</VirtualHost>

Dan Sullivan

On Aug 29, 2012, at 9:44 AM, Daniel Patrick Sullivan dansulli...@gmail.com wrote:

> Hi, Greg,
>
> Thank-you so much for taking the time to answer my question.
> You appear to be correct; I removed the virtual hosting configuration from apache (I had some redirection configured to automatically forward from port 80 to port 443), and this problem went away. I think I should be able to get things situated from here (I'll try to remember to publish my working config when I get things worked out for the people that use the mail archive). Again, thank-you for your time and expertise.
>
> Dan Sullivan
>
> On Tue, Aug 28, 2012 at 5:10 PM, Greg Von Kuster g...@bx.psu.edu wrote:
>> Hi Dan,
>>
>> The only thing I can think of that is causing this behavior is an apache rewrite rule that is not correctly handling your base URL. That's the only scenario in which I've seen something like this occur, but others in the community may have seen other causes. I'm fairly certain it has something to do with the server configuration (not a Galaxy issue).
>>
>> Greg Von Kuster
>>
>> On Aug 22, 2012, at 3:18 PM, Dan Sullivan wrote:
>>> I apologize for spamming this list, the screenshot that I specified as attached was supplied in a hyperlink farther down in the mail (not as an attachment).
>>>
>>> https://webshare.uchicago.edu/users/dansully/Public/Screen%20Shot%202012-08-21%20at%203.22.27%20PM.png
>>>
>>> Dan
>>>
>>> On Aug 22, 2012, at 10:13 AM, Dan Sullivan dansulli...@gmail.com wrote:
>>>> Hi, Galaxy Developers,
>>>>
>>>> I have a question that is very similar to the following thread; http://dev.list.galaxyproject.org/Problem-fetching-updates-to-toolshed-tool-td4353417.html
>>>>
>>>> Basically whenever I try to install a tool from the toolshed, it appears that a trailing slash is not appended to the Galaxy URL, which is causing a DNS lookup failure. For example, the URL specified in the attached screenshot should be https://crigalaxy.uchicago.edu/admin_toolshed, however it is appearing as https://crigalaxy.uchicago.eduadmin_toolshed. Does anybody know if there are any known bugs that would cause this behavior?
>>>> I might venture down the road of modifying lib/galaxy/webapps/community/controllers/repository.py as suggested by the thread above, although I would like to avoid doing this if possible. I am experiencing this behavior on the following version of Galaxy:
>>>>
>>>> hg parent
>>>> changeset: 7400:ec29ce8e27a1
>>>>
>>>> In addition to this, I feel that it's worthwhile to mention that I am using the apache proxy balancer with web scaling, however based on what I have seen and tested, I do not believe this (apache) to be the root cause of my problem. The URL below is a screenshot of the network debug I am seeing from the firebug extension in firefox.
>>>>
>>>> https://webshare.uchicago.edu/users/dansully/Public/Screen%20Shot%202012-08-21%20at%203.22.27%20PM.png
>>>>
>>>> If somebody has seen this, or could offer a suggestion to bring this issue to resolve, I would definitely be interested in your strategy. Thank-you so much for your help and for your time.
>>>>
>>>> Dan Sullivan 312-607
[galaxy-dev] missing slash when installing from toolshed?
Hi, Galaxy Developers,

I have a question that is very similar to the following thread: http://dev.list.galaxyproject.org/Problem-fetching-updates-to-toolshed-tool-td4353417.html

Basically, whenever I try to install a tool from the toolshed, it appears that a trailing slash is not appended to the Galaxy URL, which causes a DNS lookup failure. For example, the URL specified in the attached screenshot should be https://crigalaxy.uchicago.edu/admin_toolshed; however, it appears as https://crigalaxy.uchicago.eduadmin_toolshed. Does anybody know if there are any known bugs that would cause this behavior?

I might venture down the road of modifying lib/galaxy/webapps/community/controllers/repository.py as suggested by the thread above, although I would like to avoid doing this if possible. I am experiencing this behavior on the following version of Galaxy:

hg parent
changeset: 7400:ec29ce8e27a1

In addition to this, I feel that it's worthwhile to mention that I am using the apache proxy balancer with web scaling; however, based on what I have seen and tested, I do not believe this (apache) to be the root cause of my problem. The URL below is a screenshot of the network debug output I am seeing from the firebug extension in firefox.

https://webshare.uchicago.edu/users/dansully/Public/Screen%20Shot%202012-08-21%20at%203.22.27%20PM.png

If somebody has seen this, or could offer a suggestion to help resolve this issue, I would definitely be interested in your strategy. Thank-you so much for your help and for your time.

Dan Sullivan
312-607-3702
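The malformed URL in the message above is what plain string concatenation produces when a base URL arrives without its trailing slash. The sketch below only illustrates that failure mode and a defensive alternative; it is not Galaxy's actual code, and both function names are made up for the example.

```python
def naive_join(base_url, path):
    """Concatenate as-is; breaks when base_url has lost its trailing slash."""
    return base_url + path


def safe_join(base_url, path):
    """Join with exactly one slash between the base URL and the path."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


base_without_slash = "https://crigalaxy.uchicago.edu"

print(naive_join(base_without_slash, "admin_toolshed"))
print(safe_join(base_without_slash, "admin_toolshed"))
print(safe_join(base_without_slash + "/", "/admin_toolshed"))
```

The first call reproduces the hostname seen in the report, which then fails DNS lookup, while the other two yield the expected address regardless of how the slashes arrive.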