Daniel,

My recommendations (some of which are off topic for this list):

(1) If you are almost done with the module, why not go ahead and test whether each child process is getting a unique copy of the server_config_struct? If you never write to that data after server startup (i.e. from a request-handling function or a filter function), you may see many or all of the child processes using the same server_config_struct. Print the struct's address with "%p" (not "%d") to see its memory location. Keep in mind that forked children report the same virtual address whether or not the underlying memory is still shared, so also compare the memory actually consumed per child. There may be no problem (or you may see that the total RAM consumed is marginal).

(2) Try not using a server_config_struct merge function. If your child processes never write to the server_config_struct (and the merge function is possibly the only place that does), the children may be able to keep sharing one copy.

(3) Have you considered using a single Apache server with only a list of mod_proxy directives to act as the load balancer? I'm not saying this is the best way to go, but it may help you distribute your VHosts and directories the way you want (perhaps faster than you think). Do a search for pages containing both mod_rewrite and mod_proxy if you are unfamiliar with this technique. (I'm not sure this technique works with fully redundant servers and/or directories.)

(4) You may save programming time by calling a Perl/PHP/other-language script to update a file from your DB SELECT statement. Call your script before httpd start|restart|stop. Then all you have to do to update your server_config_struct is to unserialize the file into your struct.
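For point (1), here is a minimal standalone sketch of the address check. It is not an Apache module: it uses plain fork() on a POSIX system, and the struct layout and the function name child_sees_same_address are hypothetical stand-ins for illustration. It prints the config address from a parent and one child using "%p" and reports whether the two match.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the module's per-server config. */
typedef struct {
    int  n_sites;
    char first_site[64];
} server_config_struct;

/* Forks one child, prints the config address in both processes, and
 * returns 1 if the child saw the same virtual address as the parent,
 * 0 if it differed, -1 on error. */
int child_sees_same_address(server_config_struct *cfg)
{
    int fds[2];
    assert(cfg != NULL);
    if (pipe(fds) != 0)
        return -1;

    fflush(stdout);                 /* do not duplicate buffered output */
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                 /* child: report the address it sees */
        void *addr = cfg;
        close(fds[0]);
        printf("child  pid=%ld cfg=%p\n", (long)getpid(), addr);
        fflush(stdout);
        ssize_t w = write(fds[1], &addr, sizeof addr);
        close(fds[1]);
        _exit(w == (ssize_t)sizeof addr ? 0 : 1);
    }

    /* parent: collect the child's answer and compare */
    void *child_addr = NULL;
    close(fds[1]);
    ssize_t r = read(fds[0], &child_addr, sizeof child_addr);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    printf("parent pid=%ld cfg=%p\n", (long)getpid(), (void *)cfg);

    if (r != (ssize_t)sizeof child_addr)
        return -1;
    return child_addr == (void *)cfg;
}
```

Calling child_sees_same_address(&cfg) from a small main() should report matching addresses, because fork() duplicates the virtual address space. That is also why the address alone cannot prove the memory is physically shared: the address stays the same even after a child writes to the struct and gets its own copy of the page. On Linux, the per-process figures in /proc/<pid>/smaps are one place to look for the real shared versus private usage.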

If each of your 10~20 servers is identical (all contain all VHosts and all directories), then your current layout may be one of the optimal schemes (given the extreme redundancy). However (I don't know the particulars of your requirements), I would think you could have the main Apache server (which receives ALL initial requests and load-balances) distribute the requests by domain. That server should hand off requests to one of, say, 4 servers. Each of these 4 servers should further load-balance to 4 other servers (which can include itself) based on a combination of the domain/directory (giving you a 1-4-4 tree). You may run into weird session problems (including problems with the load balancer) if you are sending the same visitor to two different end-Apache servers with the same MX/domain based on different directories (I'm not speaking from experience, just conjecture).

Regards,
Dave

On 3/23/07, Danie Qian <[EMAIL PROTECTED]> wrote:

----- Original Message -----
From: "David Wortham" <[EMAIL PROTECTED]>
To: <[email protected]>; "Danie Qian" <[EMAIL PROTECTED]>
Sent: Friday, March 23, 2007 2:39 PM
Subject: Re: load data at server startup - is ap_hook_post_config() the right place?

> Daniel,
> AFAIK, "mutex" refers to mutual exclusion. It is commonly referred to
> in relation to multi-threading but can apply to an inter-process scheme
> too. I would assume that only one thread of one (child) process can
> access a given resource at a given time, but you should refer to the
> documentation of any code you use for those specifics. I'm sorry, but I
> don't have any personal experience with mutex and the APR libs.
>
> Are you talking about adjusting your Apache child/thread setup for
> development or are you targeting your module to only Apache
> installations with specific configuration (i.e. single-thread mode)?
> If you are worried about the size of a child process'
> server_config_struct, you could just program it to take a lot of room
> in the server_config_struct and recommend that Apache be configured for
> only one child-process. I would recommend against it, but it doesn't
> seem like a terrible idea. There will probably be significant
> performance hits on certain OSes/Apache-deployments (*NIX) and less on
> others (likely WinNT since I believe it defaults to a single-child
> process with a large thread-count).
>
> Also, what are you using this giant list of URIs for? There may be a
> more efficient way of distributing the processing load (and, therefore,
> speeding up the overall Apache response time). If I read your first
> post to this list correctly, you are using a table (as in apr_table_*,
> a DB table, or other?) with a large number of URLs. You aren't walking
> through a sequential list of URLs in apr_tables are you? Are the "URI"s
> you're storing full URIs, domain names, or MXs? Perhaps you could use
> some help with your overall design of the module (or maybe there's an
> existing module that does what you want).
>

Hi Dave,

To be more meaningful about what I am talking about, here is our setup: We have around 5 thousand sites running on a few web-farmed locations. Each location has about 10~20 Linux Apache servers running behind a load balancer. Every Apache server usually has 40~50 child processes running to serve that amount of traffic. Internally, every site is maintained as one of our products, with all its operation information stored in MySQL databases. With the information from one of the tables, we can decide where the site folder is by looking at the site name and some other fields associated with the site. What I am thinking of doing with the module is to load all the relevant fields into Apache's memory space upon Apache startup and let every request look up from that chunk of memory for better performance. So, as you can see, we are tied to a setup of multiple processes and probably single-thread mode in Apache.

Thanks,
Daniel
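
A minimal sketch of the per-request lookup described above, assuming the site fields have already been loaded into an array at startup. The type site_entry and the functions site_table_sort and site_lookup are hypothetical names for illustration, not part of Apache or APR. Instead of walking a sequential list on every request, the table is sorted once and then searched with the C library's bsearch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: one row of the site table loaded at startup. */
typedef struct {
    const char *site;    /* site name taken from the request */
    const char *folder;  /* where that site's folder lives */
} site_entry;

/* Orders entries by site name; shared by qsort and bsearch. */
static int cmp_site(const void *a, const void *b)
{
    const site_entry *x = a;
    const site_entry *y = b;
    return strcmp(x->site, y->site);
}

/* Sort once at startup, before any child process is forked. */
void site_table_sort(site_entry *table, size_t n)
{
    assert(table != NULL || n == 0);
    qsort(table, n, sizeof *table, cmp_site);
}

/* Read-only binary search per request; returns NULL if the site is unknown. */
const char *site_lookup(const site_entry *table, size_t n, const char *site)
{
    assert(site != NULL);
    site_entry key = { site, NULL };
    const site_entry *hit = bsearch(&key, table, n, sizeof *table, cmp_site);
    return hit ? hit->folder : NULL;
}
```

With around 5 thousand sites, a binary search needs roughly 13 comparisons per request rather than up to 5 thousand. The lookup never writes to the table, which fits the earlier point about leaving the data untouched after startup. Inside a real module, an APR hash table (apr_hash_t) would probably be the more natural container; the plain array here only keeps the sketch self-contained.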

-- 
David Wortham
Senior Web Applications Developer
Unspam Technologies, Inc.
1901 Prospector Dr. #30
Park City, UT 84060
(435) 513-0672
