Re: [AOLSERVER] Performance problems with large installation
Andrew Piskorski wrote:

To reduce your RAM usage, you would definitely want to investigate ttrace, included in the Tcl Threads Extension. However, you don't seem to be running out of memory, so this is unlikely to be that big a help in your case.

I have had very little success in getting ttrace to work. However, I was looking at the memory stats on mempools.adp and I noticed that the overhead was in the neighborhood of 50%-60%. After a small amount of looking, I realized this is entirely expected: the USE_THREAD_ALLOC version of Tcl_Alloc is a power-of-2 allocator, meaning that it rounds up all requests that it handles to the next higher power of 2. So for example, a request to allocate 600 bytes will really allocate 1024 bytes and the difference is wasted space. On average, you would expect this wasted space to be about 50% more than you requested.

So, inspired by the description of Google's tcmalloc, which uses a large number of buckets, I added a whole lot more buckets spaced closely together (from 8 bytes apart for the smaller sizes to 512 bytes apart for larger sizes) in tclThreadAlloc.c, and the results were as I hoped: the memory overhead went down to around 10%.

Adding all these extra buckets will impose some overhead of its own and a small (probably unnoticeable) performance impact, but the amount of memory saved seems to dwarf this overhead, and the real impact on memory size is noticeable. My server that would be around 90M a few minutes after starting is now only 65-70M. In a memory-tight environment like a unixshell.org setup, this can be quite significant. On a larger server with more memory this can still matter: if keeping the overall memory footprint small isn't really your issue, then you could think of it as being able to run 40% more threads.

-J

-- AOLserver - http://www.aolserver.com/ To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] with the body of SIGNOFF AOLSERVER in the email message.
You can leave the Subject: field of your email blank.
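The bucket-spacing effect described above can be tried out with a small simulation. This is only a sketch, not the code from tclThreadAlloc.c: the bucket layouts, the 16 KB upper limit, and the uniformly random request sizes are all assumptions made for illustration, and the overhead you see in practice depends heavily on the real distribution of request sizes.

```python
import bisect
import random


def power_of_two_buckets(max_size):
    """Bucket sizes 16, 32, 64, ... up to max_size (assumed layout)."""
    sizes, size = [], 16
    while size <= max_size:
        sizes.append(size)
        size *= 2
    return sizes


def fine_grained_buckets(max_size):
    """Hypothetical closely spaced buckets: spacing grows from 8 to 512 bytes."""
    sizes, size, step = [], 8, 8
    while size <= max_size:
        sizes.append(size)
        # Widen the spacing as sizes grow, but never beyond 512 bytes.
        if size >= step * 16 and step < 512:
            step *= 2
        size += step
    return sizes


def allocated_size(request, buckets):
    """Round a request up to the smallest bucket that fits it.

    Requests larger than the biggest bucket are assumed to bypass the
    buckets entirely and cost exactly what was asked for.
    """
    i = bisect.bisect_left(buckets, request)
    return buckets[i] if i < len(buckets) else request


def overhead(requests, buckets):
    """Wasted bytes as a fraction of the bytes actually requested."""
    requested = sum(requests)
    allocated = sum(allocated_size(r, buckets) for r in requests)
    return (allocated - requested) / requested


random.seed(0)
sample = [random.randint(1, 16384) for _ in range(10000)]
for name, buckets in (("power-of-2", power_of_two_buckets(16384)),
                      ("fine-grained", fine_grained_buckets(16384))):
    print(f"{name}: {len(buckets)} buckets, overhead {overhead(sample, buckets):.1%}")
```

With these assumed layouts, the 600-byte request from the example rounds up to 1024 bytes in the power-of-2 scheme and to 640 bytes in the finer one.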
Re: [AOLSERVER] Performance problems with large installation
Hello, Andrew!

Andrew Piskorski wrote:
On Fri, Feb 03, 2006 at 12:49:43PM -0500, Nima wrote:

Dossy, when you look at the hardware, would you say that I need more dynamic servers, and can I run several dynamic servers against one postgresql? Using pound I add some more aolservers behind the scenes connected to one database.

Of course you can - in fact that's the whole point, typically there is no other way to do it.

Are you sure? We tried it, but every aolserver (with OpenACS / dotLrn) has its own cache and the sync among them was very bad. Is anybody running a big OpenACS / dotLrn installation with a cluster?

Another related question: Is there any way to cluster the database (Postgres) part?

Regards,
Agustin

--
Jose Agustin Lopez Bueno
E-Mail: [EMAIL PROTECTED]
Home Page: http://www.uv.es/~lopezj/
http://www.uv.es/postman/
Tfnos: +34-96-3544310 +34-96-3543129
Fax: +34-96-3544200
Servicio de Informatica, Univ. de Valencia, Spain
Re: [AOLSERVER] Performance problems with large installation
On Fri, 2006-02-03 at 19:55 -0500, Andrew Piskorski wrote:

That's only cpu load. You also want to check its I/O activity. (Solaris top also shows I/O wait percentages but Linux unfortunately does not.) On older Linux boxes iostat 5 was the way to do that. Newer Linux systems may have different/better ways to do that.

That would probably be vmstat 5. It gives you processes, memory, paging, block IO, traps, and cpu activity. vmstat -D gives you disk stats.

You can also use netstat -tcs to snapshot the number of established tcp connections on the machine - that might give you a true indication of the number of connections actually being served.

- Steve

Steve Manning - Mandrake Linux 10.1 - Gnome 2.6
East Goscote - Leicester - UK
+44 (0)116 260 5457
E-Mail: [EMAIL PROTECTED] - Web: www.festinalente.co.uk
AIM: verbomania - Public Key: 25665CAF from: wwwkeys.pgp.net
There are only 10 types of people in this world
Those who understand binary and those who don't
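If you capture the output of vmstat 5 to a file, a few lines of Python can pull out the samples where the box was waiting on I/O rather than burning CPU. This is a rough sketch: the sample text below is made up for illustration, the 20% threshold is an arbitrary assumption, and the wa (I/O wait) column is only present in some vmstat versions, so the code treats it as optional.

```python
def parse_vmstat(text):
    """Turn captured vmstat output into a list of {column: int} dicts."""
    rows, header = [], None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "r" and "b" in fields:
            # Column-name line, e.g. "r b swpd free ... us sy id wa"
            header = fields
        elif header and len(fields) == len(header) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(header, map(int, fields))))
    return rows


def io_bound_samples(rows, wa_threshold=20):
    """Samples whose I/O wait percentage meets the (assumed) threshold."""
    return [r for r in rows if r.get("wa", 0) >= wa_threshold]


# Made-up sample in the usual vmstat layout, for illustration only.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0      0 123456   7890 456789    0    0     5    12  101  202 10  2 85  3
 9  4      0  98765   7890 456789    0    0  4200   900  350  800 40 10 15 35
"""

rows = parse_vmstat(SAMPLE)
for row in io_bound_samples(rows):
    print("I/O-bound sample:", row)
```

Looking at bi/bo (blocks in/out) next to us/sy (user/system CPU) in the same rows gives a quick feel for whether a "quiet" CPU is actually a disk that is saturated.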
Re: [AOLSERVER] Performance problems with large installation
On Sat, Feb 04, 2006 at 11:05:31AM +0100, Agustin Lopez wrote:
Andrew Piskorski wrote:
On Fri, Feb 03, 2006 at 12:49:43PM -0500, Nima wrote:

Dossy, when you look at the hardware, would you say that I need more dynamic servers, and can I run several dynamic servers against one postgresql? Using pound I add some more aolservers behind the scenes connected to one database.

Of course you can - in fact that's the whole point, typically there is no other way to do it.

Are you sure? We tried it, but every aolserver (with OpenACS / dotLrn) has its own cache and the sync among them was very bad.

Yes. The cache management you're talking about is one of the potential issues to overcome. This HAS been made to work well enough in the past for some users, and OpenACS has had code to facilitate clustering for a long, long time. But MOST users of OpenACS don't care about support for multiple AOLservers, and I would bet that many OpenACS developers pay no attention to cluster support either, so I'm not surprised to hear that it doesn't work correctly out of the box for some applications.

Another related question: Is there any way to cluster the database (Postgres) part?

That's much harder. What you really need for that is PostgreSQL multi-master replication, which does not exist, so the general answer is no.

However, you might be able to get significant speed-ups by sending non-transactional read-only queries to read-only PostgreSQL slaves kept almost up to date with Slony. How well that works probably depends strongly on how Slony is implemented.

But of course, if you have a read-only query where it's ok for the answer to be a bit out of date, you might prefer to just hit the master db and then cache the result in RAM, even RAM on a remote machine, with say Memcached. Which you want to do depends on the circumstances. I don't know of anyone who's tried to use Slony with OpenACS for query speed-up. At least a couple of people have used Memcached.
Either approach requires special code in your application. See also:

http://openacs.org/forums/message-view?message_id=179348
http://openacs.org/forums/message-view?message_id=128060
http://groups.google.com/group/comp.databases.postgresql.hackers/browse_thread/thread/3175700383a99b71/

AFAIK, a fully transparent, "Just run my database on a cluster of machines, and my application code can't tell the difference except that everything is faster" solution does not exist for ANY production-worthy disk-backed RDBMS. E.g. Oracle has multi-master clustering but it is useful only for fail-over, it does NOT help performance.

The closest thing for PostgreSQL was probably Postgres-R. If I remember right, it was multi-master (I think?), and let you add more database machines to scale somewhat with increasing client load, while keeping everything transactional. But of course the scaling was sub-linear, and it did NOT let you scale with increasing database size - everything was replicated to every machine. More importantly, it was strictly research-ware to begin with and it only worked with PostgreSQL 6.4.2 - incredibly ancient at this point.

There are projects trying to make disk-backed RDBMSs really work on clusters of machines, but AFAIK they are all at the early research stage. Backplane and (to a lesser extent) Clusgres come to mind. Production-grade distributed multi-master databases that are mostly in memory but can periodically write to disk do exist. E.g., Mnesia in Erlang, but I've never used it (nor Erlang, for that matter).

Backplane is certainly ambitious, but I've no idea how well it might work, or if it will even work at all. From Matt Dillon's 2002 lecture slides, I think Backplane works its magic by being a temporal database, which could itself be a useful feature.
It looks dead though: you can download a tarball of the code, but it hasn't been touched since May 2003, and its author, Dillon, is working on other things:

http://apollo.backplane.com/
http://www.dragonflybsd.org/main/

Btw, I'd never heard of it before, but that DragonFly BSD project is interesting. Among a whole bunch of other ambitious goals, they plan a package system which fully and mostly automatically supports installing multiple versions of ANY package. Cool.

-- Andrew Piskorski [EMAIL PROTECTED] http://www.piskorski.com/
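The "hit the master db and cache the result in RAM" idea mentioned earlier in this message can be sketched as follows. This is an illustration only: a plain in-process dictionary stands in for Memcached, and the class and parameter names are hypothetical. It is only appropriate for read-only queries where an answer that is up to ttl_seconds old is acceptable.

```python
import time


class TTLCache:
    """Tiny in-process stand-in for a cache such as Memcached."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so expiry can be tested
        self._store = {}            # key -> (time stored, value)

    def get_or_compute(self, key, compute):
        """Return the cached value for key, or call compute() and cache it."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]
        value = compute()           # e.g. run the slow read-only query
        self._store[key] = (now, value)
        return value


def slow_query():
    """Placeholder for an expensive read-only database query."""
    time.sleep(0.01)
    return ["row 1", "row 2"]


cache = TTLCache(ttl_seconds=30)
first = cache.get_or_compute("course-list", slow_query)   # runs the query
second = cache.get_or_compute("course-list", slow_query)  # served from RAM
print(first == second)
```

With several AOLservers behind one load balancer, each process would hold its own copy of such a cache, which is exactly the synchronization problem raised above; a shared cache on a remote machine avoids that at the cost of a network round trip.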
Re: [AOLSERVER] Performance problems with large installation
I would like to understand more about the relation between:

- number of concurrent requests
- number of required threads
- number of required database connections in the pool

Can someone kindly give more insight on that? Let's say I have 50 concurrent users. How many threads do I need, with how many db connections for each db pool?

Also, what is the relation between usage of RAM and number of threads? When I look at top, the aolserver currently shows:

%MEM   PID USER   PR NI  VIRT  RES  SHR S %CPU   TIME+ COMMAND
41.7 27147 unima2 25  0 1796m 1.6g 7448 S  0.0 0:36.66 /opt/aolserver4/bin/nsd -u unima2 -t /www/unima2/etc/config.tcl

with basically 44 users who do nothing, but still the server uses 41% of the RAM.
Re: [AOLSERVER] Performance problems with large installation
On 2006.02.03, Nima [EMAIL PROTECTED] wrote:

What basically happens is that once I have more than 150 users logged in the response time for a page takes a minute or more which is very frustrating for users. [...] We have three linux boxes. One for an aolserver with database connection, one for a static aolserver and one for the database. The database box never goes above 5-10%. The static server is also not very busy but the dynamic server can go up to 99% and a load of 10 and more. [...]

Sounds like you already answered your question: you need more dynamic servers and do load balancing across them.

AOLServer config.tcl
...
ns_section ns/server/${server}
ns_param directoryfile $directoryfile
ns_param pageroot $pageroot
ns_param maxconnections 100
ns_param keepalivetimeout 15
ns_param maxdropped 0
ns_param maxthreads 65
ns_param minthreads 65
ns_param threadtimeout 3600

If you have 150 concurrent users, maxconnections = 100 could be a factor. maxthreads = 65 ... do you have that many concurrent connections? I'd suggest raising maxconnections to, say, 500 (just to rule that out entirely) and observe the effects. I might raise maxthreads to 120, too.

Of course, all of this tweaking and tuning won't have as much impact as actually optimizing your application code. Unfortunately, I'm not familiar enough with dotlrn or OpenACS to be able to speak to specific things you should look at. Maybe someone else can speak up about those things?

ns_section ns/db/pool/pool1
...
ns_param connections 100
...
ns_section ns/db/pool/pool2
...
ns_param connections 60
...
ns_section ns/db/pool/pool3
...
ns_param connections 30
...

Do you have any kind of statistics from the DB side? What's the max number of concurrent connections you've actually seen, according to the DB?

-- Dossy

-- Dossy Shiobara | [EMAIL PROTECTED] | http://dossy.org/
Panoptic Computer Network | http://panoptic.com/
"He realized the fastest way to change is to laugh at your own folly -- then you can let go and quickly move on." (p. 70)
Re: [AOLSERVER] Performance problems with large installation
Is there any way to find out how many concurrent users I currently have?
Re: [AOLSERVER] Performance problems with large installation
If I increase max_threads how do I have to increase the number of db connections for each pool?
Re: [AOLSERVER] Performance problems with large installation
Dossy, when you look at the hardware, would you say that I need more dynamic servers, and can I run several dynamic servers against one postgresql? Using pound I add some more aolservers behind the scenes connected to one database.
Re: [AOLSERVER] Performance problems with large installation
The included stats interface should help:

ns_section "ns/server/stats"
ns_param enabled 1
ns_param url "/_stats"
ns_param user "aolserver"
ns_param password "stats"

Then take a look at the "Process" and "Threads" sections.

- n

On Feb 3, 2006, at 12:44 PM, Nima wrote:

Is there any way to find out how many concurrent users I currently have?
Re: [AOLSERVER] Performance problems with large installation
Nima wrote:

I would like to understand more about the relation between:

- number of concurrent requests
- number of required threads
- number of required database connections in the pool

Can someone kindly give more insight on that? Let's say I have 50 concurrent users. How many threads do I need, with how many db connections for each db pool?

the question certainly is what "concurrent users" means; most probably you are talking about the 10 min. window (the request-monitor/running info shows the number of concurrently active requests at a time). note that when talking about concurrent users, it might be the case that one user has many requests running concurrently. To simplify things, we should talk about e.g. 10 concurrent requests. Then you need

- at least 10 threads
- at least 10 connections

The number of database connections is more difficult to estimate, since this is an application (oacs) matter. In oacs, most dynamic requests need 1 or 2 db connections concurrently. You should also pay attention to scheduled procedures, which might require significant resources as well.

Also, what is the relation between usage of RAM and number of threads?

there is a linear relationship. since aolserver/oacs loads by default everything into ram, and every thread has a separate interpreter state, 20 connection threads typically require twice the memory of 10 connection threads.

When I look at top, the aolserver currently shows:

%MEM   PID USER   PR NI  VIRT  RES  SHR S %CPU   TIME+ COMMAND
41.7 27147 unima2 25  0 1796m 1.6g 7448 S  0.0 0:36.66 /opt/aolserver4/bin/nsd -u unima2 -t /www/unima2/etc/config.tcl

with basically 44 users who do nothing, but still the server uses 41% of the RAM.

if nothing else is running on that machine, i would not worry about the 1.6g. I would recommend restarting the server every night at 4 o'clock. we have noticed a significant decrease in speed with long-running aolservers (at least on MP machines). I am not sure where this comes from, maybe it is due to memory fragmentation.

-gustaf
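The linear relationship described above lends itself to a back-of-the-envelope estimate. The sketch below is only that: it assumes all resident memory scales with the number of connection threads unless you supply a figure for the shared part, and it borrows the 1.6 GB resident size and the 65 configured threads from elsewhere in this thread purely as example inputs. Function names are made up for illustration.

```python
def per_thread_mb(resident_mb, threads, shared_mb=0.0):
    """Rough memory cost of one connection thread, in MB.

    shared_mb is the part of the process that does not grow with the
    thread count; it defaults to 0 because it is unknown here.
    """
    if threads <= 0:
        raise ValueError("threads must be positive")
    return (resident_mb - shared_mb) / threads


def projected_resident_mb(per_thread, threads, shared_mb=0.0):
    """Projected resident size for a different thread count (linear model)."""
    return shared_mb + per_thread * threads


# Example inputs taken from this thread: about 1.6 GB resident, 65 threads.
cost = per_thread_mb(resident_mb=1.6 * 1024, threads=65)
print(f"about {cost:.0f} MB per thread")
print(f"120 threads would need roughly {projected_resident_mb(cost, 120):.0f} MB")
```

Under these assumptions each thread costs on the order of 25 MB, so raising the thread count on a 4 GB machine eats into the available RAM quickly; measuring the real per-thread growth on a test server would give a more trustworthy number.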
Re: [AOLSERVER] Performance problems with large installation
One thing to consider is that you simply may have one or more slow queries running in your DB. I believe, from the postgres website, that you can use

select * from pg_stat_activity

to see the list of queries currently running. If you have

ns_section ns/db/pool/pool1 (2/3)
ns_param Verbose on

it will also show you the rate at which queries are being sent to your DB. Keep an eye on your /var/log/messages for anything weird.

P

Nathan Folkman wrote:

The included stats interface should help:

ns_section "ns/server/stats"
ns_param enabled 1
ns_param url "/_stats"
ns_param user "aolserver"
ns_param password "stats"

Then take a look at the "Process" and "Threads" sections.

- n

On Feb 3, 2006, at 12:44 PM, Nima wrote:

Is there any way to find out how many concurrent users I currently have?
Re: [AOLSERVER] Performance problems with large installation
On Fri, Feb 03, 2006 at 11:53:01AM -0500, Nima wrote:

What basically happens is that once I have more than 150 users logged in the response time for a page takes a minute or more which is very frustrating for users. Today we received an email saying that the system is simply shit. I don't know what to do next. All I know is that from semester to semester the load is getting higher but the frustration as well.

Nima, from your description this did NOT happen recently all at once, it has been gradually getting worse as you add more users, right?

It sounds like you're fairly lost and that you have an urgent performance problem affecting your users. I don't know just what dotLRN installation this is, but if you really have 150 concurrent users (whatever that means precisely in this case), then it's probably one of the big dotLRN installations at a large university somewhere (ah, from the config file below, uni-mannheim.de). I hope that means you also have some sort of support contract with one of the OpenACS / dotLRN gurus. If it's not something you can immediately (like, today) identify and fix, my advice is to get them involved ASAP. It might be just a simple misconfiguration somewhere, or it might be something deeper and trickier to fix. Either way, having all your users actively angry at you makes it plenty urgent enough to call in the big guns...

We have three linux boxes. One for an aolserver with database connection, one for a static aolserver and one for the database. The database box never goes above 5-10%. The static server is also not

That's only cpu load. You also want to check its I/O activity. (Solaris top also shows I/O wait percentages but Linux unfortunately does not.) On older Linux boxes iostat 5 was the way to do that. Newer Linux systems may have different/better ways to do that.

very busy but the dynamic server can go up to 99% and a load of 10 and more.
Well, that VERY strongly suggests that the rate limiter is simply executing all that Tcl code in your AOLserver. If so, add more AOLserver boxes, and set up Pound or the like as a front-end server to split the load between them. And/or upgrade to a much faster server.

In addition, try to find out what pages are eating up most of the processing time, and speed them up. A lot of that processing may be redundant and/or inefficient. Some judicious caching and/or code tuning could make a huge difference.

Oh, and a silly question: What version of Tcl are you using, and did you compile it with optimization? You definitely want to be using the latest Tcl 8.4.x version compiled with either -g -O2 (my preference) or -O2. I don't know how much slower Tcl is if you leave compiler optimization turned off, but it's probably enough to be very noticeable in your case. (Make sure AOLserver was also compiled with optimization of course; it re-uses the Tcl build flags.)

Finally, this is more of a research project, but your site is large and busy enough to benefit from figuring out just what the current status is of this patch:

Cache compiled Tcl page bytecode
http://sourceforge.net/tracker/?func=detail&aid=689515&group_id=3152&atid=353152

Currently:

%MEM %CPU  SHR   PID USER   PR NI  VIRT  RES S   TIME+ COMMAND
41.1  0.0 7448 27147 unima2 25  0 1770m 1.6g S 0:36.66 /opt/aolserver4/bin/nsd -u unima2 -t /www/unima2/etc/config.tcl

with 44 users logged into the system.

That says that your AOLserver is using 1.7 GB total memory, almost all of it resident. Which is huge for most people, but probably quite reasonable for you, since you have 4 GB in that box. That at least probably means that the box isn't thrashing between RAM and disk, good.

dotlrn (dynamic server)
AOLServer 4.0.10 (connected to the database)
Pound 1.8.2 (as reverse proxy for ssl and load balancing)
Apache 2.0.53 (only redirect from 80 to 443 where pound is)

Oh, you're already using Pound as the front-end.
So, shouldn't it be easy to stick in additional AOLservers behind it for dynamic content? The CATCH is, is your site and all its code, both stock and custom, already set up to work nicely with multiple AOLservers? Or does it rashly ASSUME only 1 AOLserver process in some places, such that you are going to see bugs or inconsistencies when using multiple AOLservers? I dunno. For that, I definitely recommend talking to the other folks running multiple AOLservers with OpenACS and dotLRN.

It sounds like you're running Pound on the same box as AOLserver. You'll definitely need to change that in order to add dynamic content servers. (I don't understand why you're using Apache to redirect client browsers from port 80 to 443 either, that seems odd.)

SuSE 9.2 Linux
Linux version 2.6.8-24.18-smp (gcc version 3.3.4 (pre 3.3.5 20040809)) #1 SMP Fri Aug 19 11:56:28 UTC 2005
4 CPU Intel(R) Xeon(TM) CPU 3.06GHz, L2 cache: 512K
4 GByte RAM - Memory: 4070968k/4111296k available (2339k kernel code, 39528k reserved, 824k
Re: [AOLSERVER] Performance problems with large installation
On Fri, Feb 03, 2006 at 12:49:43PM -0500, Nima wrote:

Dossy, when you look at the hardware, would you say that I need more dynamic servers, and can I run several dynamic servers against one postgresql? Using pound I add some more aolservers behind the scenes connected to one database.

Of course you can - in fact that's the whole point, typically there is no other way to do it.

On Fri, Feb 03, 2006 at 12:38:56PM -0500, Nima wrote:

I would like to understand more about the relation between:

- number of concurrent requests
- number of required threads
- number of required database connections in the pool

Can someone kindly give more insight on that?

This depends entirely on your application and needs to be determined empirically, by load testing on a separate server and/or profiling of your Production database. I've never done that for large OpenACS installations, but other people have. Maybe they have some rules of thumb.

Let's say I have 50 concurrent users. How many threads do I need, with how many db connections for each db pool?

It depends entirely on your application, there is no easy answer to this question.

Also, what is the relation between usage of RAM and number of threads?

That also depends. By default AOLserver loads ALL Tcl procs separately into every single thread, and OpenACS has a LOT of Tcl procs, so when using OpenACS, AOLserver threads suck up a lot of memory.

To reduce your RAM usage, you would definitely want to investigate ttrace, included in the Tcl Threads Extension. However, you don't seem to be running out of memory, so this is unlikely to be that big a help in your case.

When I look at top, the aolserver currently shows:

%MEM   PID USER   PR NI  VIRT  RES  SHR S %CPU   TIME+ COMMAND
41.7 27147 unima2 25  0 1796m 1.6g 7448 S  0.0 0:36.66 /opt/aolserver4/bin/nsd -u unima2 -t /www/unima2/etc/config.tcl

with basically 44 users who do nothing, but still the server uses 41% of the RAM.
Number of browser clients has almost nothing to do with how much RAM AOLserver uses, and everything to do with how you've configured AOLserver.

On Fri, Feb 03, 2006 at 12:51:01PM -0500, Nima wrote:

If I increase max_threads how do I have to increase the number of db connections for each pool?

Again, that depends entirely on your application. If you have slow Tcl code but all your queries are very fast, then you can probably get away with a lot fewer db connections than you have conn threads. If on the other hand your Tcl code is faster, or you have slow queries, or you hold a db handle when you don't really need it, then you might need as many db connections as you have conn threads, plus a few extra for non-conn threads. Or you might conceivably even need that many db connections in EACH db pool. (Conceivably - I don't know what's likely in practice.)

With OpenACS, the first one or two db pools probably get the most use by far, so you should have more db connections there and fewer in other pools. What particular balance is most appropriate, I don't know. I also don't know what the overhead of extra db connections is. Might be large, might be small, I dunno.

Ideally there would be some sort of unified monitor application that would track all this on a live Production site. Check out Gustaf's request monitor / throttler, it might include some of that functionality. Oh, I see you're already using it:

http://openacs.org/forums/message-view?message_id=347126
http://openacs.org/forums/message-view?message_id=339132

-- Andrew Piskorski [EMAIL PROTECTED] http://www.piskorski.com/
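The reasoning above about conn threads versus db connections can be turned into a rough first guess to start load testing from. This is a heuristic sketch, not an AOLserver or OpenACS rule: the function and its parameters are hypothetical, db_fraction (the share of a request's time during which it holds a db handle) has to be measured on your own site, and the result is an average that ignores bursts.

```python
import math


def estimate_db_connections(conn_threads, db_fraction,
                            handles_per_request=1, extra_for_background=2):
    """First-guess size for a db pool, to be refined by load testing.

    conn_threads          number of connection threads (e.g. maxthreads)
    db_fraction           assumed share of request time spent holding a handle
    handles_per_request   handles a request holds at once from this pool
    extra_for_background  assumed spare handles for scheduled procs etc.
    """
    if not 0.0 <= db_fraction <= 1.0:
        raise ValueError("db_fraction must be between 0 and 1")
    busy = conn_threads * db_fraction * handles_per_request
    # Never more than every thread holding all its handles at once.
    ceiling = conn_threads * handles_per_request
    return min(ceiling, math.ceil(round(busy, 9))) + extra_for_background


# Slow Tcl, fast queries: far fewer connections than threads.
print(estimate_db_connections(conn_threads=64, db_fraction=0.25))
# Handles held for the whole request: one per thread, plus the extras.
print(estimate_db_connections(conn_threads=64, db_fraction=1.0))
```

Comparing such an estimate with the number of connections the database actually reports under load shows whether a pool is oversized or whether threads are queueing for handles.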