Re: Too Much Context Switching?
Written by [EMAIL PROTECTED] on 06/30/08 12:58

[snip]

> It's actually been a long, slow, steady degradation of performance as best I can tell, that's recently just reached proportions that are so ridiculous that it's gone from "this sucks but I can deal" to "this is completely unusable". The system has been slow from the start, just not this slow.
>
> I guess I'll need to investigate this... and while I know that Python is somewhat off-topic, if anyone here has any suggestions on where to start, they'd be much appreciated. :-)
>
> Alex

As far as degradations over time are concerned, don't overlook your ZODB. If you don't pack it regularly and it grows to some ridiculous size, you can be in for a world of hurt.

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Too Much Context Switching?
On Monday 30 June 2008, cpghost wrote:

> You need to run ZEO if you want to make use of multiple CPUs in Zope. Here's a small HOWTO. It's for gentoo, but easily adaptable to FreeBSD:
>
> http://gentoo-wiki.com/HOWTO_ZEO/Zope_and_Plone
>
> Good luck optimizing the Beast! ;-)

This is *so* critically important that I can't overstress it. You *have* to use ZEO if you're running a busy Zope site. On our dual P4-Xeon system, I run 8 Zope instances and use Apache to spread the load across 7 of them (reserving the 8th for admin use) like so:

    $ cat /usr/local/etc/apache22/zope.txt
    zeoclients 9080|10080|11080|12080|13080|14080|15080

    $ cat mydomain.conf
    [...]
    # Load-balance the Zope servers
    RewriteMap zope rnd:/usr/local/etc/apache22/zope.txt
    RewriteRule ^/(.*) http://web2.daycos.com:${zope:zeoclients}/VirtualHostBase/http/web2.xrsnet.com:80/XRSnet/VirtualHostRoot/$1 [P]

On each request, Apache picks a random port from the list defined in zope.txt and proxies the request to that Zope process.

-- 
Kirk Strauser
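To make the random selection concrete, here is a minimal Python sketch of the idea behind the `rnd:` map in the message above: one key maps to a `|`-separated list of values, and each lookup returns one of them at random. This is not Apache's implementation; `parse_rnd_map` and `pick_backend` are hypothetical helper names, and only the `zeoclients` line is taken from the message.

```python
import random


def parse_rnd_map(text):
    """Parse 'key value1|value2|...' lines into {key: [values]}.

    Blank lines and lines starting with '#' are skipped.
    """
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, values = line.split(None, 1)
        table[key] = values.strip().split("|")
    return table


def pick_backend(table, key, rng=random):
    """Return one randomly chosen value for the given key."""
    return rng.choice(table[key])


# Contents of zope.txt as shown in the message.
ZOPE_TXT = "zeoclients 9080|10080|11080|12080|13080|14080|15080\n"

if __name__ == "__main__":
    table = parse_rnd_map(ZOPE_TXT)
    counts = {}
    for _ in range(7000):
        port = pick_backend(table, "zeoclients")
        counts[port] = counts.get(port, 0) + 1
    # With uniform random choice, each port should receive roughly 1/7 of the picks.
    for port in table["zeoclients"]:
        print(port, counts.get(port, 0))
```

Random selection keeps no state, so nothing ties a browser session to one Zope instance; that is acceptable here because all instances share the same ZEO storage.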
Re: Too Much Context Switching? - FIXED
[EMAIL PROTECTED] wrote:

[snip]

> I'll probably be upgrading to 7.0 in the next month or so, given that this is obviously a thread issue and that that release has much improved thread code. However, for the time being, the pressing issue is fixed, and for anyone in my position stuck on 6.2... this is night and day.

It has been over a year and a half or so since I last experimented with Zope. I only have 9 FreeBSD servers, but in my circumstances I've had good results with 7.

When you build Python, set HUGE_STACK_SIZE to yes. I believe thread support is already on by default.

Even when Python application code is written to be multithreaded, execution still goes through the Global Interpreter Lock. With this limitation in mind, should you still observe Zope/Python using only one core in an SMP machine, an alternative may be to see if you can run Zope as FastCGI and start more than one instance, i.e. one for each core.

Keep in mind that FastCGI brings in a whole new dimension of its own problems and instabilities. A problem that may arise in such a situation is loss of session if a request should switch instances somehow. Test for this if you can.

Before attempting this I would profile/bench the box as it is now. Since this might be considered experimental, I'd *not* use the production box unless you are in a downtime window and have sufficient time to play around. Best would be to try this kind of stuff on a second box and not mess with the production one, a la "if it works, don't FIX it!" :-)

If you succeed in getting multiple Zope instances using multiple cores, and have lots of RAM, you may also consider giving memcached a go.

As I said earlier, I'd only play with these ideas in a lab testing scenario and *not* on the production box.

YMMV

-Mike
Re: Too Much Context Switching?
[EMAIL PROTECTED] wrote:

> First off, thanks for such a prompt response. :-)

[snip]

>> A few hundred or thousand context switches per second is trivial load. That is not your problem. Modern CPUs can do hundreds of thousands per second before it starts to become a problem.
>
> OK, well that's good to know.
>
>> Note that your system is 50% idle and spending almost no time in the kernel. This basically means that only one core is doing work, which might be because you're not giving it enough work to do. There are only 1-2 running tasks for most of your trace, one of which is probably vmstat itself, so that means there is only one running server process (which can obviously only saturate at most 1 CPU).
>
> Actually, I decided to run vmstat this morning for a little while after turning off Zope, and during the couple of minutes I had it going, the number of processes running (as indicated by the leftmost column of vmstat's output) was at 0 for all but one line's worth of output, so I would guess that vmstat's not including itself in the number of processes there.
>
> Even so, though, your assessment about how saturated the CPU is is of course still valid, which leads me to a follow-up question: by default, can a multi-threaded app use both cores? Or would I need to have two instances of the process running (Zope is apparently able to handle multiple instances running reasonably well) in order to have it fully utilize the CPU?

In 6.x, the default thread library is quite inefficient, although it can make use of multiple CPUs (again, providing the application is giving them work to do). For multi-threaded performance you will be better off switching to the libthr library (see libmap.conf(5)) or updating to 7.0 (where it is the default).

This isn't likely to be the underlying issue if you are trying to debug a loss of performance relative to the same configuration in the past, though. However, it may well be that you can obtain better performance either by upgrading the OS or by tuning Zope to give a better work distribution.

>> The trace suggests that your performance problems are either in userland, or elsewhere in your network or application stack, possibly due to interactions between components. Try to look at why the system is not being given enough work to keep it saturated.
>
> Any tips on tools I could use to check this out? I'll of course be looking at Zope profiling tools, to see if I can have them tell me where any bottlenecks are, but if there are any OS-level tools that I could use to profile a given process (or group thereof) for problems, I'd really appreciate hearing about them (simple links to man pages or the like would be fine, I don't mean to waste your time explaining how tools work when I can usually figure it out on my own).

ktrace, tcpdump, hwpmc, the kernel audit system, and MUTEX_PROFILING/LOCK_PROFILING(9) are various utilities you can use to profile the system workload (probably in decreasing order of utility for you). Some of these are less usable in 6.x, though.

Kris
Re: Too Much Context Switching?
On Mon, 30 Jun 2008 10:48:25 -0400 [EMAIL PROTECTED] wrote:

[snip]

> Even so, though, your assessment about how saturated the CPU is is of course still valid, which leads me to a follow-up question: by default, can a multi-threaded app use both cores? Or would I need to have two instances of the process running (Zope is apparently able to handle multiple instances running reasonably well) in order to have it fully utilize the CPU?

You need to run ZEO if you want to make use of multiple CPUs in Zope. Here's a small HOWTO. It's for gentoo, but easily adaptable to FreeBSD:

http://gentoo-wiki.com/HOWTO_ZEO/Zope_and_Plone

Good luck optimizing the Beast! ;-)

> Alex Kirk

-cpghost.

-- 
Cordula's Web. http://www.cordula.ws/
Re: Too Much Context Switching?
Kris Kennaway wrote:

> In 6.x, the default thread library is quite inefficient, although it can make use of multiple CPUs (again, providing the application is giving them work to do). For multi-threaded performance you will be better off switching to the libthr library (see libmap.conf(5)) or updating to 7.0 (where it is the default).
>
> This isn't likely to be the underlying issue if you are trying to debug a loss of performance relative to the same configuration in the past, though.

Indeed, Plone is written in Python, and Python has a Big Giant Lock inside which ensures that only one thread can execute at a time, in order to protect the Python structures. This lock is only released under special circumstances, such as doing I/O. Hence, if one wants to make use of several CPUs, it is necessary to run several instances of Python programs and do synchronization work, or to use Python threads that immediately do some I/O, or similar techniques. It may be that using Jython, if possible, yields better threading behavior.

When doing some work along these lines, I found quite severe contention, and this was not cured by switching native threading libraries (libkse, libthr, etc.). The problem is really inside Python.

-- 
Michel TALON
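The effect of the interpreter lock described above can be observed with a small experiment. The following is a minimal sketch, assuming a standard CPython build with the Global Interpreter Lock and written in current Python 3 syntax rather than the Python 2.4 used in this thread; the function names and loop sizes are illustrative only.

```python
import threading
import time


def count_down(n):
    """Pure-Python CPU-bound loop; it does no I/O, so it never blocks and frees the lock that way."""
    while n > 0:
        n -= 1
    return n


def run_sequential(jobs, n):
    """Run the loop `jobs` times back to back and return the elapsed seconds."""
    start = time.perf_counter()
    for _ in range(jobs):
        count_down(n)
    return time.perf_counter() - start


def run_threaded(jobs, n):
    """Run the loop in `jobs` threads at once and return the elapsed seconds."""
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(jobs)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 2_000_000
    print("sequential: %.2fs" % run_sequential(2, n))
    print("threaded:   %.2fs" % run_threaded(2, n))
```

On a GIL build the two timings usually come out close to each other even on a multi-core machine, because only one thread executes Python bytecode at a time. Running the same work in separate processes avoids the shared lock, which is the reasoning behind running several Zope instances.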
Re: Too Much Context Switching?
Michel Talon wrote:

[snip]

> Indeed, Plone is written in Python, and Python has a Big Giant Lock inside which ensures that only one thread can execute at a time, in order to protect the Python structures. This lock is only released under special circumstances, such as doing I/O. Hence, if one wants to make use of several CPUs, it is necessary to run several instances of Python programs and do synchronization work, or to use Python threads that immediately do some I/O, or similar techniques. It may be that using Jython, if possible, yields better threading behavior.
>
> When doing some work along these lines, I found quite severe contention, and this was not cured by switching native threading libraries (libkse, libthr, etc.). The problem is really inside Python.

Yep, it could be that -- what confuses me though is that it is claimed that performance suddenly regressed. If so then this cannot be the underlying cause.

Kris
Re: Too Much Context Switching?
On Mon, Jun 30, 2008 at 07:53:00PM +0200, Kris Kennaway wrote:

> Yep, it could be that -- what confuses me though is that it is claimed that performance suddenly regressed. If so then this cannot be the underlying cause.

It may be that the load has grown to the point where contention causes throughput to regress rapidly.

-- 
Michel TALON
Re: Too Much Context Switching?
Michel Talon wrote:

[snip]

> It may be that the load has grown to the point where contention causes throughput to regress rapidly.

Yes, it could be that. I don't know off-hand whether multiple threads are counted separately by vmstat (at a guess I'd say no), but ps/top/etc should show how many are active in the python process.

Kris
Re: Too Much Context Switching?
[snip]

> Yep, it could be that -- what confuses me though is that it is claimed that performance suddenly regressed. If so then this cannot be the underlying cause.

It's actually been a long, slow, steady degradation of performance as best I can tell, that's recently just reached proportions that are so ridiculous that it's gone from "this sucks but I can deal" to "this is completely unusable". The system has been slow from the start, just not this slow.

I guess I'll need to investigate this... and while I know that Python is somewhat off-topic, if anyone here has any suggestions on where to start, they'd be much appreciated. :-)

Alex
Re: Too Much Context Switching?
[EMAIL PROTECTED] wrote:

> It's actually been a long, slow, steady degradation of performance as best I can tell, that's recently just reached proportions that are so ridiculous that it's gone from "this sucks but I can deal" to "this is completely unusable". The system has been slow from the start, just not this slow.
>
> I guess I'll need to investigate this... and while I know that Python is somewhat off-topic, if anyone here has any suggestions on where to start, they'd be much appreciated. :-)

If you want to factor FreeBSD out of the problem, try to do the exact same Plone stuff under a good and easy Linux distro, like Ubuntu, and you will know whether the problem is in Plone. In that case you have a workaround using a multiplexer, as someone else mentioned, assuming your machine has several cores and a lot of memory.

I am not an expert, but I have heard that Java frameworks have much better scalability, partly because threads are handled in a more reasonable way, and also because the JIT is very good. By the way, you can try to run Plone under psyco

http://psyco.sourceforge.net/

provided you have a lot of memory. I have seen good improvement for some Python programs with psyco.

I have found a speed comparison which may enlighten you here:

http://www.alrond.com/en/2007/jan/25/performance-test-of-6-leading-frameworks/

It has some remarks at the end which may help for Plone.

-- 
Michel TALON
Re: Too Much Context Switching?
Quoting Kris Kennaway [EMAIL PROTECTED]:

[snip]

> Yes, it could be that. I don't know off-hand whether multiple threads are counted separately by vmstat (at a guess I'd say no), but ps/top/etc should show how many are active in the python process.

Just ran ktrace, and a bit of Googling seems to confirm my initial suspicion that the results I'm seeing are abnormal. The first several screenfuls of output look like this:

    52929 python2.4 1214867016.469416 CALL kse_wakeup(0x811740c)
    52929 python2.4 0.60 RET kse_wakeup 0
    52929 python2.4 0.08 RET kse_release 0
    52929 python2.4 0.40 CALL kse_release(0x811df4c)
    52929 python2.4 0.000515 CALL kse_wakeup(0x811740c)
    52929 python2.4 0.12 RET kse_wakeup 0
    52929 python2.4 0.04 RET kse_release 0
    52929 python2.4 0.12 CALL kse_release(0x811df4c)
    52929 python2.4 0.000365 CALL kse_wakeup(0x811740c)
    52929 python2.4 0.12 RET kse_wakeup 0
    52929 python2.4 0.03 RET kse_release 0
    52929 python2.4 0.10 CALL kse_release(0x811df4c)
    52929 python2.4 0.000413 CALL kse_wakeup(0x811740c)
    52929 python2.4 0.11 RET kse_wakeup 0
    52929 python2.4 0.04 RET kse_release 0
    52929 python2.4 0.09 CALL kse_release(0x811df4c)
    52929 python2.4 0.000393 CALL kse_wakeup(0x811740c)
    52929 python2.4 0.12 RET kse_wakeup 0
    52929 python2.4 0.04 RET kse_release 0
    52929 python2.4 0.09 CALL kse_release(0x811df4c)

I may be mistaken, but it seems like that's a lot of unnecessary activity managing the threads; the confirmation I found came from http://arkiv.freebsd.se/?ml=freebsd-threads&a=2007-02&t=3178634.

Am I correct that this is abnormal behavior? If so, any idea what I may need to do to fix the issue?

Alex
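When a trace runs to many screenfuls, counting the calls is easier than reading them. Here is a minimal Python sketch that tallies `CALL` records per system call name. It assumes the column layout shown in the excerpt above (pid, command, timestamp, record type, details); `count_calls` is a hypothetical helper, and the sample text is just the first five lines of that excerpt.

```python
from collections import Counter

# First five lines of the trace excerpt, copied verbatim.
SAMPLE = """\
52929 python2.4 1214867016.469416 CALL kse_wakeup(0x811740c)
52929 python2.4 0.60 RET kse_wakeup 0
52929 python2.4 0.08 RET kse_release 0
52929 python2.4 0.40 CALL kse_release(0x811df4c)
52929 python2.4 0.000515 CALL kse_wakeup(0x811740c)
"""


def count_calls(trace_text):
    """Count CALL records per system call name in kdump-style text."""
    counts = Counter()
    for line in trace_text.splitlines():
        fields = line.split()
        # Assumed layout: pid, command, timestamp, record type, details...
        if len(fields) >= 5 and fields[3] == "CALL":
            name = fields[4].split("(", 1)[0]
            counts[name] += 1
    return counts


if __name__ == "__main__":
    for name, total in count_calls(SAMPLE).most_common():
        print(name, total)
```

A trace dominated by thread-management calls such as `kse_wakeup` and `kse_release`, with little else in between, points at the threading layer rather than at application I/O.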
Re: Too Much Context Switching?
[EMAIL PROTECTED] wrote:

[snip]

> I may be mistaken, but it seems like that's a lot of unnecessary activity managing the threads; the confirmation I found came from http://arkiv.freebsd.se/?ml=freebsd-threads&a=2007-02&t=3178634.
>
> Am I correct that this is abnormal behavior? If so, any idea what I may need to do to fix the issue?

Looks exactly like the python thread problem Michel described. You will get some improvement by switching to libthr and/or updating to 7.0 as I discussed, but ultimately you're hitting limits of python, not FreeBSD.

Kris
Re: Too Much Context Switching? - FIXED
Quoting Kris Kennaway [EMAIL PROTECTED]:

[snip]

>> I may be mistaken, but it seems like that's a lot of unnecessary activity managing the threads; the confirmation I found came from http://arkiv.freebsd.se/?ml=freebsd-threads&a=2007-02&t=3178634.
>>
>> Am I correct that this is abnormal behavior? If so, any idea what I may need to do to fix the issue?
>
> Looks exactly like the python thread problem Michel described. You will get some improvement by switching to libthr and/or updating to 7.0 as I discussed, but ultimately you're hitting limits of python, not FreeBSD.

WOW... it's *amazing* how much of a difference a single sysctl can make. I went ahead and set kern.threads.virtual_cpu=1, as suggested in the thread above, and the difference is ridiculous -- Zope is now faster than I've ever seen. More importantly, my ktracing shows that all of the kse_* garbage is now gone.

I'll probably be upgrading to 7.0 in the next month or so, given that this is obviously a thread issue and that that release has much improved thread code. However, for the time being, the pressing issue is fixed, and for anyone in my position stuck on 6.2... this is night and day.

Alex
Re: Too Much Context Switching? - FIXED
[EMAIL PROTECTED] wrote:

[snip]

> WOW... it's *amazing* how much of a difference a single sysctl can make. I went ahead and set kern.threads.virtual_cpu=1, as suggested in the thread above, and the difference is ridiculous -- Zope is now faster than I've ever seen. More importantly, my ktracing shows that all of the kse_* garbage is now gone.
>
> I'll probably be upgrading to 7.0 in the next month or so, given that this is obviously a thread issue and that that release has much improved thread code. However, for the time being, the pressing issue is fixed, and for anyone in my position stuck on 6.2... this is night and day.

Seriously, try libthr. No matter what you do to libkse it is going to suck. That's why we removed it.

Kris
Re: Too Much Context Switching? - FIXED
Quoting Kris Kennaway [EMAIL PROTECTED]:

[snip]

> Seriously, try libthr. No matter what you do to libkse it is going to suck. That's why we removed it.

I will, probably as part of upgrading to 7.0 (which I may accelerate, given this point). I'm just ecstatic at the difference I'm already seeing, and specifically wanted to make note of it in the archives. Point very much taken, though. :-)

Alex
Re: Too Much Context Switching? - FIXED
I will, probably as part of upgrading to 7.0 (which I may accelerate, given this point). I'm just ecstatic at the difference I'm already seeing, and specifically wanted to make note of it in the archives. Point very much taken, though. :-)

It's trivial to change to libthr, as pointed out earlier in this thread. You simply add an entry/entries to /etc/libmap.conf (see man libmap.conf for details) and then restart whatever it is that is currently running against libkse.

I'll second Kris' recommendation to move to libthr. I saw a drastic improvement in MySQL and ffmpeg performance on 6.2 when I switched from libkse to libthr. Certainly 7.0 would give it to you automatically, but there's no reason not to use libmap to use it now, as an interim solution.

Josh
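A sketch of what such a libmap.conf entry might look like. The library names and version numbers below are assumptions for a 6.x system and are not given anywhere in this thread; check what the binary actually links against with ldd, and what is installed under /usr/lib, before copying anything.

```
# /etc/libmap.conf
# Assumption: the default (KSE-based) threading library is installed as
# libpthread.so.2 and libthr as libthr.so.2. Verify both on your system.
libpthread.so.2    libthr.so.2
libpthread.so      libthr.so
```

libmap.conf(5) also describes how to restrict a mapping to a single executable, which may be the more cautious route on a box that runs web, database, and mail together.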
Re: Too Much Context Switching?
[EMAIL PROTECTED] wrote:

I'm the webmaster for www.marssociety.org, which is a FreeBSD 6.2-RELEASE box running on a dual-core AMD Opteron setup with 4GB of RAM. The box is reasonably busy, as it's the sole piece of hardware running web, database, and mail operations for the Mars Society, an international nonprofit group dedicated to space exploration. We regularly send out newsletters to ~10,000 members, and our web site is averaging ~50,000-100,000 hits/day.

The main portion of the web site is run via the Zope/Plone CMS system (Plone 2.5, for anyone who may care). Recently, it's been slowing down dramatically, and our Plone guy (not me -- I inherited the system and can't stand it) can't figure out why. I've been diving into OS-related issues, and in so doing, I ran across what appears to be a very high number of context switches going on. Here's some sample output from vmstat 2:

A few hundred or thousand context switches per second is trivial load. That is not your problem. Modern CPUs can do hundreds of thousands per second before it starts to become a problem.

Note that your system is 50% idle and spending almost no time in the kernel. This basically means that only one core is doing work, which might be because you're not giving it enough work to do. There are only 1-2 running tasks for most of your trace, one of which is probably vmstat itself, so that means there is only one running server process (which can obviously only saturate at most 1 CPU).

The trace suggests that your performance problems are either in userland, or elsewhere in your network or application stack, possibly due to interactions between components. Try to look at why the system is not being given enough work to keep it saturated.

Kris
Re: Too Much Context Switching?
First off, thanks for such a prompt response. :-)

[EMAIL PROTECTED] wrote:

I'm the webmaster for www.marssociety.org, which is a FreeBSD 6.2-RELEASE box running on a dual-core AMD Opteron setup with 4GB of RAM. The box is reasonably busy, as it's the sole piece of hardware running web, database, and mail operations for the Mars Society, an international nonprofit group dedicated to space exploration. We regularly send out newsletters to ~10,000 members, and our web site is averaging ~50,000-100,000 hits/day.

The main portion of the web site is run via the Zope/Plone CMS system (Plone 2.5, for anyone who may care). Recently, it's been slowing down dramatically, and our Plone guy (not me -- I inherited the system and can't stand it) can't figure out why. I've been diving into OS-related issues, and in so doing, I ran across what appears to be a very high number of context switches going on. Here's some sample output from vmstat 2:

A few hundred or thousand context switches per second is trivial load. That is not your problem. Modern CPUs can do hundreds of thousands per second before it starts to become a problem.

OK, well that's good to know.

Note that your system is 50% idle and spending almost no time in the kernel. This basically means that only one core is doing work, which might be because you're not giving it enough work to do. There are only 1-2 running tasks for most of your trace, one of which is probably vmstat itself, so that means there is only one running server process (which can obviously only saturate at most 1 CPU).

Actually, I decided to run vmstat this morning for a little while after turning off Zope, and during the couple of minutes I had it going, the number of processes running (as indicated by the leftmost column of vmstat's output) was at 0 for all but one line worth of output, so I would guess that vmstat's not including itself in the number of processes there.
Even so, though, your assessment about how saturated the CPU is is of course still valid, which leads me to a follow-up question: by default, can a multi-threaded app use both cores? Or would I need to have two instances of the process running (Zope is apparently able to handle multiple instances running reasonably well) in order to have it fully utilize the CPU?

The trace suggests that your performance problems are either in userland, or elsewhere in your network or application stack, possibly due to interactions between components. Try to look at why the system is not being given enough work to keep it saturated.

Any tips on tools I could use to check this out? I'll of course be looking at Zope profiling tools, to see if I can have them tell me where any bottlenecks are, but if there are any OS-level tools that I could use to profile a given process (or group thereof) for problems, I'd really appreciate hearing about them (simple links to man pages or the like would be fine, I don't mean to waste your time explaining how tools work when I can usually figure it out on my own).

Alex Kirk
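As an OS-level starting point, ktrace(1) and kdump(1) (the tools used elsewhere in this thread) can record what a running process asks of the kernel. A rough sketch follows, assuming the flags behave as documented in those man pages; trace_pid, its arguments, and the default file name are hypothetical names chosen for illustration.

```shell
# Hypothetical helper: trace one running process for a few seconds,
# then print the recorded kernel events.
# Usage: trace_pid PID [SECONDS] [TRACEFILE]
trace_pid() {
    tp_pid=$1
    tp_secs=${2:-5}
    tp_out=${3:-ktrace.out}

    ktrace -f "$tp_out" -p "$tp_pid" || return 1   # start tracing the process
    sleep "$tp_secs"                               # let it run under trace
    ktrace -c -p "$tp_pid"                         # clear the trace points again
    kdump -R -f "$tp_out"                          # dump with relative timestamps
}
```

For example, `trace_pid 52929 5 | less` would trace the python2.4 process from the output quoted in this thread for five seconds; tracing a busy process for long can produce a very large file.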