Re: Aussie is *slow*
Pavel Sanda <[EMAIL PROTECTED]> writes:

> our main config change is somewhere around 3000. we have exceeded
> the load of 10 a few times since then but the system recovered
> immediately. so i guess the slowness issue is resolved.

Looks like it indeed. Congrats.

JMarc
Re: Aussie is *slow*
> i'll keep my logger running on aussie for some time, to see if we solved this
> for longer time periods.

fyi i regenerated the load graph after a few weeks of running:

http://195.113.31.123/~sanda/junk/aussie_load_whole_history.png

our main config change is somewhere around 3000. we have exceeded the load of 10 a few times since then, but the system recovered immediately. so i guess the slowness issue is resolved.

pavel
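For readers who want to reproduce this kind of long-running load record: the logger used on aussie is not shown anywhere in the thread, so the following is only a minimal sketch of one way to do it. The function names, the log format and the 10-minute sampling interval are all assumptions made for illustration.

```python
import os
import time


def format_sample(timestamp, load1, load5, load15):
    """Format one sample as 'epoch load1 load5 load15' (hypothetical log format)."""
    return "%d %.2f %.2f %.2f" % (timestamp, load1, load5, load15)


def log_load(path, interval_seconds=600, samples=None):
    """Append load-average samples to `path`.

    interval_seconds=600 is an assumed 10-minute sampling period.
    samples=None means run until interrupted.
    """
    count = 0
    while samples is None or count < samples:
        # os.getloadavg() returns the 1, 5 and 15 minute load averages (Unix only).
        load1, load5, load15 = os.getloadavg()
        with open(path, "a") as handle:
            handle.write(format_sample(time.time(), load1, load5, load15) + "\n")
        count += 1
        if samples is None or count < samples:
            time.sleep(interval_seconds)
```

A file in this format can be plotted later with any tool that reads whitespace-separated columns.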
Re: Aussie is *slow*
Pavel Sanda <[EMAIL PROTECTED]> writes:

> i would give it a try, but you are the root here :)
> (i'll make some stress test again to see what will happen after such a change)

OK, make spamd children=2, max httpd processes = 15.

JMarc
Re: Aussie is *slow*
Jean-Marc Lasgouttes <[EMAIL PROTECTED]> writes:

>> i would give it a try, but you are the root here :)
>> (i'll make some stress test again to see what will happen after such a
>> change)
>
> OK, make spamd children=2, max httpd processes = 15.

I mean that I just did that.

JMarc
Re: Aussie is *slow*
> >> i would give it a try, but you are the root here :)
> >> (i'll make some stress test again to see what will happen after such a
> >> change)
> >
> > OK, make spamd children=2, max httpd processes = 15.

don't know if this is going to be the permanent state, but i'm not able to do the stress test now! trac became so much faster that before i click on another link, the new tab is already loaded. so i tried to load the po diffs to make the load time longer, and instead of freezing aussie my firefox got overloaded :)

i'll keep my logger running on aussie for some time, to see if we solved this for longer time periods.

pavel
Re: Aussie is *slow*
> > JMarc, do you see some problem with the lowering of both services
> > children? please at least restart them, aussie is unusable the whole
> > day and both services seem get into mutual lockup.
>
> Looks like everything is OK now (I did nothing today). Do you still
> want me to lower the two children numbers?

i would give it a try, but you are the root here :)
(i'll run a stress test again to see what happens after such a change)

pavel
Re: Aussie is *slow*
Pavel Sanda <[EMAIL PROTECTED]> writes:

> JMarc, do you see some problem with the lowering of both services
> children? please at least restart them, aussie is unusable the whole
> day and both services seem get into mutual lockup.

Looks like everything is OK now (I did nothing today). Do you still want me to lower the two children numbers?

JMarc
Re: Aussie is *slow*
Abdelrazak Younes <[EMAIL PROTECTED]> writes:

> Couldn't you at least upgrade some of the components? Trac is at 10.4
> and we are still using 10.2, the changelog seems to say that the fix
> are important:
>
> http://trac.edgewall.org/wiki/ChangeLog

I am not sure the fixes would make a difference for our particular case.

> SpamAssassin is at 3.0.6, the last version is 3.2.3.

Can we do that without reinstalling a newer linux distribution?

JMarc
Re: Aussie is *slow*
Jean-Marc Lasgouttes wrote:
> Abdelrazak Younes <[EMAIL PROTECTED]> writes:
>> Couldn't you at least upgrade some of the components? Trac is at 10.4
>> and we are still using 10.2, the changelog seems to say that the fix
>> are important:
>>
>> http://trac.edgewall.org/wiki/ChangeLog
>
> I am not sure the fixes would make a difference for our particular case.

Maybe the cache improvement will lower the load on http and svn? From the Changelog:

0.10.4
 * Repository cache improvements. The new syncing scheme is incompatible with the previous one and requires a database schema upgrade in order to prevent the old and the new codebase to be mixed. A repository resync is not needed, though. The 0.10.4 scheme is compatible with the 0.11 one. (#3837, #4043 and #4586)
 * Fix a possible freeze under heavy load (#4465)

0.10.3:
 * Subversion repository resync broken. (#4204).

>> SpamAssassin is at 3.0.6, the last version is 3.2.3.
>
> Can we do that without reinstalling a newer linux distribution?

I would be very surprised if not. Or do you mean that the latest version is not available for FC4? In the good old Slackware days, it was easy to upgrade anything.

Abdel.
Re: Aussie is *slow*
Jürgen Spitzmüller wrote:
>> I noticed/suspected too that you were browsing trac at the time where the
>> bump happened on the graph. Do you use rss feeds?
>
> No, I just browsed it with a plain web browser (konqueror). I used nothing
> but the standard view, where the changes are colored.

However, the Konqueror has a status bar applet that searches for available RSS feeds. Maybe this is the culprit. Do we need RSS feed support, or can we just try to disable that?

Jürgen
Re: Aussie is *slow*
Pavel Sanda <[EMAIL PROTECTED]> writes:

> this is _far_too_high_ and i suggest we should still go down with
> the number of maxclients; there is no point in allowing 24 apaches
> when their only work is swapping the whole box to death.
>
> i have observed that 8 processes are able to work on some 0.x load, so this is
> the lower bound and i would say lets put the higher bound somewhere between
> 12-15.

Shall we do something about

MaxRequestsPerChild 50

JMarc
Re: Aussie is *slow*
Jean-Marc Lasgouttes wrote:
> Can we do that without reinstalling a newer linux distribution?

Why not upgrade to the latest Fedora, by the way? I guess (hope) that Fedora provides an easy upgrade without the need to reinstall, doesn't it?

I certainly wouldn't mind a few hours of unavailability if we get rid of those problems that eat up our time every couple of weeks. If Lars could take this occasion to put in some more RAM, that would be even better.

Abdel.
Re: Aussie is *slow*
Abdelrazak Younes <[EMAIL PROTECTED]> writes:

> Jean-Marc Lasgouttes wrote:
>> Can we do that without reinstalling a newer linux distribution?
>
> Why not upgrading to the latest Fedora by the way? I guess (hope) that
> Fedora provide easy upgrade without the need to reinstall, doesn't it?

I would not try that without physical access. But if Lars can upgrade it, it would be great.

JMarc
Re: Aussie is *slow*
On Wed, Feb 13, 2008 at 10:18:37AM +0100, Abdelrazak Younes wrote:
> Jean-Marc Lasgouttes wrote:
> >Can we do that without reinstalling a newer linux distribution?
>
> Why not upgrading to the latest Fedora by the way? I guess (hope) that
> Fedora provide easy upgrade without the need to reinstall, doesn't it?

You can at least upgrade by incrementing the release one at a time. Somewhere they even have an archive of the older releases, so you can do it step by step. I would be a bit careful about going straight from FC4 to FC8, or at least try it at home first. My Fedora experience is a bit limited anyway.

Cheers,
Sven

-- 
If God passed a mic to me to speak
I'd say stay in bed, world
Sleep in peace
[The Cardigans - 03:45: No sleep]
Re: Aussie is *slow*
> > this is _far_too_high_ and i suggest we should still go down with
> > the number of maxclients; there is no point in allowing 24 apaches
> > when their only work is swapping the whole box to death.
> >
> > i have observed that 8 processes are able to work on some 0.x load, so this
> > is the lower bound and i would say lets put the higher bound somewhere
> > between 12-15.
>
> Shall we do something about
> MaxRequestsPerChild 50

if i understand it correctly, this parameter will speed up the return to the normal (<=8) number of children, and its point is mainly to limit the effect of memory leaks.

pavel
Re: Aussie is *slow*
> > > this is _far_too_high_ and i suggest we should still go down with
> > > the number of maxclients; there is no point in allowing 24 apaches
> > > when their only work is swapping the whole box to death.
> > >
> > > i have observed that 8 processes are able to work on some 0.x load, so
> > > this is the lower bound and i would say lets put the higher bound
> > > somewhere between 12-15.
> >
> > Shall we do something about
> > MaxRequestsPerChild 50
>
> if i understand it correctly this parameter will speed up return to normal
> (<=8) number of children and its point is mainly preventing memory leaks.

i would also be interested in what happens to today's graph if you decrease the number of spamd children by one.

pavel
Re: Aussie is *slow*
Abdelrazak Younes wrote:
> Jean-Marc Lasgouttes wrote:
>> Can we do that without reinstalling a newer linux distribution?
>
> Why not upgrading to the latest Fedora by the way? I guess (hope) that
> Fedora provide easy upgrade without the need to reinstall, doesn't it?

This is complicated under Fedora. There are ways to upgrade "on the fly", then reboot, but this is not trivial and can lead to problems. It's one of the less desirable facts about Fedora. The approved method is to reboot using an install disk and then choose "upgrade". So it might be hard to do this without physical access.

Richard
Re: Aussie is *slow*
[EMAIL PROTECTED] wrote:
> On Wed, 13 Feb 2008, Pavel Sanda wrote:
>
>> so adjusting some 12 children wont have effect on normal traffic while
>> could significantly inhibit swapping when somebody starts playing with
>> trac.
>
> Regarding Trac, the "user base" might be larger than you think as some of
> the wiki pages embed files from the repository using Trac. So if one or
> more users frequently look at such wiki pages, it will also indirectly
> invoke Trac (just to extract the relevant file of course).

I wonder if there's not some way to cache a lot of this.

rh
Re: Aussie is *slow*
On Wed, Feb 13, 2008 at 08:13:43AM +0100, [EMAIL PROTECTED] wrote:
> On Wed, 13 Feb 2008, Pavel Sanda wrote:
>
>> so adjusting some 12 children wont have effect on normal traffic while
>> could significantly inhibit swapping when somebody starts playing with
>> trac.
>
> Regarding Trac, the "user base" might be larger than you think as some of
> the wiki pages embed files from the repository using Trac. So if one or
> more users frequently look at such wiki pages, it will also indirectly
> invoke Trac (just to extract the relevant file of course).

Hm.. could these pages be turned into something more static, e.g. by manually extracting the files using trac?

Andre'
Re: Aussie is *slow*
José Matos wrote:
> On Wednesday 13 February 2008 17:05:08 Richard Heck wrote:
>> This is complicated under Fedora. There are ways to upgrade "on the
>> fly", then reboot, but this is not trivial and can lead to problems.
>> It's one of the less desirable facts about Fedora. The approved method
>> is to reboot using an install disk and then choose "upgrade". So it
>> might be hard to do this without physical access.
>
> Actually all those troubles are documented. :-)
>
> I know because I have updated using yum since FC-4. This works quite
> well for the n -> n+1 jump.

I've tried this, too, and it might be OK for aussie, if it weren't that we were so far behind at this point. The problems I've had have mostly had to do with extra packages being left lying around, maybe because at that time I was using rpmforge packages quite heavily. Now that's basically dead, from what I can tell, and I'm using livna, so I'm hoping future upgrades will be less painful.

rh
Re: Aussie is *slow*
On Wednesday 13 February 2008 17:05:08 Richard Heck wrote:
> This is complicated under Fedora. There are ways to upgrade "on the
> fly", then reboot, but this is not trivial and can lead to problems.
> It's one of the less desirable facts about Fedora. The approved method
> is to reboot using an install disk and then choose "upgrade". So it
> might be hard to do this without physical access.

Actually all those troubles are documented. :-)

I know because I have updated using yum since FC-4. This works quite well for the n -> n+1 jump. The n -> n+2 works although it can be a bit tricky because of the package dependency changes. The n -> n+4 is a leap of faith. :-)

> Richard

-- 
José Abílio
Re: Aussie is *slow*
On Wed, 13 Feb 2008, Richard Heck wrote:

> [EMAIL PROTECTED] wrote:
>> On Wed, 13 Feb 2008, Pavel Sanda wrote:
>>
>>> so adjusting some 12 children wont have effect on normal traffic while
>>> could significantly inhibit swapping when somebody starts playing with
>>> trac.
>>
>> Regarding Trac, the "user base" might be larger than you think as some of
>> the wiki pages embed files from the repository using Trac. So if one or
>> more users frequently look at such wiki pages, it will also indirectly
>> invoke Trac (just to extract the relevant file of course).
>
> I wonder if there's not some way to cache a lot of this.

Sure, it could be done like this for instance.

* Take the path to the file in question, convert it to a hashed filename.
* Check the cache directory for a file with this hashed filename
* If it doesn't exist or is older than some threshold, save/download to cache directory
* Render the page using the file from the cache directory

A (minor?) problem with this approach is choosing a good threshold value.

Don't know how long this would take to implement, but the MimeTeX recipe already has this framework as it's used to cache the images that are generated based on math formulas.

/Christian

-- 
Christian Ridderström, +46-8-768 39 44 http://www.md.kth.se/~chr
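The four steps listed in this message can be sketched roughly as follows. This is only an illustration in Python and not the MimeTeX recipe's code: the function names, the choice of SHA-1 for the hashed filename, the one-hour default threshold and the `fetch` callable (a stand-in for whatever actually extracts the file via Trac) are all assumptions.

```python
import hashlib
import os
import time


def cache_path(source_path, cache_dir):
    """Step 1: turn the repository path into a hashed filename."""
    digest = hashlib.sha1(source_path.encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, digest)


def is_fresh(cache_file, max_age_seconds, now=None):
    """Steps 2 and 3: the entry must exist and be no older than the threshold."""
    if not os.path.exists(cache_file):
        return False
    now = time.time() if now is None else now
    return now - os.path.getmtime(cache_file) <= max_age_seconds


def get_cached(source_path, fetch, cache_dir, max_age_seconds=3600):
    """Return the file content, calling fetch(source_path) only on a miss.

    `fetch` is a hypothetical callable returning bytes.
    """
    os.makedirs(cache_dir, exist_ok=True)
    cache_file = cache_path(source_path, cache_dir)
    if not is_fresh(cache_file, max_age_seconds):
        data = fetch(source_path)
        with open(cache_file, "wb") as handle:
            handle.write(data)
    # Step 4: the page is rendered from the cached copy.
    with open(cache_file, "rb") as handle:
        return handle.read()
```

The threshold trade-off mentioned in the message shows up as `max_age_seconds`: a large value spares Trac more requests but serves stale files for longer after a commit.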
Re: Aussie is *slow*
> > > > this is _far_too_high_ and i suggest we should still go down with
> > > > the number of maxclients; there is no point in allowing 24 apaches
> > > > when their only work is swapping the whole box to death.
> > > >
> > > > i have observed that 8 processes are able to work on some 0.x load, so
> > > > this is the lower bound and i would say lets put the higher bound
> > > > somewhere between 12-15.
> > >
> > > Shall we do something about
> > > MaxRequestsPerChild 50
> >
> > if i understand it correctly this parameter will speed up return to normal
> > (<=8) number of children and its point is mainly preventing memory leaks.
>
> i would be also interested what will happen with our todays graph if you
> decrease the number of spam children by one.

seeing the evolution today, both spamd (500mb vss) and httpd (800mb vss) went completely crazy. JMarc, do you see any problem with lowering the number of children for both services? please at least restart them; aussie has been unusable the whole day and both services seem to have got into a mutual lockup.

thanks,
pavel
Re: Aussie is *slow*
On Wed, 13 Feb 2008, [EMAIL PROTECTED] wrote:

> Sure, it could be done like this for instance.
>
> * Take the path to the file in question, convert it to a hashed filename.
> * Check the cache directory for a file with this hashed filename
> * If it doesn't exist or is older than some threshold, save/download to cache directory
> * Render the page using the file from the cache directory
>
> A (minor?) problem with this approach is choosing a good threshold value.
>
> Don't know how long this would take to implement, but the MimeTeX recipe
> already has this framework as it's used to cache the images that are
> generated based on math formulas.

Let me know if you think this is needed, otherwise I'll let it rest for now.

/Christian

-- 
Christian Ridderström, +46-8-768 39 44 http://www.md.kth.se/~chr
Re: Aussie is *slow*
> > Should be done now.
>
> yep, i'm gonna test it.

i used trac to run a stress test on the system. i sent a lot of various requests to certain trac pages for about 5 min and waited about 10 min until all the pages had shown up in the browser. in that time aussie reached 24 httpd processes (you can see the peak around t=2120).

after this peak it took the system approximately an hour to get back to 8 httpd processes and a reasonable load (0.3 at the time i'm writing this). during that hour i watched the apache logs and there was no extraordinary traffic which could be responsible for a load ranging between 10 and 20. so i conclude the hour was needed just for (un)swapping and for cleaning up processes, which finally led back to a low load and a low number of httpd processes.

this is _far_too_high_ and i suggest we should still go down with the number of maxclients; there is no point in allowing 24 apaches when their only work is swapping the whole box to death.

i have observed that 8 processes are able to work at some 0.x load, so this is the lower bound, and i would say let's put the upper bound somewhere between 12 and 15.

pavel
Re: Aussie is *slow*
> i have observed that 8 processes are able to work on some 0.x load, so this
> is the lower bound and i would say lets put the higher bound somewhere
> between 12-15.

... more thinking about this...

another justification for the numbers above could go like this: in the stress test, httpd vss memory with 24 children reached its maximum somewhere around 900mb. if we take 12 children, their vss memory would be somewhere around 450mb. when you look at the picture with the record of the last seven days and draw a horizontal line at 450mb, you will find that there are only two single measurements (i.e. a time period of at most 10 mins) when usual traffic on the server forced apache to fork this number of children.

so limiting apache to some 12 children won't have an effect on normal traffic, while it could significantly inhibit swapping when somebody starts playing with trac.

pavel
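The arithmetic in this message is a simple linear scaling, which can be written out as a small sketch. The helper names below are made up for illustration, and the linear model itself is an assumption: VSS counts memory shared between httpd children more than once, so the real footprint of 12 children may differ from half the footprint of 24.

```python
def vss_per_child_mb(total_vss_mb, children):
    """Average VSS per httpd child, from one observed total."""
    return total_vss_mb / children


def estimated_vss_mb(per_child_mb, children):
    """Linear estimate of the total VSS for a given number of children."""
    return per_child_mb * children


def max_children_for_budget(per_child_mb, budget_mb):
    """Largest child count whose estimated VSS stays within the budget."""
    return int(budget_mb // per_child_mb)


# Numbers quoted in the thread: about 900mb VSS with 24 children.
per_child = vss_per_child_mb(900, 24)
print(per_child)                                 # average per child, in mb
print(estimated_vss_mb(per_child, 12))           # estimate for 12 children
print(max_children_for_budget(per_child, 450))   # children fitting in 450mb
```

Turning the calculation around, as in the last helper, gives a MaxClients candidate from a memory budget instead of a memory estimate from a MaxClients value.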
Re: Aussie is *slow*
Pavel Sanda [EMAIL PROTECTED] writes:

> JMarc, could you try to change the line in httpd.conf
> MaxClients 48
> into eg 24 and restart httpd? we can give it some trac-test to see how
> aussie will manage it.

Should be done now.

JMarc
Re: Aussie is *slow*
Should be done now. yep, i'm gonna test it. pavel
Re: Aussie is *slow*
On Tue, Feb 12, 2008 at 06:22:02PM +0100, Pavel Sanda wrote: I noticed/suspected too that you were browsing trac at the time where the bump happened on the graph. Do you use rss feeds? No, I just browsed it with a plain web browser (konqueror). I used nothing but the standard view, where the changes are colored. BTW WebSVN looks like a good alternative, and it seems easy to install. i would firstly just try to upgrade trac or change httpd settings. The whole system needs some major upgrading. If the kernel is as old as the running Apache... uh no, I'd better stop thinking about it. Maybe mod_fastcgi could help as well if it's not used already. Sven -- If God passed a mic to me to speak I'd say stay in bed, world Sleep in peace [The Cardigans - 03:45: No sleep]
Re: Aussie is *slow*
Jürgen Spitzmüller wrote: I noticed/suspected too that you were browsing trac at the time where the bump happened on the graph. Do you use rss feeds? No, I just browsed it with a plain web browser (konqueror). I used nothing but the standard view, where the changes are colored. BTW WebSVN looks like a good alternative, and it seems easy to install. Jürgen
Re: Aussie is *slow*
Pavel Sanda [EMAIL PROTECTED] writes: look on my reply to Sven. i guess many medium-sized httpd processes is the cause. now i have caught the peak online Is there a way to know what these httpd children do? JMarc
Re: Aussie is *slow*
look on my reply to Sven. i guess many medium-sized httpd processes is the cause.

now i have caught the peak online

load average: 20.40, 22.81, 17.49

30277  0.0  1.1 22208  2836 ?  Ss  Feb07  0:06 /usr/sbin/httpd
10213  0.6  1.2 29972  3124 ?  S   17:47  0:07 /usr/sbin/httpd
10254  0.1  1.2 29376  3116 ?  S   17:47  0:02 /usr/sbin/httpd
10255  0.3  1.9 29772  5052 ?  D   17:47  0:03 /usr/sbin/httpd
10324  0.5  1.5 30312  3936 ?  S   17:49  0:05 /usr/sbin/httpd
10337  0.1  1.2 29192  3124 ?  S   17:49  0:01 /usr/sbin/httpd
10338  0.6  1.6 30400  4204 ?  S   17:49  0:06 /usr/sbin/httpd
10339  0.3  1.3 30340  3512 ?  S   17:49  0:03 /usr/sbin/httpd
10344  0.1  1.2 29044  3144 ?  S   17:49  0:01 /usr/sbin/httpd
10347  0.1  1.1 29180  3028 ?  S   17:49  0:01 /usr/sbin/httpd
10351  0.1  1.2 29188  3144 ?  S   17:49  0:01 /usr/sbin/httpd
10358  0.7  1.3 30672  3484 ?  S   17:50  0:06 /usr/sbin/httpd
10361  0.4  1.4 30184  3652 ?  S   17:50  0:04 /usr/sbin/httpd
10366  0.7  3.6 30396  9448 ?  S   17:50  0:06 /usr/sbin/httpd
10372  0.4  2.7 30344  6976 ?  S   17:50  0:04 /usr/sbin/httpd
10380  0.3  1.3 31956  3532 ?  S   17:51  0:03 /usr/sbin/httpd
10385  1.2  3.9 30692 10164 ?  R   17:51  0:11 /usr/sbin/httpd
10391  0.1  1.1 29392  2992 ?  S   17:51  0:01 /usr/sbin/httpd
10394  0.0  1.1 28008  3004 ?  S   17:51  0:00 /usr/sbin/httpd
10421  0.2  1.2 30112  3188 ?  S   17:52  0:02 /usr/sbin/httpd
10434  0.4  1.5 30252  3928 ?  S   17:52  0:03 /usr/sbin/httpd
10436  0.1  1.2 28876  3220 ?  S   17:52  0:01 /usr/sbin/httpd
10438  0.4  3.3 30016  8628 ?  D   17:52  0:03 /usr/sbin/httpd
10439  0.2  3.2 29064  8204 ?  D   17:52  0:02 /usr/sbin/httpd
10440  0.1  1.1 29304  3064 ?  S   17:52  0:01 /usr/sbin/httpd
10441  0.2  1.3 29692  3576 ?  S   17:52  0:02 /usr/sbin/httpd
10443  0.1  1.2 29712  3188 ?  S   17:52  0:01 /usr/sbin/httpd
10454  0.1  1.3 28680  3536 ?  D   17:53  0:01 /usr/sbin/httpd
10455  0.4  6.4 35940 16416 ?  S   17:53  0:03 /usr/sbin/httpd
10456  0.1  1.1 29316  3060 ?  S   17:53  0:00 /usr/sbin/httpd
10461  0.0  1.2 26864  3092 ?  S   17:53  0:00 /usr/sbin/httpd
10470  0.2  1.2 28880  3192 ?  S   17:53  0:01 /usr/sbin/httpd
10476  1.0  3.8 30548  9800 ?  D   17:54  0:07 /usr/sbin/httpd
10477  0.6  3.0 29876  7856 ?  S   17:54  0:04 /usr/sbin/httpd
10479  1.0  5.4 39184 14008 ?  S   17:54  0:07 /usr/sbin/httpd
10480  0.0  1.2 26864  3096 ?  S   17:54  0:00 /usr/sbin/httpd
10481  0.1  1.2 29040  3180 ?  S   17:54  0:01 /usr/sbin/httpd
10482  0.8  5.3 39024 13756 ?  S   17:54  0:05 /usr/sbin/httpd
10484  0.1  1.2 29164  3252 ?  S   17:54  0:00 /usr/sbin/httpd
10491  0.4  2.4 29928  6144 ?  S   17:54  0:03 /usr/sbin/httpd
10494  0.1  1.2 29212  3304 ?  S   17:54  0:01 /usr/sbin/httpd
10498  0.5  3.4 30528  8760 ?  S   17:54  0:03 /usr/sbin/httpd
10499  0.2  3.0 29756  7768 ?  S   17:54  0:01 /usr/sbin/httpd
10500  0.6  1.6 30324  4180 ?  S   17:54  0:04 /usr/sbin/httpd
10508  0.1  1.3 29044  3452 ?  S   17:55  0:01 /usr/sbin/httpd
10509  0.1  1.2 28728  3308 ?  S   17:55  0:01 /usr/sbin/httpd
10510  0.0  1.2 28024  3216 ?  S   17:55  0:00 /usr/sbin/httpd
10523  0.6  1.4 29956  3716 ?  S   17:56  0:03 /usr/sbin/httpd
10551  0.7  1.4 29404  3812 ?  S   17:58  0:03 /usr/sbin/httpd

wow

pavel
Re: Aussie is *slow*
Jean-Marc Lasgouttes wrote: I noticed/suspected too that you were browsing trac at the time where the bump happened on the graph. Do you use rss feeds? No, I just browsed it with a plain web browser (konqueror). I used nothing but the standard view, where the changes are colored. Jürgen
Re: Aussie is *slow*
look again on the picture, i made more zoomed view. i would say there is stronger correlation with the green line - especially look on the start of the peaks. Today, I noticed that it started to slow down while I was heavily browing trac. Could this be so evil? this is easy to test out. but wait when load is again down, we are again at peak. pavel
Re: Aussie is *slow*
Pavel Sanda wrote: look again on the picture, i made more zoomed view. i would say there is stronger correlation with the green line - especially look on the start of the peaks. Today, I noticed that it started to slow down while I was heavily browsing trac. Could this be so evil? Jürgen
Re: Aussie is *slow*
I noticed/suspected too that you were browsing trac at the time where the bump happened on the graph. Do you use rss feeds? No, I just browsed it with a plain web browser (konqueror). I used nothing but the standard view, where the changes are colored. BTW WebSVN looks like a good alternative, and it seems easy to install. i would firstly just try to upgrade trac or change httpd settings. pavel
Re: Aussie is *slow*
Jürgen Spitzmüller [EMAIL PROTECTED] writes: Pavel Sanda wrote: look again on the picture, i made more zoomed view. i would say there is stronger correlation with the green line - especially look on the start of the peaks. Today, I noticed that it started to slow down while I was heavily browing trac. Could this be so evil? I noticed/suspected too that you were browsing trac at the time where the bump happened on the graph. Do you use rss feeds? JMarc
Re: Aussie is *slow*
Jean-Marc Lasgouttes [EMAIL PROTECTED] writes: Pavel Sanda [EMAIL PROTECTED] writes: i'm not sure about anything, because i dont have read access to aussie logs. No you for for httpd. Erm. Now you can for httpd. JMarc
Re: Aussie is *slow*
Pavel Sanda [EMAIL PROTECTED] writes: i'm not sure about anything, because i dont have read access to aussie logs. No you for for httpd. I am not sure what to do with them personally. JMarc
Re: Aussie is *slow*
Pavel Sanda [EMAIL PROTECTED] writes: btw i dont think that just restart is the way how to solve it. there must be some httpd config which bound the number of httpd childern. Is it the number of children that causes problems? There is also a spamd process that has grown to +100M virtual memory. look again on the picture, i made more zoomed view. i would say there is stronger correlation with the green line - especially look on the start of the peaks. If the httpd processes begin to use 1.6G of virtual memory, I am not surprised that the load increases... The question is how this happened: many medium-sized httpd processes, or one huge one? JMarc
Re: Aussie is *slow*
watch out, the real fun starts right now. Can you tell me at what time it started? cat ~sanda/log first offense seems to be around 10am today. btw i dont think that just restart is the way how to solve it. there must be some httpd config which bound the number of httpd children. Depends on the worker module in use. Are you sure it's the number of connections that suddenly rises, or is it a child that somehow consumes a lot of the available resources?

i'm not sure about anything, because i don't have read access to aussie logs.

for the load around 20, 15 mins ago (i.e. aussie unusable), there was:

  PID %CPU %MEM   VSZ   RSS TTY STAT START  TIME COMMAND
30277  0.0  1.1 22208  2912 ?   Ss   Feb07  0:06 /usr/sbin/httpd
 7526  0.8  3.1 34120  8176 ?   S    16:53  0:02 /usr/sbin/httpd
 7572  1.4  3.4 29524  8716 ?   D    16:54  0:04 /usr/sbin/httpd
 7648  1.3  2.3 30180  5924 ?   S    16:54  0:03 /usr/sbin/httpd
 7688  4.2  5.9 38052 15268 ?   S    16:55  0:09 /usr/sbin/httpd
 7712  0.9  2.3 29740  6120 ?   D    16:55  0:01 /usr/sbin/httpd
 7721  1.1  2.7 29044  6992 ?   S    16:56  0:01 /usr/sbin/httpd
 7722  0.6  2.0 29272  5204 ?   D    16:56  0:01 /usr/sbin/httpd
 7740  0.7  1.8 29028  4760 ?   S    16:56  0:01 /usr/sbin/httpd
 7754  0.9  5.2 30916 13384 ?   S    16:56  0:01 /usr/sbin/httpd
 7757  1.7  3.4 33368  8848 ?   S    16:57  0:01 /usr/sbin/httpd
 7758  2.0  4.1 33324 10660 ?   S    16:57  0:02 /usr/sbin/httpd
 7767  1.2  2.7 29316  6924 ?   S    16:57  0:01 /usr/sbin/httpd
 7774  0.0  1.5 22704  4068 ?   S    16:57  0:00 /usr/sbin/httpd
 7776  1.6  2.5 29288  6416 ?   S    16:57  0:01 /usr/sbin/httpd
 7793  0.0  0.9 22344  2524 ?   S    16:58  0:00 /usr/sbin/httpd
 7796  0.0  1.2 22344  3276 ?   D    16:58  0:00 /usr/sbin/httpd

for the load 1.5 now there is:

30277  0.0  1.1 22208  2920 ?   Ss   Feb07  0:06 /usr/sbin/httpd
 7857  3.9  7.1 36640 18396 ?   S    16:59  0:12 /usr/sbin/httpd
 8095  1.7  3.4 29164  8904 ?   S    17:04  0:00 /usr/sbin/httpd
 8096  2.2  3.3 28796  8504 ?   S    17:04  0:01 /usr/sbin/httpd

(don't take the load numbers too precisely)

pavel
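To compare snapshots like the two above without eyeballing them, the VSZ and RSS columns can be summed per snapshot. The following is a rough sketch, not something that was run on aussie: the function name `summarize` is hypothetical, and it assumes each row starts with the `PID %CPU %MEM VSZ RSS` columns of `ps` output, with VSZ and RSS in KB. The sample rows are the four httpd processes listed for the load of 1.5.

```python
def summarize(ps_rows):
    """Return (process count, total VSZ in KB, total RSS in KB).

    Assumes each non-empty row begins with: PID %CPU %MEM VSZ RSS ...
    """
    count = total_vsz = total_rss = 0
    for row in ps_rows.strip().splitlines():
        fields = row.split()
        total_vsz += int(fields[3])  # VSZ column
        total_rss += int(fields[4])  # RSS column
        count += 1
    return count, total_vsz, total_rss


# The httpd rows reported in the thread for load ~1.5.
LOW_LOAD = """
30277 0.0 1.1 22208 2920 ? Ss Feb07 0:06 /usr/sbin/httpd
7857 3.9 7.1 36640 18396 ? S 16:59 0:12 /usr/sbin/httpd
8095 1.7 3.4 29164 8904 ? S 17:04 0:00 /usr/sbin/httpd
8096 2.2 3.3 28796 8504 ? S 17:04 0:01 /usr/sbin/httpd
"""

if __name__ == "__main__":
    count, vsz_kb, rss_kb = summarize(LOW_LOAD)
    print(f"{count} httpd processes, VSZ {vsz_kb} KB, RSS {rss_kb} KB")
```

Running the same function over the high-load snapshot would give the totals to compare against; only the first five columns are used, so the remaining columns do not need to be cleanly separated.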
Re: Aussie is *slow*
On Tue, Feb 12, 2008 at 04:27:13PM +0100, Pavel Sanda wrote: watch out, the real fun starts right now. Can you tell me at what time it started? cat ~sanda/log first offense seems to be around 10am today. btw i dont think that just restart is the way how to solve it. there must be some httpd config which bound the number of httpd childern.

Depends on the worker module in use. Are you sure it's the number of connections that suddenly rises, or is it a child that somehow consumes a lot of the available resources?

Sven

FYI ...

# prefork MPM
# StartServers: number of server processes to start
# MinSpareServers: minimum number of server processes which are kept spare
# MaxSpareServers: maximum number of server processes which are kept spare
# MaxClients: maximum number of server processes allowed to start
# MaxRequestsPerChild: maximum number of requests a server process serves
<IfModule mpm_prefork_module>
    StartServers          5
    MinSpareServers       5
    MaxSpareServers      10
    MaxClients          150
    MaxRequestsPerChild   0
</IfModule>

# worker MPM
# StartServers: initial number of server processes to start
# MaxClients: maximum number of simultaneous client connections
# MinSpareThreads: minimum number of worker threads which are kept spare
# MaxSpareThreads: maximum number of worker threads which are kept spare
# ThreadsPerChild: constant number of worker threads in each server process
# MaxRequestsPerChild: maximum number of requests a server process serves
<IfModule mpm_worker_module>
    StartServers          2
    MaxClients          150
    MinSpareThreads      25
    MaxSpareThreads      75
    ThreadsPerChild      25
    MaxRequestsPerChild   0
</IfModule>

--
If God passed a mic to me to speak
I'd say stay in bed, world
Sleep in peace
[The Cardigans - 03:45: No sleep]
Re: Aussie is *slow*
Jean-Marc Lasgouttes [EMAIL PROTECTED] writes: Pavel Sanda [EMAIL PROTECTED] writes: watch out, the real fun starts right now. Can you tell me at what time it started? JMarc
Re: Aussie is *slow*
however the main point was to find correlation between black curve and other curves. now i see we have 3 spamd, so lets wait what will happen when some swap crisis will come again. I am happy to see that it has not happened yet :) watch out, the real fun starts right now. http://195.113.31.123/~sanda/junk/aussie_load.png http://195.113.31.123/~sanda/junk/aussie_load_whole_history.png pavel
Re: Aussie is *slow*
btw i dont think that just restart is the way how to solve it. there must be some httpd config which bound the number of httpd children. Is it the number of children that causes problems? There is also a spamd process that has grown to +100M virtual memory. look again on the picture, i made more zoomed view. i would say there is stronger correlation with the green line - especially look on the start of the peaks. If the httpd begin to use 1.6G of virtual memory, I am not surprised that the load increases... The question is to know how this happened: many medium-sized httpd processes, or one huge one? so, watching the system processes over the last few hours during the slowness phase, it's clear that the number of httpd children stays around 50 for a long time, with ~1.6 GB of virtual memory, and the system has problems getting back into a sane state. I suggest that we find, by some bisecting, the maximum number of children which aussie is able to survive. JMarc, could you try to change the line in httpd.conf MaxClients 48 into e.g. 24 and restart httpd? we can give it some trac-test to see how aussie will manage it. pavel
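The bisecting idea above can be sketched as an ordinary binary search over MaxClients. This is only an illustration: `largest_safe_max_clients` and the `survives` callback are hypothetical names, and in practice `survives(n)` would be a manual step (set MaxClients to n, restart httpd, run the trac stress test, look at the load). The sketch assumes the lower bound is already known to be safe and that if n children are survivable, any smaller number is too.

```python
def largest_safe_max_clients(low, high, survives):
    """Find the largest n in [low, high] for which survives(n) is True.

    Assumptions: survives(low) is True, and survives is monotone
    (once a value fails, every larger value fails as well).
    """
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if survives(mid):
            low = mid        # mid is fine, try larger values
        else:
            high = mid - 1   # mid swaps the box to death, try smaller values
    return low


if __name__ == "__main__":
    # Made-up stand-in for the real stress test: pretend the box copes
    # with up to 14 children. The threshold is purely illustrative.
    pretend_test = lambda n: n <= 14
    print(largest_safe_max_clients(8, 48, pretend_test))
```

Starting from the range 8 to 48, this needs about five stress tests to converge, which matters when each test takes an hour of recovery time.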
Re: Aussie is *slow*
watch out, the real fun starts right now. Can you tell me at what time it started? cat ~sanda/log first offense seems to be around 10am today. btw i dont think that just restart is the way how to solve it. there must be some httpd config which bound the number of httpd childern. pavel
Re: Aussie is *slow*
Pavel Sanda [EMAIL PROTECTED] writes: however the main point was to find correlation between black curve and other curves. now i see we have 3 spamd, so lets wait what will happen when some swap crisis will come again. I am happy to see that it has not happened yet :) watch out, the real fun starts right now. The problem seems to be related to httpd this time, not spamd. I'll let it continue a bit before restarting. JMarc
Re: Aussie is *slow*
Pavel Sanda [EMAIL PROTECTED] writes: btw i dont think that just restart is the way how to solve it. there must be some httpd config which bound the number of httpd childern. Is it the number of children that causes problems? There is also a spamd process that has grown to +100M virtual memory. JMarc
Re: Aussie is *slow*
Pavel Sanda wrote: Should be done now. yep, i'm gonna test it. i used trac to stress-test the system: i sent a lot of various requests to certain trac pages for circa 5 min and waited circa 10 min until all pages showed in the browser. in that time aussie reached 24 httpd processes (you can see the peak around t=2120). Couldn't you at least upgrade some of the components? Trac is at 0.10.4 and we are still using 0.10.2, the changelog seems to say that the fixes are important: http://trac.edgewall.org/wiki/ChangeLog after this peak it took the system approximately an hour to get back to 8 httpd processes and some reasonable load (0.3 at the time i'm writing this). during that hour i watched the apache logs and there was no extraordinary traffic which could be responsible for a load ranging between 10-20. so i conclude this time was needed just for (un)swapping and cleaning up processes, which finally led back to low load and a low number of httpd processes. this is _far_too_high_ and i suggest we should still go down with maxclients; there is no point in allowing 24 apaches when their only work is swapping the whole box to death. i have observed that 8 processes are able to work at some 0.x load, so this is the lower bound, and i would say let's put the upper bound somewhere between 12-15. pavel
Re: Aussie is *slow*
btw i dont think that just restart is the way how to solve it. there must be some httpd config which bound the number of httpd childern. Is it the number of children that causes problems? There is also a spamd process that has grown to +100M virtual memory. look again on the picture, i made more zoomed view. i would say there is stronger correlation with the green line - especially look on the start of the peaks. If the httpd begin to use 1.6G of virtual memory, I am not surprised that the load increases... The question is to know how this happened: many medium-sized httpd processes, or on huge one? look on my reply to Sven. i guess many medium-sized httpd processes is the cause. pavel
Re: Aussie is *slow*
look on my reply to Sven. i guess many medium-sized httpd processes is the cause. now i have caught the peak online Is there a way to know what these httpd children do? either log settings or strace -p (plus some other options)? i guess they just sleep: once a minute they get unswapped, then they get one microsecond of cpu time in which they discover they haven't got their data from disk yet, and they go to sleep again. pavel
Re: Aussie is *slow*
On Wed, 13 Feb 2008, Pavel Sanda wrote: so adjusting some 12 childern wont have effect on normal traffic while could significantly inhibit swapping when somebody starts playing with trac. Regarding Trac, the user base might be larger than you think, as some of the wiki pages embed files from the repository using Trac. So if one or more users frequently look at such wiki pages, it will also indirectly invoke Trac (just to extract the relevant file of course). I'm just mentioning this in case it might be relevant to the problem. (Aussie is very slow at the moment btw) /Christian -- Christian Ridderström, +46-8-768 39 44 http://www.md.kth.se/~chr
Re: Aussie is *slow*
Pavel Sanda wrote: Should be done now. yep, i'm gonna test it. i used trac to make stress test for the system. i send a lot of various request to certain trac pages for cca 5 min and waited cca 10min to have all pages showed in browser. in that time aussie load reached 24 httpd processes (you can see the peak around t=2120). Couldn't you at least upgrade some of the components? Trac is at 0.10.4 and we are still using 0.10.2, the changelog seems to say that the fixes are important: http://trac.edgewall.org/wiki/ChangeLog SpamAssassin is at 3.0.6, the last version is 3.2.3. Abdel.
Re: Aussie is *slow*
> > Should be done now.
>
> yep, i'm gonna test it.

i used trac to stress-test the system: i sent a lot of various requests to certain trac pages for circa 5 min and waited circa 10 min until all pages showed in the browser. in that time aussie reached 24 httpd processes (you can see the peak around t=2120).

after this peak it took the system approximately an hour to get back to 8 httpd processes and some reasonable load (0.3 at the time i'm writing this). during that hour i watched the apache logs and there was no extraordinary traffic which could be responsible for a load ranging between 10-20. so i conclude this time was needed just for (un)swapping and cleaning up processes, which finally led back to low load and a low number of httpd processes.

this is _far_too_high_ and i suggest we should still go down with maxclients; there is no point in allowing 24 apaches when their only work is swapping the whole box to death.

i have observed that 8 processes are able to work at some 0.x load, so this is the lower bound, and i would say let's put the upper bound somewhere between 12-15.

pavel
Re: Aussie is *slow*
> i have observed that 8 processes are able to work at some 0.x load, so this is
> the lower bound, and i would say let's put the upper bound somewhere between
> 12-15.

... more thinking about this...

another justification for the numbers above could be made this way: in the stress test the vss memory of httpd with 24 children reached its maximum somewhere around 900mb. when we take 12 children, their vss memory would be somewhere around 450mb. when you look at the picture with the record of the last seven days and plot a horizontal line at 450mb, you will find that there are only two single measurements (i.e. a time period of at most 10 mins) when "usual traffic" on the server forced apache to fork this number of children.

so setting the limit to some 12 children won't have an effect on normal traffic, while it could significantly inhibit swapping when somebody starts playing with trac.

pavel
Re: Aussie is *slow*
Pavel Sanda <[EMAIL PROTECTED]> writes: > JMarc, could you try to change the line in httpd.conf > MaxClients 48 > > into eg 24 and restart httpd? > > > we can give it some trac-test to see how will aussie manage it. Should be done now. JMarc
Re: Aussie is *slow*
> > Should be done now. yep, i'm gonna test it. pavel
Re: Aussie is *slow*
On Tue, Feb 12, 2008 at 06:22:02PM +0100, Pavel Sanda wrote:
> > > > I noticed/suspected too that you were browsing trac at the time where
> > > > the bump happened on the graph. Do you use rss feeds?
> > >
> > > No, I just browsed it with a plain web browser (konqueror). I used nothing
> > > but the standard view, where the changes are colored.
> >
> > BTW WebSVN looks like a good alternative, and it seems easy to install.
>
> i would firstly just try to upgrade trac or change httpd settings.

The whole system needs some major upgrading. If the kernel is as old as the running Apache... uh no, I'd better stop thinking about it.

Maybe mod_fastcgi could help as well if it's not used already.

Sven

--
If God passed a mic to me to speak
I'd say stay in bed, world
Sleep in peace
[The Cardigans - 03:45: No sleep]
Re: Aussie is *slow*
Jürgen Spitzmüller wrote: > > I noticed/suspected too that you were browsing trac at the time where the > > bump happened on the graph. Do you use rss feeds? > > No, I just browsed it with a plain web browser (konqueror). I used nothing > but the standard view, where the changes are colored. BTW WebSVN looks like a good alternative, and it seems easy to install. Jürgen
Re: Aussie is *slow*
Pavel Sanda <[EMAIL PROTECTED]> writes: >> look on my reply to Sven. i guess many medium-sized httpd processes is the >> cause. > > now i have caught the peak online Is there a way to know what these httpd children do? JMarc
Re: Aussie is *slow*
> look on my reply to Sven. i guess many medium-sized httpd processes is the
> cause.

now i have caught the peak online

load average: 20.40, 22.81, 17.49

30277  0.0  1.1 22208  2836 ?  Ss  Feb07  0:06 /usr/sbin/httpd
10213  0.6  1.2 29972  3124 ?  S   17:47  0:07 /usr/sbin/httpd
10254  0.1  1.2 29376  3116 ?  S   17:47  0:02 /usr/sbin/httpd
10255  0.3  1.9 29772  5052 ?  D   17:47  0:03 /usr/sbin/httpd
10324  0.5  1.5 30312  3936 ?  S   17:49  0:05 /usr/sbin/httpd
10337  0.1  1.2 29192  3124 ?  S   17:49  0:01 /usr/sbin/httpd
10338  0.6  1.6 30400  4204 ?  S   17:49  0:06 /usr/sbin/httpd
10339  0.3  1.3 30340  3512 ?  S   17:49  0:03 /usr/sbin/httpd
10344  0.1  1.2 29044  3144 ?  S   17:49  0:01 /usr/sbin/httpd
10347  0.1  1.1 29180  3028 ?  S   17:49  0:01 /usr/sbin/httpd
10351  0.1  1.2 29188  3144 ?  S   17:49  0:01 /usr/sbin/httpd
10358  0.7  1.3 30672  3484 ?  S   17:50  0:06 /usr/sbin/httpd
10361  0.4  1.4 30184  3652 ?  S   17:50  0:04 /usr/sbin/httpd
10366  0.7  3.6 30396  9448 ?  S   17:50  0:06 /usr/sbin/httpd
10372  0.4  2.7 30344  6976 ?  S   17:50  0:04 /usr/sbin/httpd
10380  0.3  1.3 31956  3532 ?  S   17:51  0:03 /usr/sbin/httpd
10385  1.2  3.9 30692 10164 ?  R   17:51  0:11 /usr/sbin/httpd
10391  0.1  1.1 29392  2992 ?  S   17:51  0:01 /usr/sbin/httpd
10394  0.0  1.1 28008  3004 ?  S   17:51  0:00 /usr/sbin/httpd
10421  0.2  1.2 30112  3188 ?  S   17:52  0:02 /usr/sbin/httpd
10434  0.4  1.5 30252  3928 ?  S   17:52  0:03 /usr/sbin/httpd
10436  0.1  1.2 28876  3220 ?  S   17:52  0:01 /usr/sbin/httpd
10438  0.4  3.3 30016  8628 ?  D   17:52  0:03 /usr/sbin/httpd
10439  0.2  3.2 29064  8204 ?  D   17:52  0:02 /usr/sbin/httpd
10440  0.1  1.1 29304  3064 ?  S   17:52  0:01 /usr/sbin/httpd
10441  0.2  1.3 29692  3576 ?  S   17:52  0:02 /usr/sbin/httpd
10443  0.1  1.2 29712  3188 ?  S   17:52  0:01 /usr/sbin/httpd
10454  0.1  1.3 28680  3536 ?  D   17:53  0:01 /usr/sbin/httpd
10455  0.4  6.4 35940 16416 ?  S   17:53  0:03 /usr/sbin/httpd
10456  0.1  1.1 29316  3060 ?  S   17:53  0:00 /usr/sbin/httpd
10461  0.0  1.2 26864  3092 ?  S   17:53  0:00 /usr/sbin/httpd
10470  0.2  1.2 28880  3192 ?  S   17:53  0:01 /usr/sbin/httpd
10476  1.0  3.8 30548  9800 ?  D   17:54  0:07 /usr/sbin/httpd
10477  0.6  3.0 29876  7856 ?  S   17:54  0:04 /usr/sbin/httpd
10479  1.0  5.4 39184 14008 ?  S   17:54  0:07 /usr/sbin/httpd
10480  0.0  1.2 26864  3096 ?  S   17:54  0:00 /usr/sbin/httpd
10481  0.1  1.2 29040  3180 ?  S   17:54  0:01 /usr/sbin/httpd
10482  0.8  5.3 39024 13756 ?  S   17:54  0:05 /usr/sbin/httpd
10484  0.1  1.2 29164  3252 ?  S   17:54  0:00 /usr/sbin/httpd
10491  0.4  2.4 29928  6144 ?  S   17:54  0:03 /usr/sbin/httpd
10494  0.1  1.2 29212  3304 ?  S   17:54  0:01 /usr/sbin/httpd
10498  0.5  3.4 30528  8760 ?  S   17:54  0:03 /usr/sbin/httpd
10499  0.2  3.0 29756  7768 ?  S   17:54  0:01 /usr/sbin/httpd
10500  0.6  1.6 30324  4180 ?  S   17:54  0:04 /usr/sbin/httpd
10508  0.1  1.3 29044  3452 ?  S   17:55  0:01 /usr/sbin/httpd
10509  0.1  1.2 28728  3308 ?  S   17:55  0:01 /usr/sbin/httpd
10510  0.0  1.2 28024  3216 ?  S   17:55  0:00 /usr/sbin/httpd
10523  0.6  1.4 29956  3716 ?  S   17:56  0:03 /usr/sbin/httpd
10551  0.7  1.4 29404  3812 ?  S   17:58  0:03 /usr/sbin/httpd

wow

pavel
Re: Aussie is *slow*
Jean-Marc Lasgouttes wrote: > I noticed/suspected too that you were browsing trac at the time where the > bump happened on the graph. Do you use rss feeds? No, I just browsed it with a plain web browser (konqueror). I used nothing but the standard view, where the changes are colored. Jürgen
Re: Aussie is *slow*
> > look again on the picture, i made more zoomed view. > > i would say there is stronger correlation with the green line - especially > > look on the start of the peaks. > > Today, I noticed that it started to slow down while I was heavily browing > trac. Could this be so evil? this is easy to test out. but wait when load is again down, we are again at peak. pavel
Re: Aussie is *slow*
Pavel Sanda wrote:
> look again on the picture, i made more zoomed view.
> i would say there is stronger correlation with the green line - especially
> look on the start of the peaks.

Today, I noticed that it started to slow down while I was heavily browsing trac. Could this be so evil?

Jürgen
Re: Aussie is *slow*
> > > I noticed/suspected too that you were browsing trac at the time where the > > > bump happened on the graph. Do you use rss feeds? > > > > No, I just browsed it with a plain web browser (konqueror). I used nothing > > but the standard view, where the changes are colored. > > BTW WebSVN looks like a good alternative, and it seems easy to install. i would firstly just try to upgrade trac or change httpd settings. pavel
Re: Aussie is *slow*
Jürgen Spitzmüller <[EMAIL PROTECTED]> writes: > Pavel Sanda wrote: >> look again on the picture, i made more zoomed view. >> i would say there is stronger correlation with the green line - especially >> look on the start of the peaks. > > Today, I noticed that it started to slow down while I was heavily browing > trac. Could this be so evil? I noticed/suspected too that you were browsing trac at the time where the bump happened on the graph. Do you use rss feeds? JMarc
Re: Aussie is *slow*
Jean-Marc Lasgouttes <[EMAIL PROTECTED]> writes: > Pavel Sanda <[EMAIL PROTECTED]> writes: > >> i'm not sure about anything, because i dont have read access to aussie logs. > > No you for for httpd. Erm. "Now you can for httpd". JMarc
Re: Aussie is *slow*
Pavel Sanda <[EMAIL PROTECTED]> writes: > i'm not sure about anything, because i dont have read access to aussie logs. No you for for httpd. I am not sure what to do with them personally. JMarc
Re: Aussie is *slow*
Pavel Sanda <[EMAIL PROTECTED]> writes:
>> > btw i dont think that just restart is the way how to solve it.
>> > there must be some httpd config which bound the number of httpd childern.
>>
>> Is it the number of children that causes problems? There is also a
>> spamd process that has grown to +100M virtual memory.
>
> look again on the picture, i made more zoomed view.
> i would say there is stronger correlation with the green line - especially
> look on the start of the peaks.

If the httpd processes begin to use 1.6G of virtual memory, I am not surprised that the load increases... The question is how this happened: many medium-sized httpd processes, or one huge one?

JMarc
Re: Aussie is *slow*
> > > >> watch out, the real fun starts right now.
> > >
> > > Can you tell me at what time it started?
> >
> > cat ~sanda/log
> > first offense seems to be around 10am today.
> >
> > btw i dont think that just restart is the way how to solve it.
> > there must be some httpd config which bound the number of httpd children.
>
> Depends on the worker module in use. Are you sure it's the number
> of connections that suddenly rises, or is it a child that somehow consumes
> a lot of the available resources?

i'm not sure about anything, because i don't have read access to aussie logs.

for the load around 20, 15 mins ago (i.e. aussie unusable), there was:

  PID %CPU %MEM   VSZ   RSS TTY STAT START  TIME COMMAND
30277  0.0  1.1 22208  2912 ?   Ss   Feb07  0:06 /usr/sbin/httpd
 7526  0.8  3.1 34120  8176 ?   S    16:53  0:02 /usr/sbin/httpd
 7572  1.4  3.4 29524  8716 ?   D    16:54  0:04 /usr/sbin/httpd
 7648  1.3  2.3 30180  5924 ?   S    16:54  0:03 /usr/sbin/httpd
 7688  4.2  5.9 38052 15268 ?   S    16:55  0:09 /usr/sbin/httpd
 7712  0.9  2.3 29740  6120 ?   D    16:55  0:01 /usr/sbin/httpd
 7721  1.1  2.7 29044  6992 ?   S    16:56  0:01 /usr/sbin/httpd
 7722  0.6  2.0 29272  5204 ?   D    16:56  0:01 /usr/sbin/httpd
 7740  0.7  1.8 29028  4760 ?   S    16:56  0:01 /usr/sbin/httpd
 7754  0.9  5.2 30916 13384 ?   S    16:56  0:01 /usr/sbin/httpd
 7757  1.7  3.4 33368  8848 ?   S    16:57  0:01 /usr/sbin/httpd
 7758  2.0  4.1 33324 10660 ?   S    16:57  0:02 /usr/sbin/httpd
 7767  1.2  2.7 29316  6924 ?   S    16:57  0:01 /usr/sbin/httpd
 7774  0.0  1.5 22704  4068 ?   S    16:57  0:00 /usr/sbin/httpd
 7776  1.6  2.5 29288  6416 ?   S    16:57  0:01 /usr/sbin/httpd
 7793  0.0  0.9 22344  2524 ?   S    16:58  0:00 /usr/sbin/httpd
 7796  0.0  1.2 22344  3276 ?   D    16:58  0:00 /usr/sbin/httpd

for the load 1.5 now there is:

30277  0.0  1.1 22208  2920 ?   Ss   Feb07  0:06 /usr/sbin/httpd
 7857  3.9  7.1 36640 18396 ?   S    16:59  0:12 /usr/sbin/httpd
 8095  1.7  3.4 29164  8904 ?   S    17:04  0:00 /usr/sbin/httpd
 8096  2.2  3.3 28796  8504 ?   S    17:04  0:01 /usr/sbin/httpd

(don't take the load numbers too precisely)

pavel
Re: Aussie is *slow*
On Tue, Feb 12, 2008 at 04:27:13PM +0100, Pavel Sanda wrote:
> > >> watch out, the real fun starts right now.
> >
> > Can you tell me at what time it started?
>
> cat ~sanda/log
> first offense seems to be around 10am today.
>
> btw i dont think that just restart is the way how to solve it.
> there must be some httpd config which bound the number of httpd childern.

Depends on the worker module in use. Are you sure it's the number of connections that suddenly rises, or is it a child that somehow consumes a lot of the available resources?

Sven

FYI ...

# prefork MPM
# StartServers: number of server processes to start
# MinSpareServers: minimum number of server processes which are kept spare
# MaxSpareServers: maximum number of server processes which are kept spare
# MaxClients: maximum number of server processes allowed to start
# MaxRequestsPerChild: maximum number of requests a server process serves
<IfModule mpm_prefork_module>
    StartServers          5
    MinSpareServers       5
    MaxSpareServers      10
    MaxClients          150
    MaxRequestsPerChild   0
</IfModule>

# worker MPM
# StartServers: initial number of server processes to start
# MaxClients: maximum number of simultaneous client connections
# MinSpareThreads: minimum number of worker threads which are kept spare
# MaxSpareThreads: maximum number of worker threads which are kept spare
# ThreadsPerChild: constant number of worker threads in each server process
# MaxRequestsPerChild: maximum number of requests a server process serves
<IfModule mpm_worker_module>
    StartServers          2
    MaxClients          150
    MinSpareThreads      25
    MaxSpareThreads      75
    ThreadsPerChild      25
    MaxRequestsPerChild   0
</IfModule>

--
If God passed a mic to me to speak
I'd say stay in bed, world
Sleep in peace
[The Cardigans - 03:45: No sleep]
Re: Aussie is *slow*
Jean-Marc Lasgouttes <[EMAIL PROTECTED]> writes: > Pavel Sanda <[EMAIL PROTECTED]> writes: >> watch out, the real fun starts right now. Can you tell me at what time it started? JMarc
Re: Aussie is *slow*
> > however the main point was to find correlation between black curve and other > > curves. now i see we have 3 spamd, so lets wait what will happen when > > some swap crisis will come again. > > I am happy to see that it has not happened yet :) watch out, the real fun starts right now. http://195.113.31.123/~sanda/junk/aussie_load.png http://195.113.31.123/~sanda/junk/aussie_load_whole_history.png pavel
Re: Aussie is *slow*
> >> > btw i dont think that just restart is the way how to solve it.
> >> > there must be some httpd config which bound the number of httpd children.
> >>
> >> Is it the number of children that causes problems? There is also a
> >> spamd process that has grown to +100M virtual memory.
> >
> > look again on the picture, i made more zoomed view.
> > i would say there is stronger correlation with the green line - especially
> > look on the start of the peaks.
>
> If the httpd begin to use 1.6G of virtual memory, I am not surprised
> that the load increases... The question is to know how this happened:
> many medium-sized httpd processes, or one huge one?

so, watching the system processes over the last few hours during the slowness phase, it's clear that the number of httpd children stays around 50 for a long time, with ~1.6 GB of virtual memory, and the system has problems getting back into a sane state.

I suggest that we find, by some bisecting, the maximum number of children which aussie is able to survive.

JMarc, could you try to change the line in httpd.conf

MaxClients 48

into e.g. 24 and restart httpd?

we can give it some trac-test to see how aussie will manage it.

pavel
Re: Aussie is *slow*
> >> watch out, the real fun starts right now. > > Can you tell me at what time it started? cat ~sanda/log first offense seems to be around 10am today. btw i dont think that just restart is the way how to solve it. there must be some httpd config which bound the number of httpd childern. pavel
Re: Aussie is *slow*
Pavel Sanda <[EMAIL PROTECTED]> writes: >> > however the main point was to find correlation between black curve and >> > other >> > curves. now i see we have 3 spamd, so lets wait what will happen when >> > some swap crisis will come again. >> >> I am happy to see that it has not happened yet :) > > watch out, the real fun starts right now. The problem seems to be related to httpd this time, not spamd. I'll let it continue a bit before restarting. JMarc