Re: [Savannah-hackers-public] excessively high load avg, again (daily?)

2011-12-09 Thread Karl Berry
This high load may be due to the nightly rsync you see below:

Looks more like a runaway python to me.

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1927 www-data  39  19 1090m 740m 3596 S    0 12.1 425:43.54 python
 2691 nobody    20   0  109m  58m  33m S    0  1.0   0:02.24 git
10840 root      20   0  107m  48m 1556 D    1  0.8   1:19.83 rsync

(BTW, not to be teaching my betters, but I suggest top -c to show
command lines and not just names.)

But anyway ... Michael, what is the rsync job doing?  It is easy to make
it consume fewer resources in exchange for more time, with --bwlimit=100
(Kbytes/sec) or whatever number turns out to be good.  It should
probably also be running with nice -19.
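
Concretely, that could look something like this (the actual backup
command isn't quoted here, so the source and destination paths below
are only placeholders):

    # run the nightly backup at lowest CPU priority, capped at ~100 KB/s
    nice -n 19 rsync -a --bwlimit=100 /srv/data/ backup-host:/srv/backups/data/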

Best,
Karl



Re: [Savannah-hackers-public] excessively high load avg, again (daily?)

2011-12-09 Thread Michael J. Flickinger

On 12/09/2011 07:49 PM, Karl Berry wrote:

 This high load may be due to the nightly rsync you see below:

Looks more like a runaway python to me.

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1927 www-data  39  19 1090m 740m 3596 S    0 12.1 425:43.54 python
 2691 nobody    20   0  109m  58m  33m S    0  1.0   0:02.24 git
10840 root      20   0  107m  48m 1556 D    1  0.8   1:19.83 rsync

(BTW, not to be teaching my betters, but I suggest top -c to show
command lines and not just names.)

But anyway ... Michael, what is the rsync job doing?  It is easy to make
it consume fewer resources in exchange for more time, with --bwlimit=100
(Kbytes/sec) or whatever number turns out to be good.  It should
probably also be running with nice -19.


I nice'd rsync to 19.  The python process, loggerhead, is already nice'd 
to 19 as well.
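
For reference, renicing an already-running process and checking the
result looks like this (1927 is the loggerhead PID from the top output
above; substitute whatever the current PID is):

    renice -n 19 -p 1927          # set the niceness of the running process to 19
    ps -o pid,ni,comm -p 1927     # confirm the NI column now reads 19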





Re: [Savannah-hackers-public] excessively high load avg, again (daily?)

2011-12-08 Thread Ward Vandewege
On Thu, Dec 08, 2011 at 01:18:13PM +0100, Jim Meyering wrote:
 This high load may be due to the nightly rsync you see below:

The rsync job (nightly backups) is likely the cause of the high IO wait (22%
in your top output).
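
For anyone digging into this later, one quick way to see how much of
the load is IO wait rather than CPU is iostat (part of the sysstat
package):

    # extended per-device stats every 5 seconds, 3 samples;
    # high %iowait and %util during the backup window point at the rsync job
    iostat -x 5 3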

Still, I don't think that alone would account for all of the high load
average.

Thanks,
Ward.

-- 
Ward Vandewege | CTO, Free Software Foundation
GPG Key: 25F774AB | http://identi.ca/cure | http://fsf.org/blogs/RSS

Do you use free software? Donate to join the FSF and support freedom at
 http://www.fsf.org/register_form?referrer=859



Re: [Savannah-hackers-public] excessively high load avg, again (daily?)

2011-12-08 Thread Michael J. Flickinger

On 12/8/11 7:18 AM, Jim Meyering wrote:

Since there were some git-daemon processes dating back to November
and since we have a limit on those (at least I think that's what
Michael said), I've just killed those November git-daemon processes.


I just added a timeout for them, just like we do for bzr.
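
For the archives, the knobs involved are git-daemon's own timeout
options; the real init script isn't quoted here, so the base path below
is only a placeholder:

    # kill a client sub-request that takes longer than 300 seconds, and
    # drop connections that say nothing for 60 seconds after connecting
    git daemon --export-all --base-path=/srv/git \
               --timeout=300 --init-timeout=60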


Perhaps related, there are lots of these in dmesg.
We get from 1 to ~6 per hour:

[2572324.224898] cgit.cgi[29777]: segfault at 863d000 ip b768c810 sp bfe9da2c 
error 6 in libc-2.11.2.so[b7619000+14]



I'll investigate this further later today... In the near future I'll 
upgrade the version of git we're using, along with upgrading cgit.cgi.


--
Michael J. Flickinger



Re: [Savannah-hackers-public] excessively high load avg, again (daily?)

2011-12-08 Thread Jim Meyering
Michael J. Flickinger wrote:
 On 12/8/11 7:18 AM, Jim Meyering wrote:
 Since there were some git-daemon processes dating back to November
 and since we have a limit on those (at least I think that's what
 Michael said), I've just killed those November git-daemon processes.

 I just added a timeout for them, just like we do for bzr.

 Perhaps related, there are lots of these in dmesg.
 We get from 1 to ~6 per hour:

 [2572324.224898] cgit.cgi[29777]: segfault at 863d000 ip b768c810 sp
 bfe9da2c error 6 in libc-2.11.2.so[b7619000+14]

 I'll investigate this further later today... In the near future I'll
 upgrade the version of git we're using, along with upgrading cgit.cgi.

Great.  Thanks for pursuing it.



Re: [Savannah-hackers-public] excessively high load avg, again (daily?)

2011-12-08 Thread Michael J. Flickinger

On 12/8/11 3:30 PM, Jim Meyering wrote:

Michael J. Flickinger wrote:

On 12/8/11 7:18 AM, Jim Meyering wrote:

Since there were some git-daemon processes dating back to November
and since we have a limit on those (at least I think that's what
Michael said), I've just killed those November git-daemon processes.


I just added a timeout for them, just like we do for bzr.


Perhaps related, there are lots of these in dmesg.
We get from 1 to ~6 per hour:

[2572324.224898] cgit.cgi[29777]: segfault at 863d000 ip b768c810 sp
bfe9da2c error 6 in libc-2.11.2.so[b7619000+14]


I'll investigate this further later today... In the near future I'll
upgrade the version of git we're using, along with upgrading cgit.cgi.


Great.  Thanks for pursuing it.


Upgraded git:
git version 1.7.2.5

Upgraded to latest version of cgit:
0.9.0.2

I'm not really sure I like the idea of CGI programs written in C.  Hopefully 
the segfaults will stop.  If they don't, I'll debug it when I have some time...
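
If it does come to debugging, one low-effort approach is to let
cgit.cgi dump core and read the crashing stack with gdb; the binary
path and core file name below are placeholders, not the actual layout
on this box:

    ulimit -c unlimited   # allow core dumps (must be in effect for the web server's environment)
    # after the next segfault, print the backtrace from the binary plus its core file
    gdb -batch -ex bt /usr/lib/cgit/cgit.cgi core.29777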


--
Michael J. Flickinger