Re: [Server-devel] Ejabberd CPU/RAM Spike -> Crashes

2009-12-28 Thread Martin Langhoff
On Mon, Dec 21, 2009 at 3:36 PM, Martin Langhoff wrote: > On Mon, Dec 21, 2009 at 3:32 PM, Martin Langhoff > wrote: >> I've added a big lock around the process, so from now on Moodle >> processes won't overlap in this sync. This means that your server is >> now running a lightly patched Moodle --

Re: [Server-devel] Ejabberd CPU/RAM Spike -> Crashes

2009-12-21 Thread Martin Langhoff
On Mon, Dec 21, 2009 at 4:18 PM, crodas wrote: > According to the ticket the solution is a locking file which prevents the > re-execution. Well, I wrote a dumb script a while ago that might help, > it's not innovative, but it might help: Thanks! The lock I coded is using a moodle-specific bit of

Re: [Server-devel] Ejabberd CPU/RAM Spike -> Crashes

2009-12-21 Thread crodas
Hello,

According to the ticket the solution is a locking file which prevents the re-execution. Well, I wrote a dumb script a while ago that might help; it's not innovative, but it might help:

#!/bin/bash
LOCK=/tmp/erlang.lock
CMD=$1
if [ -f $LOCK ]
then
    PID=`cat $LOCK`
    UP=`ps $PID | wc
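For readers who want a complete version of the lock-file idea, here is a minimal sketch. It is not crodas's original script (which is cut off in this archive) and not the Moodle-side lock Martin coded. The function name run_locked and the lock path in the usage note are assumptions made for illustration only.

```shell
#!/bin/bash
# Hypothetical helper (not from the thread): run a command only when no
# live process already holds the given lock file.
run_locked() {
    local lock=$1
    shift
    if [ -f "$lock" ]; then
        local pid
        pid=$(cat "$lock")
        # kill -0 delivers no signal; it only reports whether the PID exists
        # and can be signalled by us (so it suits same-user processes).
        if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
            echo "already running as PID $pid" >&2
            return 1
        fi
        # Stale lock left behind by a process that has died: discard it.
        rm -f "$lock"
    fi
    # Record our own PID, run the command, then release the lock.
    echo $$ > "$lock"
    "$@"
    local status=$?
    rm -f "$lock"
    return $status
}
```

Usage would look like `run_locked /tmp/example.lock some_command arg1`, where both the path and the command are placeholders. The check and the write are not atomic, so two callers starting at the same instant could both pass the test; where that matters, something like flock(1) or bash's noclobber option would be a safer basis.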

Re: [Server-devel] Ejabberd CPU/RAM Spike -> Crashes

2009-12-21 Thread Devon Connolly
Ok then. Thanks a lot for the assistance. Things seem to be back to normal. I will look closer tomorrow when the kids are here.

Re: [Server-devel] Ejabberd CPU/RAM Spike -> Crashes

2009-12-21 Thread Martin Langhoff
On Mon, Dec 21, 2009 at 3:32 PM, Martin Langhoff wrote: > I've added a big lock around the process, so from now on Moodle > processes won't overlap in this sync. This means that your server is > now running a lightly patched Moodle -- I will release this as a new > rpm soon. Filed as http://dev.l

Re: [Server-devel] Ejabberd CPU/RAM Spike -> Crashes

2009-12-21 Thread Martin Langhoff
On Mon, Dec 21, 2009 at 3:14 PM, Martin Langhoff wrote: > Now it's up on a pristine state, and I am monitoring it... Ok - the problem seems related to Moodle's control of ejabberd presence service. The sync between Moodle and ejabberd data (in mnesia) was taking too long, and a second Moodle sync

Re: [Server-devel] Ejabberd CPU/RAM Spike -> Crashes

2009-12-21 Thread Martin Langhoff
On Sun, Dec 20, 2009 at 12:57 PM, Martin Langhoff wrote: > Yep, I am interested in getting to the bottom of this. I think I have an initial assessment of the situation. Clearly, the mnesia DB got corrupted somehow. Because of that... - the init script cannot stop ejabberd normally... - k

Re: [Server-devel] Ejabberd CPU/RAM Spike -> Crashes

2009-12-20 Thread Martin Langhoff
On Sat, Dec 19, 2009 at 7:32 PM, Devon Connolly wrote:
>
>>  - Is there any disk anomaly? (Reboot forcing a fsck?)
>
> Not that I've noticed.

Ok, but can you try doing a reboot that forces fsck? As follows:

  touch /forcefsck
  reboot

or

  shutdown -Fr now

> Verify checked out on the ejabberd-x

Re: [Server-devel] Ejabberd CPU/RAM Spike -> Crashes

2009-12-19 Thread Devon Connolly
> - Is there any disk anomaly? (Reboot forcing a fsck?) Not that I've noticed. > > - Is there any problem in the binaries? If you run rpm with the > 'verify' options, it'll check that no binaries have been corrupted > on-disk... It's normal to see some config files changed, but no > binaries s

Re: [Server-devel] Ejabberd CPU/RAM Spike -> Crashes

2009-12-19 Thread Martin Langhoff
On Sat, Dec 19, 2009 at 1:31 PM, Devon Connolly wrote: > Changing the domain, I still get the following error when it tries (and > fails) to shut down ejabberd. As it doesn't stop cleanly, shut down ejabberd by hand, kill -9 it if needed, and then change the domain twice to clear the DB. Then star

Re: [Server-devel] Ejabberd CPU/RAM Spike -> Crashes

2009-12-19 Thread Martin Langhoff
On Sat, Dec 19, 2009 at 1:31 PM, Devon Connolly wrote: > Beam is still consuming 100% of the cpu after a few minutes.  I'm going to > leave that script running to see what it does over the next few hours. That's really abnormal. - Is there any disk anomaly? (Reboot forcing a fsck?) - Is there

Re: [Server-devel] Ejabberd CPU/RAM Spike -> Crashes

2009-12-19 Thread Devon Connolly
Changing the domain, I still get the following error when it tries (and fails) to shut down ejabberd. ___ Crash dump was written to: erl_crash.dump Kernel pid terminated (application_controller) ({application_start_failure,kern

[Server-devel] Ejabberd CPU/RAM Spike -> Crashes

2009-12-19 Thread Devon Connolly
Here is another example after it has been running all night. http://pastebin.com/m11537281 As you can see, these runaway beam processes vary greatly in their RAM usage. Also, they are always using 100% of the CPU. I will try to clear the DB now and see what happens. On Fri, Dec 18, 2009 at 1

Re: [Server-devel] Ejabberd CPU/RAM Spike -> Crashes

2009-12-18 Thread Martin Langhoff
On Fri, Dec 18, 2009 at 1:37 PM, Devon Connolly wrote: > Anyway, back on topic...  Here is that script slightly modified running on > a fresh boot.  I'm going to leave this looping and post the file to > pastebin.  Here is an initial output after only like 10 minutes.  It will > get more interesti

Re: [Server-devel] Ejabberd CPU/RAM Spike -> Crashes

2009-12-18 Thread Devon Connolly
> Don't reinstall. If possible, let's try to debug this. If you're going > to give up, just > > 1 - Backup /var/lib/ejabberd -- just tar it up > 2 - Use the 'domain_config' script to change the domain -- this will > re-generate the ejabberd mnesia database. What I'd do: change it to > 'foo.com' an

Re: [Server-devel] Ejabberd CPU/RAM Spike -> Crashes

2009-12-17 Thread Martin Langhoff
On Thu, Dec 17, 2009 at 9:32 PM, Devon Connolly wrote: > The server had an uptime of about 50 days before this occurred.  There were > no problems and nothing has changed in the 2 or so days since this problem > began.  Like I had said previously, it seems to have occurred since reflashing > and re-

Re: [Server-devel] Ejabberd CPU/RAM Spike -> Crashes

2009-12-17 Thread Devon Connolly
The server had an uptime of about 50 days before this occurred. There were no problems and nothing has changed in the 2 or so days since this problem began. Like I had said previously, it seems to have occurred since reflashing and re-registering a student's XO, but I believe that to be a coinciden

Re: [Server-devel] Ejabberd CPU/RAM Spike -> Crashes

2009-12-17 Thread Martin Langhoff
On Thu, Dec 17, 2009 at 1:12 PM, Martin Langhoff wrote: > On Thu, Dec 17, 2009 at 11:35 AM, Devon Connolly wrote: >> XS Version: 0.6 >> 1 GB Physical Ram, 2GB Swap > > Ok - the RAM is on the low side for an XS but should handle 150 ok. > >> # ejabberdctl connected-users > ... > I counted 12 lines

Re: [Server-devel] Ejabberd CPU/RAM Spike -> Crashes

2009-12-17 Thread Martin Langhoff
On Thu, Dec 17, 2009 at 11:35 AM, Devon Connolly wrote: > XS Version: 0.6 > 1 GB Physical Ram, 2GB Swap Ok - the RAM is on the low side for an XS but should handle 150 ok. > # ejabberdctl connected-users ... I counted 12 lines in the output of connected-users. That should not cause trouble. > A

Re: [Server-devel] Ejabberd CPU/RAM Spike -> Crashes

2009-12-17 Thread Devon Connolly
XS Version: 0.6 1 GB Physical Ram, 2GB Swap 154 XO's Registered, Any number connected when the problem happens, 0-XX The XS is controlling dhcp but nothing out of the ordinary as far as leases are concerned. No Active Antenna # /home/idmgr/list_registration http://pastebin.com/m762076bb # ejabb

Re: [Server-devel] Ejabberd CPU/RAM Spike -> Crashes

2009-12-16 Thread Martin Langhoff
Hi Devon, Sure we can debug this. Lots of questions for you - version of XS? - How much physical RAM? - Number of XOs registered, and in use on the network when the problem happens - Output of the commands suggested in http://wiki.laptop.org/go/XS_Techniques_and_Configuration#Presence_Serv

[Server-devel] Ejabberd CPU/RAM Spike -> Crashes

2009-12-16 Thread Devon Connolly
I'm having some issues with ejabberd after re-flashing and re-registering a student's XO. No other changes were made to the server; however, the beam process has begun to constantly use 100% CPU while the RAM usage swells to over 1GB and then proceeds to eat the 2GB swap. This continues until the