Re: [asterisk-users] Crash Hard, Crash Often

2009-02-06 Thread Josiah Bryan
Paul Chambers wrote:
 Josiah Bryan wrote:
 snip
 Problem is that it's crashing for seemingly no reason at all, no errors 
 on the console, no logs (that I can find), nothing in /var/log/messages 
 - it's puzzling! Management is screaming like banshees, calls are 
 dropping like flies, and all hell is about to break loose if I can't 
 stop asterisk from crashing every couple of hours, taking down any 
 Zaptel calls with it.
 /snip

snip
 That description reminds me of a problem I ran into a while back. One 
 fan had quietly failed, and the temperature would slowly creep up inside 
 the box until things started 'acting funny' and the box would lock up 
 soon after. It'd run fine for 3-4 hours, then just keel over and die. 
 The logs didn't show anything consistent just before the event.

The weird thing is that it's *just* the asterisk process that dies - the 
rest of the system stays solidly up...

snip
 Do you have another PC you can swap the drive and cards into, to try to 
 rule out hardware instability? could you run lm_sensors? (along with one 
 of the logging/alarm packages that support it).

Well, Paul, it looks like that was indeed the problem (hardware 
instability). I came into the office last night after everyone left in 
order to swap out the RAM in the server. Lo and behold, I didn't have 
any of that type of RAM around (RIMMs?), so I had to do an emergency 
hard drive and PCI card transplant to a similar chassis.

After a bit of tweaking to get ALSA to work right and the NIC to play 
nice in the new chassis, asterisk came online and worked beautifully. 
(And, shockingly enough, the zaptel cards just *worked* - no tweaking 
needed!)

So far, no crashes today. By this time it would normally have crashed 
two or three times already.

So, we'll see how she runs - if I were a betting man, I'd say that 
something in that old chassis was going out - probably the RAM as stated 
before, but I'm not sure.

As for whether the power supply was good, I believe it was, but I didn't 
check. The server was a re-purposed high-end CAD workstation - the 
dismal RAM and CPU belie the solid construction of the chassis and the 
quality of the workmanship in the way the server was put together.

Now that I've waxed weird, I'll just say the hardware seems to have been 
the problem and I'll keep an eye on it. This may yet have saved me from 
converting over to the callweaver fork - we'll see. :-)

Cheers!
-josiah

-- 
Josiah Bryan
IT Manager
Productive Concepts, Inc.
jbr...@productiveconcepts.com
(765) 964-6009, ext. 224


___
-- Bandwidth and Colocation Provided by http://www.api-digital.com --

asterisk-users mailing list
To UNSUBSCRIBE or update options visit:
   http://lists.digium.com/mailman/listinfo/asterisk-users


Re: [asterisk-users] Crash Hard, Crash Often

2009-02-05 Thread Danny Nicholas
I've been running 1.4.21.2 on SUSE 11.0 for about 4 months.  In my
experience, the fewer database interfaces you can use, the more stable it
will be.

-Original Message-
From: asterisk-users-boun...@lists.digium.com
[mailto:asterisk-users-boun...@lists.digium.com] On Behalf Of Josiah Bryan
Sent: Thursday, February 05, 2009 8:57 AM
To: Asterisk Users Mailing List - Non-Commercial Discussion
Subject: [asterisk-users] Crash Hard, Crash Often

I've been using asterisk for 3+ years now, I love it, but it doesn't love 
me back. :-)

It was crashing frequently and seemingly randomly prior to this latest 
upgrade. Not sure what version it was running prior to upgrade (it was 
probably an old CVS HEAD from 2+ years ago.) Anyway, currently running 
1.4.21.2.

== Problem ==

Problem is that it's crashing for seemingly no reason at all, no errors 
on the console, no logs (that I can find), nothing in /var/log/messages 
- it's puzzling! Management is screaming like banshees, calls are 
dropping like flies, and all hell is about to break loose if I can't 
stop asterisk from crashing every couple of hours, taking down any 
Zaptel calls with it.

I've been thinking of switching over to CallWeaver, but I haven't got 
another Zaptel card to plug in to my testing box, so I'd like to just get 
Asterisk stabilized right now - but I'm at a loss as to where to even start.

== System Details ==

Running FC3, 2.6.9-1.667 kernel, 32 bit, with 256 MB RAM and a 20 GB hard 
drive. I've got two 4-port FXO cards in PCI slots.

lspci reports:
02:08.0 Communication controller: Tiger Jet Network Inc. Tiger3XX Modem/ISDN interface
02:0a.0 Communication controller: Tiger Jet Network Inc. Tiger3XX Modem/ISDN interface


[r...@asterisk ~]# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 1
model name      : Intel(R) Pentium(R) 4 CPU 1.50GHz
stepping        : 2
cpu MHz         : 1483.679
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
bogomips        : 2924.54


==

Thanks for any help or advice anyone may have. Cheers!
-josiah


Re: [asterisk-users] Crash Hard, Crash Often

2009-02-05 Thread Doug Lytle
Josiah Bryan wrote:
 I've been using asterisk for 3+ years now, I love it, but it doesn't love 
 me back. :-)

   

The first place I usually start is with memtest86

Doug



-- 
 
Ben Franklin quote:

Those who would give up Essential Liberty to purchase a little Temporary 
Safety, deserve neither Liberty nor Safety.




Re: [asterisk-users] Crash Hard, Crash Often

2009-02-05 Thread David Gibbons
snip
Problem is that it's crashing for seemingly no reason at all, no errors
on the console, no logs (that I can find), nothing in /var/log/messages
- it's puzzling! Management is screaming like banshees, calls are
dropping like flies, and all hell is about to break loose if I can't
stop asterisk from crashing every couple of hours, taking down any
Zaptel calls with it.
/snip

I am assuming you have debug turned on so that you can see what's going on when 
it crashes? If not, open the * console (asterisk -r) and type (core set verbose 
100) and (core set debug 100). Then leave the console open so you can see if * 
was doing anything special when it crashed.

--Dave



Re: [asterisk-users] Crash Hard, Crash Often

2009-02-05 Thread Roderick A. Anderson
Doug Lytle wrote:
 Josiah Bryan wrote:
 I've been using asterisk for 3+ years now, I love it, but it doesn't love 
 me back. :-)

   
 
 The first place I usually start is with memtest86

Hear, hear!

Every time I have had problems with a system (not just Asterisk) 
crashing and there is nothing in the logs, it turns out to be hardware.

One slight exception was where a UPS would brown out every so often, 
causing the system to go out to lunch.  Even though there were three 
power supplies in that system, someone (not me) had _forgotten_ to put them 
on separate UPSes ... they were all on one.

Actually, that is hardware, just not in the system case.

So check your hardware.


Rod


Re: [asterisk-users] Crash Hard, Crash Often

2009-02-05 Thread Josiah Bryan

I must admit, I've suspected hardware as well - the individual FXO 
chips on the two TDM400P cards have slowly gone dead until I only have four 
working FXO chips between 8 slots - that's fine, since I only have four 
POTS lines right now, but still a bit annoying. They are 2-3 years 
old, so I guess it's just their time.

How would I go about pinpointing / diagnosing the hardware fault? Not 
sure exactly what to do with memtest86 - any pointers?

Thanks!
-josiah




Re: [asterisk-users] Crash Hard, Crash Often

2009-02-05 Thread Josiah Bryan
It *is* doing MySQL CDR and a whole host of custom AGI scripts. AGI to 
munge the CID, AGI to handle receptionist routing/selections, AGI for 
voicemail (not using the built-in vm app) - all the AGI scripts make MySQL 
connections.

Would the CDR connection be a problem?

-josiah

Danny Nicholas wrote:
 I've been running 1.4.21.2 on SUSE 11.0 for about 4 months.  In my
 experience, the fewer database interfaces you can use, the more stable it
 will be.
 


Re: [asterisk-users] Crash Hard, Crash Often

2009-02-05 Thread Josiah Bryan
David Gibbons wrote:
 snip
 Problem is that it's crashing for seemingly no reason at all, no errors
 on the console, no logs (that I can find), nothing in /var/log/messages
 - it's puzzling! Management is screaming like banshees, calls are
 dropping like flies, and all hell is about to break loose if I can't
 stop asterisk from crashing every couple of hours, taking down any
 Zaptel calls with it.
 /snip
 
 I am assuming you have debug turned on so that you can see what's going on 
 when it crashes? If not, open the * console (asterisk -r) and type (core set 
 verbose 100) and (core set debug 100). Then leave the console open so you can 
 see if * was doing anything special when it crashed.
 

I've run with verbose quite high lately, but haven't left debug on. Well, 
I just opened the console and turned debug on to 100, so we'll wait and see 
what it shows next time it crashes. It's due for another any time now...

-josiah



Re: [asterisk-users] Crash Hard, Crash Often

2009-02-05 Thread Danny Nicholas
Could be.  Mine works better using the CSV CDR.  MySQL isn't the stoutest
thing out there, and if you're processing the kind of volume other posters
here are, it would wig out.

-Original Message-
From: asterisk-users-boun...@lists.digium.com
[mailto:asterisk-users-boun...@lists.digium.com] On Behalf Of Josiah Bryan
Sent: Thursday, February 05, 2009 9:34 AM
To: Asterisk Users Mailing List - Non-Commercial Discussion
Subject: Re: [asterisk-users] Crash Hard, Crash Often

It *is* doing MySQL CDR and a whole host of custom AGI scripts. AGI to 
munge the CID, AGI to handle receptionist routing/selections, AGI for 
voicemail (not using the built-in vm app) - all the AGI scripts make MySQL 
connections.

Would the CDR connection be a problem?

-josiah



Re: [asterisk-users] Crash Hard, Crash Often

2009-02-05 Thread Russell Bryant

On Feb 5, 2009, at 9:32 AM, Josiah Bryan wrote:

 I've run with verbose quite high lately, but haven't left debug on. Well,
 I just opened the console and turned debug on to 100, so we'll wait and see
 what it shows next time it crashes. It's due for another any time now...


If it's crashing, the first thing I would do is upgrade from 1.4.21.2
to the latest version, which is 1.4.23.1. Quite a number of issues
have been fixed since the version you're using.

If you're still having a problem with 1.4.23, then try to get a
backtrace of the crash. First, build Asterisk without optimizations
enabled by running "make menuselect" and turning on the
DONT_OPTIMIZE flag in the Compiler Flags section. Then, if you
start Asterisk with -g, it will generate a core dump on a crash.
Finally, use gdb to get a backtrace.

$ gdb asterisk core.12345
(gdb) bt
(gdb) bt full

Then, post this information on http://bugs.digium.com/ and we'll help
you resolve the issue.
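
The interactive gdb session above can also be run unattended, which helps when a core file turns up overnight. A minimal sketch, assuming gdb's -batch and -ex options; the binary and core paths are placeholders, and build_gdb_command / collect_backtrace are hypothetical helper names, not part of Asterisk:

```python
import shlex
import subprocess


def build_gdb_command(binary, core):
    """Build a gdb command line that prints 'bt' and 'bt full', then exits."""
    return ["gdb", "-batch", "-ex", "bt", "-ex", "bt full", binary, core]


def collect_backtrace(binary, core, out_path):
    """Run gdb non-interactively and save its output for a bug report."""
    result = subprocess.run(
        build_gdb_command(binary, core),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    with open(out_path, "w") as fh:
        fh.write(result.stdout)
    return result.returncode


if __name__ == "__main__":
    # Placeholder paths: substitute the real binary and core file.
    cmd = build_gdb_command("/usr/sbin/asterisk", "core.12345")
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Running the module only prints the command line; calling collect_backtrace() with real paths writes the backtrace to a file that can be attached to the report.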

Thanks,

--
Russell Bryant
Digium, Inc. | Senior Software Engineer, Open Source Team Lead
445 Jan Davis Drive NW - Huntsville, AL 35806 - USA
Check us out at: www.digium.com  www.asterisk.org







Re: [asterisk-users] Crash Hard, Crash Often

2009-02-05 Thread Josiah Bryan

Alright, latest console output right before latest crash shows:

   == Parsing '/etc/asterisk/manager.conf': Found
   == Manager 'script' logged on from 10.10.9.5
 -- Executing [...@playground:1] AGI(Local/9...@playground-604a,2, paging-hack.pl) in new stack
 -- Launched AGI Script /var/lib/asterisk/agi-bin/paging-hack.pl
 Channel Local/9...@playground-604a,1 was answered.
   == Manager 'script' logged off from 10.10.9.5
 -- Executing [...@playground:1] Answer(Local/9...@playground-604a,1, ) in new stack
 -- Executing [...@playground:2] PlayTones(Local/9...@playground-604a,1, 750+440+1030+3000+5000+15000) in new stack
 -- Executing [...@playground:3] Wait(Local/9...@playground-604a,1, 2) in new stack
   == Parsing '/etc/asterisk/manager.conf': Found
   == Manager 'script' logged on from 127.0.0.1
 -- Executing [...@paging:1] Playback(Local/9...@paging-7883,2, beep) in new stack
 Channel Local/9...@paging-7883,1 was answered.
 -- Executing [...@playground:1] MeetMe(Local/9...@paging-7883,1, 951|qaA) in new stack
   == Manager 'script' logged off from 127.0.0.1
 -- AGI Script paging-hack.pl completed, returning 0
 -- Executing [...@playground:2] Goto(Local/9...@playground-604a,2, conferences|951|1) in new stack
 -- Goto (conferences,951,1)
 -- Executing [...@conferences:1] MeetMe(Local/9...@playground-604a,2, 951|qaA) in new stack
   == Parsing '/etc/asterisk/meetme.conf': Found
   == Parsing '/etc/asterisk/meetme.conf': Found
 -- Created MeetMe conference 1023 for conference '951'
 -- Local/9...@paging-7883,2 Playing 'beep' (language 'en')
[Feb  5 11:29:03] WARNING[24824]: file.c:1204 waitstream_core: Unexpected control subclass '-1'
 -- Executing [...@paging:2] Dial(Local/9...@paging-7883,2, Console/dsp) in new stack
   Call placed to 'dsp' on console 
   Auto-answered 
 -- Called dsp
 -- ALSA/default answered Local/9...@paging-7883,2
asterisk*CLI
Disconnected from Asterisk server
Executing last minute cleanups
Asterisk cleanly ending (0).
[r...@asterisk ~]#

I know that all looks a bit weird, but it's related to this problem I had 
last September:

http://lists.digium.com/pipermail/asterisk-users/2008-September/217822.html

My extensions.conf has the following notes:
; PAGING HACK
; AGI script: paging-hack.pl is called when user dials 249
; The script puts the user in 951, then calls the Console into
; 951, and starts a fork monitoring the user's leg of the call -
; as soon as the user hangs up, the fork automatically
; hangs up the Console.
; ? ? WHY ? ??
; Well, simple, as of version 1.4.21.2 of asterisk,
; when a user dialed 249 and got the Console directly,
; after the user hung up, ringing tone was heard over
; the console until I manually typed 'hangup' in the
; asterisk console - even then, asterisk said 'no calls to hangup'
; The mailing list was no help, so I wrote paging-hack.pl as a,
; well, a hack to get it to a point where paging still worked.
exten = 951,1,MeetMe(951|qaA)

So, 249 does AGI(paging-hack.pl), and from there, the user and the 
Console are dragged into a MeetMe conference for the user to speak 
his/her page. (The script doesn't do the hangup on the console actually 
- it just leaves the console active for the next page.)
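
The source of paging-hack.pl isn't shown here, but the "Manager 'script' logged on" lines in the console output suggest it talks to the manager interface (AMI). As a rough sketch of what such a request looks like on the wire, assuming the standard AMI Originate action: the channel, context and extension values are guesses taken from the log above, and ami_action is a hypothetical helper, not the real script.

```python
def ami_action(action, **headers):
    """Format one AMI request: 'Key: Value' lines, CRLF-terminated, blank line at end."""
    lines = ["Action: %s" % action]
    lines += ["%s: %s" % (key, value) for key, value in headers.items()]
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    # Assumed values: ring the console and drop it into conference 951.
    message = ami_action(
        "Originate",
        Channel="Console/dsp",
        Context="conferences",
        Exten="951",
        Priority="1",
    )
    print(message.replace("\r\n", "\n"))
```

A real client would first open a TCP connection to the manager port and send a Login action with the credentials from manager.conf before any Originate.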

So, anyway, that's the output right before the last crash - any ideas as 
to why, based on that info?

Thanks!
-josiah



Re: [asterisk-users] Crash Hard, Crash Often

2009-02-05 Thread Doug Lytle
Josiah Bryan wrote:
 Roderick A. Anderson wrote:
   
 How would I go about pinpointing / diagnosing the hardware fault? Not 
 sure exactly what to do with memtest86 - any pointers?

   
A lot of distros have memtest86 as a boot option on the CD/DVD.  You 
select it and let it run.  It'll scan for bad memory and show lots 
of red errors when it finds any.  If the memory checks fail, you'll know 
that you need to replace the module.

Doug





Re: [asterisk-users] Crash Hard, Crash Often

2009-02-05 Thread Josiah Bryan

Ah, I see! Gotcha. I'll try to run that tonight or this weekend then, 
when the plant is closed.

Thanks!
-josiah



Re: [asterisk-users] Crash Hard, Crash Often

2009-02-05 Thread Wilton Helm
One relevant question that hasn't been addressed is whether just the 
application is crashing or the whole computer (Linux).

I would second the hardware idea, with emphasis on generic hardware, especially 
RAM.  I had a Suse 10 box that kept crashing and doing funny stuff.  I ended up 
running an extended RAM test on it--one of those pattern sensitivity tests that 
takes an hour or two to run.  Turned out that one of the SIMMs I had just 
bought and installed had a subtle problem.  It would never show up on a 
straightforward test, but certain address ranges would fail on one or two of 
the exotic pattern tests.

It came from a reputable vendor who does 100% testing themselves, so it was 
apparently subtle enough to slip through their test.  They took it back and 
replaced it.  I ran for a few weeks without the module with no crashes and when 
I put the replacement in everything was still fine.
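
To illustrate what a pattern test does, as opposed to a single write/read pass, here is a toy walking-ones check over a Python bytearray. It is only a sketch of the idea: it exercises a buffer inside the interpreter, not physical RAM, so it is no substitute for memtest86. The stuck_low_bit fault injector is made up for the demonstration.

```python
def walking_ones_test(buf, write=None):
    """Write each single-bit pattern to every cell and read it back.

    Returns a list of (address, expected, actual) mismatches.
    `write` lets a caller simulate faulty storage; by default it stores normally.
    """
    if write is None:
        def write(b, addr, value):
            b[addr] = value
    failures = []
    for bit in range(8):
        pattern = 1 << bit
        for addr in range(len(buf)):
            write(buf, addr, pattern)
        for addr in range(len(buf)):
            if buf[addr] != pattern:
                failures.append((addr, pattern, buf[addr]))
    return failures


def stuck_low_bit(bad_addr, bit):
    """Simulated fault: one bit at one address always reads back as 0."""
    def write(b, addr, value):
        b[addr] = value & ~(1 << bit) if addr == bad_addr else value
    return write


if __name__ == "__main__":
    memory = bytearray(64)
    print(walking_ones_test(memory))                        # healthy buffer
    print(walking_ones_test(memory, stuck_low_bit(10, 3)))  # one stuck bit
```

The simulated fault only shows up on the one pattern that sets the stuck bit, which is the same reason a subtle module fault can pass a straightforward test and fail an extended one.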

Wilton

Re: [asterisk-users] Crash Hard, Crash Often

2009-02-05 Thread Josiah Bryan

Just the application crashes.

I'll try swapping the RAM modules to see if that helps.

Thanks!
-josiah



Re: [asterisk-users] Crash Hard, Crash Often

2009-02-05 Thread Paul Chambers
Josiah Bryan wrote:
 snip
 Problem is that it's crashing for seemingly no reason at all, no errors 
 on the console, no logs (that I can find), nothing in /var/log/messages 
 - it's puzzling! Management is screaming like banshees, calls are 
 dropping like flies, and all hell is about to break loose if I can't 
 stop asterisk from crashing every couple of hours, taking down any 
 Zaptel calls with it.
 /snip

That description reminds me of a problem I ran into a while back. One 
fan had quietly failed, and the temperature would slowly creep up inside 
the box until things started 'acting funny' and the box would lock up 
soon after. It'd run fine for 3-4 hours, then just keel over and die. 
The logs didn't show anything consistent just before the event.

The failing FXO modules are also an interesting symptom. Perhaps your 
power supply is misbehaving? Is the power supply in that machine of good 
quality?

I've never experienced it, but a friend has had two motherboards become 
unstable within a couple of months of each other, after running fine for 
3-4 years. When he examined the motherboards, both had capacitors around 
the CPU that had visibly 'ballooned' like a leaking alkaline battery 
would.  A long shot, but another example of previously stable hardware 
ceasing to be so.

Do you have another PC you can swap the drive and cards into, to try to 
rule out hardware instability? Could you run lm_sensors (along with one 
of the logging/alarm packages that support it)?
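
If lm_sensors is installed, a small poller can log temperatures so there is a record leading up to the next crash. A sketch, assuming the sensors command prints lines such as "temp1: +45.0°C"; label names and output layout vary by chip, and the 60 degree threshold and the function names are arbitrary choices for illustration:

```python
import re

# Matches e.g. "temp1:   +45.0<degree sign>C" or "CPU Temp: +61.5 C";
# the degree sign (U+00B0) is optional.
TEMP_LINE = re.compile(r"^(?P<label>[^:]+):\s*\+?(?P<temp>-?\d+(?:\.\d+)?)\s*\u00b0?C")


def parse_temperatures(sensors_output):
    """Return {label: degrees_celsius} for every temperature line found."""
    readings = {}
    for line in sensors_output.splitlines():
        match = TEMP_LINE.match(line.strip())
        if match:
            readings[match.group("label").strip()] = float(match.group("temp"))
    return readings


def over_threshold(readings, limit_c=60.0):
    """Return the labels whose reading exceeds limit_c, sorted by name."""
    return sorted(label for label, temp in readings.items() if temp > limit_c)


if __name__ == "__main__":
    # Made-up sample text standing in for real `sensors` output.
    sample = (
        "temp1:        +45.0\u00b0C  (high = +70.0\u00b0C)\n"
        "fan1:        3200 RPM\n"
        "temp2:        +63.5\u00b0C\n"
    )
    readings = parse_temperatures(sample)
    print(readings, over_threshold(readings))
```

Feeding it the real output of sensors from cron every minute or so, and appending the result to a log file, would show whether the temperature climbs before each failure.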

-- Paul



Re: [asterisk-users] Crash Hard, Crash Often

2009-02-05 Thread Wilton Helm
When he examined the motherboards, both had capacitors around 
the CPU that had visibly 'ballooned'

A good reason to look for motherboards with either tantalum capacitors or 
organic capacitors.  It's a marketing point I'm seeing these days, and as a 
design engineer, I can say it's worth looking for.  The ESR in typical aluminum 
electrolytics is considerably higher.  These caps are at the output of the 
switching regulator for the CPU, which is handling many amps.  This creates large 
charging and discharging currents at rates of hundreds of kHz.  Any ESR (equivalent 
series resistance, i.e. the cap's internal resistance) turns some of this into heat.
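
As a rough worked example of that heating effect (the ripple current and ESR figures are illustrative assumptions, not measurements from any particular board), the power dissipated in a capacitor is the square of the RMS ripple current times the ESR:

```latex
P = I_{\mathrm{rms}}^{2} \cdot R_{\mathrm{ESR}}
\quad\Rightarrow\quad
P_{50\,\mathrm{m\Omega}} = (2\,\mathrm{A})^{2} \times 0.050\,\Omega = 0.20\,\mathrm{W},
\qquad
P_{10\,\mathrm{m\Omega}} = (2\,\mathrm{A})^{2} \times 0.010\,\Omega = 0.04\,\mathrm{W}
```

So at the same ripple current, a cap with one fifth the ESR has one fifth the self-heating.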

Wilton