On Mon, 2006-05-22 at 12:16 -0400, Greg Boehnlein wrote:
> Hello,
> I was wondering if anyone out there is successfully running
> Asterisk 1.2 svn w/ Centos 4.3. I had an experience over the last two
> weeks that has me scratching my head and muttering strange things in the
> wee hours of the morning. I am going to try and be as descriptive as my
> brain will allow right now, but if there is something that I do not cover,
> please do not hesitate to ask and I'll be happy to answer.
>
> For the last 2 years, I have been running a mixture of Tao Linux
> and Centos (both RHEL derivatives) on our production boxes. Asterisk has
> run flawlessly on all installations. Last week, I updated one of our
> gateway boxes from Centos 4.2 (under which it ran for 6 months without
> issue) to the new 4.3 code. Almost immediately, we began to experience
> problems. Asterisk would core w/ the following:
>
> #0 0x004878ab in test_err () from /usr/lib/asterisk/modules/codec_g729a.so
>
> The segfaults would happen under very light loads, in some cases
> with just a single call. Kevin was able to log in to the box, and put a
> debugging version of codec_g729 on the box. He determined that the problem
> was that the values that were being returned in that routine were
> incorrect. I.E. something in the system was returning a non-zero value
> when multiplying a number by "0". Barring any other explanations, we
> assumed that there was a hardware issue somewhere, either in the memory,
> or the FPU on the CPU.
>
> So, we replaced the box w/ a brand new Dual-Core system running a
> Dual-Core Pentium D 920. We loaded the 32 bit version of Centos 4.3 onto
> the box and proceeded to start testing. BAM.. same problem.. the backtrace
> showed the failure in the same routine.
>
> We scratched our heads, and after many hours of trying various
> things (backing off the kernel to 2.6.9-22) and even moving to the new
> development kernel 2.6.9-34.19 (from the testing tree) we could do nothing
> to solve the issue.
>
> Mind you, this is the exact same behavior on two different
> hardware platforms running the exact same distribution. We even loaded up
> a third box and could reproduce the behavior on it as well. Three
> different boxes, one common distribution.
>
> As a test, we installed Fedora Core 5 x86_64 on the new Dual Core
> box and ran extensive tests overnight, simulating 96 channels doing G729
> to Ulaw transcoding. The box ran completely stable. No hiccups.
>
> So, this morning, we put it back into the cluster, and it's now
> taking about 200 concurrent calls, doing an insane amount of transcoding
> and it is working just fine. Before, it would have cored in the first
> couple of minutes.
>
> I'm scratching my head here, because I generally have had excellent
> experiences with Centos. However, I have NO idea what might be the issue
> here. Could it be the kernel? (We tried three different ones!). Could it
> be the libc? Maybe it is the compiler?
>
> In any case, if anyone is having success with Centos 4.3 (32 bit), please
> speak up. I'd like to get to the bottom of it. I generally do not like to
> run Fedora on production equipment as it is generally bleeding edge. In
> this case, FC5 is running 2.6.16 something..
>

Have you tried compiling statically on CentOS 4.2 and running the resulting
binary on 4.3? I am assuming you have made sure the distribution is up to
date with patches. We do not use G.729, so I cannot try it out for you, but
we do use CentOS.

Is it only with SVN, or with all releases of Asterisk?

-Greg
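
P.S. Since the quoted report describes a non-zero result from multiplying by "0", a standalone sanity check that runs outside Asterisk might help separate a hardware fault from a toolchain or library problem. The following is only a minimal sketch in Python: the function name `count_bad_zero_products`, the value ranges, and the iteration count are all made up for illustration, and it exercises the interpreter's arithmetic only, not the codec's own code path. A clean run therefore proves little, but any failure would point strongly at the machine itself.

```python
import random


def count_bad_zero_products(iterations, seed=0):
    """Count multiplications by zero whose result does not compare equal to zero.

    Hypothetical helper for illustration only; not part of Asterisk or the codec.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable across boxes
    bad = 0
    for _ in range(iterations):
        # Assumed ranges: 16-bit integers and moderately sized floats.
        i = rng.randint(-32768, 32767)
        f = rng.uniform(-1.0e6, 1.0e6)
        if i * 0 != 0:
            bad += 1
        # A negative float times 0.0 gives -0.0, which still compares equal to 0.0.
        if f * 0.0 != 0.0:
            bad += 1
    return bad


if __name__ == "__main__":
    failures = count_bad_zero_products(1000000)
    print("non-zero results from multiplying by zero:", failures)
```

Running the same script on the CentOS 4.3 box and on the Fedora Core 5 box would give a rough comparison; on healthy hardware the count should be zero on both.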