[Asterisk-Users] Something every TDMP user should know
Hi team,

Not long ago a bunch of us were posting reports of a strange phenomenon where voice quality would pack up completely from time to time, typically resulting in loud crackling on the line and/or the voice channel breaking up completely. With our installation it would occur from time to time, typically when the * server was at its busiest. Most of the time this problem would result in all users having to terminate their calls and re-establish them.

After a lot of (very frustrating) troubleshooting we have now gone two weeks without a recurrence of the problem and we are hoping that we may have finally resolved it altogether. I wanted to post a quick summary of the steps that we have taken to resolve this issue and what we think the problem turned out to be, as (from the number of responses to my last posts about this issue) it sounds like a few people have been experiencing it, so hopefully our experiences will help.

The * server in question is based on a single-processor IBM xSeries 205 with a gig of RAM, SCSI 320 HDDs (RAID 1) and Red Hat ES 3. It uses ISDN (via CAPI and a four port Eicon Diva Pro Server card) and a mixture of SIP and analogue extensions. A TDM400P with four FXS ports supports the four analogue extensions (all Uniden cordless phones) and the SIP handsets consist of a mixture of BT102s and SNOM190s.

Our turning point with this issue came when we bit the bullet and purchased a support incident from Digium. By this stage we had spent dozens and dozens of hours trying unsuccessfully to research and diagnose the problem and still had no accurate idea of what was causing it. Several people replied to our posts to this list saying that they were having a very similar issue as well, but no one had a clue what was causing it. Digium support zeroed in on the issue fairly quickly and we got the *distinct* impression that they have seen this problem many times before.
They instantly got us to look at the output of zttest and we found that this was (in their words) 'extremely low', with 'best' and 'worst' readings of 99.975586% and 99.963379% respectively. They told us that we needed to be getting at least 99.98% and recommended that we:

1. Check that the TDMP is on its own IRQ (much to our embarrassment our card wasn't at the time, so we had to play with it a bit to get it to occupy a unique IRQ).
2. Disable hyper threading on the Xeon CPU.
3. Uninstall our SCSI hardware and replace it with IDE hardware.
4. Upgrade to the latest stable releases of Asterisk, Zaptel and Libpri.

We made changes 1 and 2 in the above list and are prepared to make changes 3 and 4 if we find the problem hasn't gone away. It hasn't happened in over two weeks now (after occurring many times per day for a while), so we hopefully won't have to throw out our SCSI hardware. After we made each change (1 and 2 were made about two weeks apart from each other) we found that the quality improved, with the incidence of the issue halving after '1' and disappearing (hopefully for good) after '2'. Incidentally the results of zttest *did not* noticeably improve after making these changes (it is still below 99.98%).

Apparently our problem is related to the fact that the TDMP generates a massive number of interrupt requests and that it becomes extremely upset unless a sufficient number of those requests are honoured. Despite the fact that a PCI device has to be able to share an IRQ in order to meet the PCI specification, it appears that having a TDMP sharing an IRQ with *anything* is a really, really bad idea. I haven't been able to get an explanation of why hyper threading is a bad thing, but apparently high-performance devices such as SCSI adapters can cause resource contention issues with the TDMP, and the TDMP becomes very upset about those.
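The first recommendation (a unique IRQ for the card) can be checked from /proc/interrupts, where devices sharing a line are listed together, separated by commas. The following is only a rough sketch: the driver name wctdm is an assumption (it is the name the Zaptel driver for the TDM400P normally registers under, but check your own system), and the function name is made up for illustration.

```shell
#!/bin/sh
# Report whether a driver has its IRQ line to itself.
# Reads /proc/interrupts-style text on stdin; for every line mentioning the
# driver, prints "IRQ <n> shared" or "IRQ <n> exclusive".
# check_irq_sharing is a hypothetical helper name, not part of Zaptel.
check_irq_sharing() {
    awk -v drv="$1" '
        index($0, drv) > 0 {
            irq = $1
            sub(/:$/, "", irq)          # "16:" becomes "16"
            # Lines with more than one device separate the names with commas.
            state = (index($0, ",") > 0) ? "shared" : "exclusive"
            print "IRQ " irq " " state
        }
    '
}

# "wctdm" is an assumed driver name; adjust to whatever your card shows.
if [ -r /proc/interrupts ]; then
    check_irq_sharing wctdm < /proc/interrupts
fi
```

If the card shows up as shared, moving it to another PCI slot or freeing an IRQ in the BIOS are the usual things to try; which one works depends on the motherboard.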
So hopefully we have seen the back of this problem, and I have to say that I have been pretty disappointed to find out that this issue appears to be relatively well known by Digium, but seemingly not publicised in the slightest. We searched for days to find anything relating to our issue but to no avail. Hopefully the next time someone has this issue they might find this mail and save themselves some of the frustration that we had.

When we challenged Digium's advice about retarding the CPU (i.e. disabling hyper threading) and slowing I/O (by throwing out our SCSI RAID controller and replacing it with IDE) they fell strangely silent - after getting prompt and meaningful responses to our requests they suddenly stopped responding at all.

I think that this issue constitutes a pretty major flaw in the design of the TDMP and we will avoid putting these cards into any * servers from now on. This is a real shame, as we as a company really want to reward Digium for all of their good work by actually buying their products, but we no longer have any faith in the design and suitability for production use of this product. Maybe it's time for Digium to think about
RE: [Asterisk-Users] Something every TDMP user should know
They instantly got us to look at the output of zttest and we found that this was (in their words) 'extremely low', with 'best' and 'worst' readings of 99.975586% and 99.963379% respectively.

Might want to give the PCI latency setting a try; it helped for me. My ZTTEST would drop occasionally to 99.95% until I set:

    setpci -v -s 01:01.0 latency_timer=ff   # Digium PRI card
    setpci -v -s 01:04.0 latency_timer=ff   # Digium 401 4 X FXS
    setpci -v -s XX:XX.X latency_timer=0    # one entry for every other PCI card in the system, from lspci output; modify XX:XX.X accordingly

Before setpci I would get a best in ZTTEST of 99.987793% and a worst of ~99.95%. After setpci the best is 100% and the worst is 99.987793%, consistently. I use SpanDSP to receive faxes; before, faxes were garbled, and now they are OK (BTW, now receiving ~150 faxes a day 99.95% OK, so SpanDSP *does* work fine, you just have to set it up right. Ask me how.)

I put the setpci statements in /etc/rc.d/rc.local before my modprobes for the Digium hardware and the Asterisk startup. I'm using a 4-way Netfinity, FC2, * 1.0 stable.

I dunno, maybe the community is being too hard on Digium about the design of the card. I can understand their perspective: it's brutal to make a card that has to have such tight tolerances and make it work acceptably on the huge variation in white box hardware (or black box, in your case). There's a page on the Wiki about motherboards that work well, with installation notes, but that's pointless since motherboards are such a moving target. Even the motherboard vendor screwing around with BIOS updates can invalidate that information.

What I think is best for Asterisk implementation is for Digium to sell a motherboard. No, seriously. Find an ECS or Abit or ASUS mobo that consistently yields 100% or 99.% and white-box it as a barebones kit with a TXXX card. Sell it as a case, good PSU, mobo, and TXXX card - you add your own RAM, NIC, CPU, HDD. Would you buy one for $699? I probably would.
It took me a couple of months of fooling around with my Netfinity before I was pleased with the performance and satisfied that it would handle the things I wanted it to do without choking. If I had the option of saving the couple of months' time obsessing over things like timing for $699, it would have been a no-brainer. Digium wins too, because they get an incremental sale that they can make money on (margin on the mobo) and lower support costs because they don't have to chase down IRQ latency phantoms.

hth, my 2c

___
Asterisk-Users mailing list
Asterisk-Users@lists.digium.com
http://lists.digium.com/mailman/listinfo/asterisk-users
To UNSUBSCRIBE or update options visit: http://lists.digium.com/mailman/listinfo/asterisk-users
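For anyone wanting to try the latency-timer scheme from this message on their own box, here is a minimal sketch that turns lspci output into the matching setpci lines. It only prints the commands so they can be reviewed first; it does not run them. The function name is made up, slots are written in the bus:device.function form that lspci prints (for example 01:04.0), and the two slots in the usage line are that poster's addresses, not something to copy. Whether setting every other device, bridges included, to 0 is sensible on a given machine is not something this sketch can decide.

```shell
#!/bin/sh
# Print (not run) setpci commands following the scheme described above:
# latency_timer=ff for the slots named as arguments, latency_timer=0 for
# every other device. Reads lspci-style lines on stdin, where the first
# field of each line is the slot (bus:device.function).
# gen_latency_cmds is a hypothetical helper name.
gen_latency_cmds() {
    favoured=" $* "
    while read -r slot _rest; do
        [ -n "$slot" ] || continue      # skip blank lines
        case "$favoured" in
            *" $slot "*) value=ff ;;    # telephony card: longest bus tenure
            *)           value=0 ;;     # everything else, per the message
        esac
        echo "setpci -v -s $slot latency_timer=$value"
    done
}

# Usage: substitute the slots of your own Digium cards, taken from lspci.
if command -v lspci >/dev/null 2>&1; then
    lspci | gen_latency_cmds 01:01.0 01:04.0
fi
```

Once the printed lines look right they can go into /etc/rc.d/rc.local ahead of the modprobes, as the message describes.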
Re: [Asterisk-Users] Something every TDMP user should know
On May 12, 2005 01:17 pm, Colin Anderson wrote:

I use SpanDSP to receive faxes; before, faxes were garbled, and now they are OK (BTW, now receiving ~150 faxes a day 99.95% OK, so SpanDSP *does* work fine, you just have to set it up right. Ask me how.)

No, don't ask you how. Show us how. Don't tease out information like this, like some cheap stripper.

-A.
Re: [Asterisk-Users] Something every TDMP user should know
Damian Funnell wrote:

1. Check that the TDMP is on its own IRQ (much to our embarrassment our card wasn't at the time, so we had to play with it a bit to get it to occupy a unique IRQ).
2. Disable hyper threading on the Xeon CPU.
3. Uninstall our SCSI hardware and replace it with IDE hardware.
4. Upgrade to the latest stable releases of Asterisk, Zaptel and Libpri.

We made changes 1 and 2 in the above list and are prepared to make changes 3 and 4 if we find the problem hasn't gone away. It hasn't happened in over two weeks now (after occurring many times per day for a while), so we hopefully won't have to throw out our SCSI hardware. After we made each change (1 and 2 were made about two weeks apart from each other) we found that the quality improved, with the incidence of the issue halving after '1' and disappearing (hopefully for good) after '2'. Incidentally the results of zttest *did not* noticeably improve after making these changes (it is still below 99.98%).

This is great info. I am running on an Intel box and attempting to go to a dual AMD Opteron setup on a Tyan board. I am not having any luck getting my numbers above 99.6%. I've disabled every hardware gadget and service not needed and still haven't had much luck. I'm going to try a custom kernel as opposed to the stock ones I've tried, but that's been about 4 different OSes with the same results. Is there something to disable on Opterons that would be the equivalent of disabling hyperthreading?

Oh, and I even tried setting the PCI latencies and it made no noticeable difference.

Mark
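For a sense of scale, the zttest figures quoted in this thread line up with whole samples missed out of 8192: 100 x 8191/8192 = 99.987793, 100 x 8190/8192 = 99.975586 and 100 x 8189/8192 = 99.963379. The 8192-sample window is an inference from those numbers, not something any poster stated, so treat the sketch below as arithmetic under that assumption. On that reading, the 99.98% threshold allows one missed sample per window and no more, and 99.6% is roughly 33 missed.

```shell
#!/bin/sh
# Convert a zttest-style accuracy percentage into the number of samples
# missed per window, rounded to the nearest whole sample.
# ASSUMPTION: the window is 8192 samples (inferred from the quoted figures).
# missed_samples is a hypothetical helper name.
missed_samples() {
    awk -v pct="$1" 'BEGIN { printf "%d\n", (1 - pct / 100) * 8192 + 0.5 }'
}

# Figures taken from the messages in this thread.
for pct in 100 99.987793 99.975586 99.963379 99.6; do
    echo "$pct% -> $(missed_samples "$pct") of 8192 samples missed"
done
```

This also suggests why small changes in the percentage matter: the difference between a passing and a failing reading is a single sample.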
Re: [Asterisk-Users] Something every TDMP user should know
I have never had to play with setpci before. Can you elaborate on the use and purpose of this command?

On 5/12/05, Colin Anderson [EMAIL PROTECTED] wrote:

Might want to give the PCI latency setting a try; it helped for me. My ZTTEST would drop occasionally to 99.95% until I set:

    setpci -v -s 01:01.0 latency_timer=ff   # Digium PRI card
    setpci -v -s 01:04.0 latency_timer=ff   # Digium 401 4 X FXS
    setpci -v -s XX:XX.X latency_timer=0    # one entry for every other PCI card in the system, from lspci output; modify XX:XX.X accordingly

[...]

--
Erick Perez
Linux User 376588
http://counter.li.org/ (Get counted!!!)
Panama, Republic of Panama
RE: [Asterisk-Users] Something every TDMP user should know
-Original Message-
From: Erick Perez [mailto:[EMAIL PROTECTED]]
Sent: Thursday, May 12, 2005 2:19 PM
To: Asterisk Users Mailing List - Non-Commercial Discussion; [EMAIL PROTECTED]
Subject: Re: [Asterisk-Users] Something every TDMP user should know

I have never had to play with setpci before. Can you elaborate on the use and purpose of this command?

See: http://www-106.ibm.com/developerworks/library/l-hw2.html

Also, for more PCI latency timer specifics: http://www.reric.net/linux/pci_latency.html

Kris Boutilier
Information Services Coordinator
Sunshine Coast Regional District
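In short, setpci reads and writes registers in a device's PCI configuration space, and the latency timer is the register that limits how long a device may keep the bus once another device wants it. Before changing anything it helps to see what each device currently has. The sketch below summarises the latency and IRQ that `lspci -v` reports per device. It assumes the usual layout, where the device line starts with the slot and an indented "Flags:" line carries "latency N" and "IRQ N"; that layout can differ between pciutils versions, and the function name is made up.

```shell
#!/bin/sh
# Summarise latency timer and IRQ per device from `lspci -v`-style text on
# stdin. ASSUMPTION: device lines start in column 0 with the slot, and the
# indented "Flags:" line contains "latency N" and "IRQ N".
# summarise_pci is a hypothetical helper name.
summarise_pci() {
    awk '
        /^[0-9a-f]/ { slot = $1 }       # remember the device being described
        /Flags:/ {
            lat = "-"; irq = "-"
            for (i = 1; i <= NF; i++) {
                if ($i == "latency") { lat = $(i + 1); sub(/,$/, "", lat) }
                if ($i == "IRQ")     { irq = $(i + 1); sub(/,$/, "", irq) }
            }
            print slot, "latency=" lat, "irq=" irq
        }
    '
}

if command -v lspci >/dev/null 2>&1; then
    lspci -v 2>/dev/null | summarise_pci
fi
```

Two devices showing the same irq value are sharing a line, and running this again after the setpci commands shows whether the new latency values actually took.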