[asterisk-users] Scaling voicemail
(I sent this previously, but my list membership was still processing, so I
don't think it made its way out; apologies if it did!)

Hi all,

I'm building a voicemail system using Asterisk, with PJSIP and IMAP. I used
to do a lot of Asterisk work, but I'm catching up after a few years away. We
have existing voice infrastructure using Kamailio; registrations go there,
not to Asterisk. Functionality is mostly fine. The problem I'm having is
scaling. I'm just testing right now, and after loading around 3,500
mocked-up voicemail users I found that startup times and so on are quite
long. I'm wondering:

1) Is there a way to avoid defining an endpoint + AOR for each user, while
retaining the ability to poll mailboxes and send NOTIFYs to users? I was
hoping I could at least define a generic AOR for our Kamailio server;
however, there doesn't seem to be a way to set the R-URI and To: user on a
per-endpoint basis for the MWI notifications, so the NOTIFYs go out with the
R-URI/To: set to a generic URI (something like sip:kamailio_ip:5060). I'd
really like the ability to just say "send NOTIFYs to this SIP server with
the mailbox (or better, some configurable value) as the user part", but that
doesn't appear to be possible..?

2) Is there a way to spread mailbox polling out? All the mailboxes are
checked at once, which means we get big load spikes, including at startup.
It'd be great to have this spread out over time.

3) Does anyone have thoughts/ideas re: scaling an Asterisk voicemail-only
server? Can it be done while retaining mailbox polling with the current
infrastructure?

FWIW, I'm running 14.7.7, though that can easily be changed if there's a
better version to target for this.

--
Nathan Ward

--
-- Bandwidth and Colocation Provided by http://www.api-digital.com --
-- Check out the new Asterisk community forum at: https://community.asterisk.org/

New to Asterisk? Start here: https://wiki.asterisk.org/wiki/display/AST/Getting+Started

asterisk-users mailing list
To UNSUBSCRIBE or update options visit:
   http://lists.digium.com/mailman/listinfo/asterisk-users
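On question 2: app_voicemail doesn't appear to expose a stagger knob, but if
the polling were driven externally the spreading itself is trivial. Here is
a sketch of the idea in Python; the mailbox names, the 30-second interval,
and the external-poller premise are all assumptions, not anything Asterisk
itself provides:

```python
import hashlib

POLL_INTERVAL = 30  # assumed seconds between polls of any one mailbox


def poll_offset(mailbox: str, interval: int = POLL_INTERVAL) -> float:
    """Deterministic offset in [0, interval) for a mailbox.

    Hashing the mailbox name spreads thousands of mailboxes evenly
    across the polling interval, so each is still checked once per
    interval but the checks no longer all land on the same instant.
    """
    h = int.from_bytes(hashlib.sha256(mailbox.encode()).digest()[:8], "big")
    return (h / 2.0**64) * interval


# 3,500 mocked-up users, as in the post: offsets end up spread across
# the whole interval instead of clustering at t=0.
offsets = [poll_offset(f"user{i:04d}") for i in range(3500)]
```

The same hash trick would also stagger the startup scan, since the offsets
are stable across restarts.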
Re: [asterisk-users] Scaling
On Wed, Jun 17, 2009 at 7:41 PM, Steve Totaro
<stot...@totarotechnologies.com> wrote:
> On Wed, Jun 17, 2009 at 3:18 PM, John Todd wrote:
>> [snip - original question trimmed]
>>
>> We could get 100 G.729 channels in a 2x3GHz P4 machine with plenty of
>> CPU room to spare, 4+ years ago, and that rule of thumb served us
>> well. This was for SIP (G.711<->G.729) but I can't imagine that DAHDI
>> or Asterisk takes significantly more or less horsepower for this task
>> now for the base transcoding load.
>>
>> JT
>
> John,
>
> Thanks for the info.
>
> Can't go wrong with these in a cluster if you are correct.
> http://www.surpluscomputers.com/348663/hp-dl140-proliant-dual-xeon.html
>
> I used to have a garden of seven (too small to be a farm) of Asterisk
> boxen that were simple PRI to SIP gateways doing ulaw (no transcoding as
> far as codecs), and nothing else running, just a few lines in each conf
> file.
>
> They would run at ~65% CPU utilization with 95 channels in use on each box.
>
> This was the Asterisk, Zaptel, Libpri 1.2.X flavor.
>
> I wonder if anyone else has input on the CPU utilization of a simple
> PSTN <-> VoIP gateway based on Asterisk?

I wonder how much of my observed 65% CPU utilization was because of software
echo cancellation? If someone knows, I would appreciate any input. If not, I
will follow up with my own testing in this thread.

--
Thanks,
Steve Totaro
+18887771888 (Toll Free)
+12409381212 (Cell)
+12024369784 (Skype)
Re: [asterisk-users] Scaling
On Wed, Jun 17, 2009 at 3:18 PM, John Todd wrote:
> On Jun 17, 2009, at 8:16 AM, Steve Totaro wrote:
>> [snip - original question trimmed]
>
> [Digium hat off, ITSP (previous employer) hat on.]
>
> Not speaking for Digium on this one, but speaking from personal
> experience at another company.
>
> We could get 100 G.729 channels in a 2x3GHz P4 machine with plenty of
> CPU room to spare, 4+ years ago, and that rule of thumb served us
> well. This was for SIP (G.711<->G.729) but I can't imagine that DAHDI
> or Asterisk takes significantly more or less horsepower for this task
> now for the base transcoding load.
>
> JT

John,

Thanks for the info.

Can't go wrong with these in a cluster if you are correct.
http://www.surpluscomputers.com/348663/hp-dl140-proliant-dual-xeon.html

I used to have a garden of seven (too small to be a farm) of Asterisk boxen
that were simple PRI to SIP gateways doing ulaw (no transcoding as far as
codecs), and nothing else running, just a few lines in each conf file.

They would run at ~65% CPU utilization with 95 channels in use on each box.

This was the Asterisk, Zaptel, Libpri 1.2.X flavor.

I wonder if anyone else has input on the CPU utilization of a simple
PSTN <-> VoIP gateway based on Asterisk?

--
Thanks,
Steve Totaro
+18887771888 (Toll Free)
+12409381212 (Cell)
+12024369784 (Skype)
Re: [asterisk-users] Scaling
On Wed, Jun 17, 2009 at 09:34:55AM -0400, Matt Florell wrote:
> On 6/17/09, Gordon Henderson wrote:
>> On Wed, 17 Jun 2009, Steve Totaro wrote:
>>> Hi,
> [snip]
>
> The TC400B is up to 120 channels of G729a now:
> http://www.digium.com/en/products/voice/tc400b.php

Wow, if you could only get it as a module for a TDM410, that would be cool.

--
Ninety percent of everything is crap.
        -- Theodore Sturgeon
Re: [asterisk-users] Scaling
On Jun 17, 2009, at 8:16 AM, Steve Totaro wrote:
> Hi,
>
> Quick question to the real world.
>
> Approx what specs would I need on a server to handle 95 ZAP or DAHDI ->
> SIP gateway channels using G729 on the SIP-to-carrier side (nothing else,
> just media conversion)?
> [snip]

[Digium hat off, ITSP (previous employer) hat on.]

Not speaking for Digium on this one, but speaking from personal experience
at another company.

We could get 100 G.729 channels in a 2x3GHz P4 machine with plenty of CPU
room to spare, 4+ years ago, and that rule of thumb served us well. This was
for SIP (G.711<->G.729), but I can't imagine that DAHDI or Asterisk takes
significantly more or less horsepower for this task now for the base
transcoding load.

JT

---
John Todd
email: jt...@digium.com
Digium, Inc. | Asterisk Open Source Community Director
445 Jan Davis Drive NW - Huntsville AL 35806 - USA
direct: +1-256-428-6083  http://www.digium.com/
Re: [asterisk-users] Scaling
On Wed, 17 Jun 2009, Matt Florell wrote:
> The TC400B is up to 120 channels of G729a now:
> http://www.digium.com/en/products/voice/tc400b.php

I guess the UK disty hasn't updated their website yet then :)

Gordon
Re: [asterisk-users] Scaling
On 6/17/09, Gordon Henderson wrote:
> On Wed, 17 Jun 2009, Steve Totaro wrote:
>> [snip - original question trimmed]
>
> Transcoding - it's CPU grunt and multiple processors you need (more so
> than memory). Personally, I think I'd be looking at a TC400B card, which
> can handle 96 concurrent g729 transcodes...
> [snip]
>
> Gordon

The TC400B is up to 120 channels of G729a now:
http://www.digium.com/en/products/voice/tc400b.php

MATT---
Re: [asterisk-users] Scaling
On Wed, 17 Jun 2009, Steve Totaro wrote:
> Hi,
>
> Quick question to the real world.
>
> Approx what specs would I need on a server to handle 95 ZAP or DAHDI ->
> SIP gateway channels using G729 on the SIP-to-carrier side (nothing else,
> just media conversion)?
> [snip]

Transcoding - it's CPU grunt and multiple processors you need (more so than
memory) - handling that many calls ought to be a breeze on any modern
hardware without the transcoding. Personally, I think I'd be looking at a
TC400B card, which can handle 96 concurrent g729 transcodes...

But as a rough benchmark, I can do 12 concurrent g729 transcodes on a 1GHz
VIA processor before it's totally maxed out; (stupid) extrapolation to 96
would suggest 8 x 1GHz processors, 4 x 2GHz, or 3 x 3GHz processors...
However, you gain more with the faster processors in terms of bigger cache
(but that can also be a loss too). I'd start with a quad-core system and see
how it goes under benchmark...

Gordon
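Gordon's back-of-the-envelope extrapolation can be written out explicitly.
A sketch: the 12-transcodes-per-1GHz figure is his measurement; everything
else is the naive linear scaling he himself flags as "(stupid)":

```python
# Naive linear extrapolation of G.729 transcode capacity from Gordon's
# data point: 12 concurrent transcodes max out one 1 GHz VIA core.
TRANSCODES_PER_GHZ = 12


def cores_needed(target_channels: int, core_ghz: float) -> int:
    """Cores required for target_channels, assuming perfectly linear
    scaling with clock speed (ignores cache, memory bandwidth, and SMP
    overhead -- which the rest of this thread shows matter a lot)."""
    per_core = TRANSCODES_PER_GHZ * core_ghz
    # Ceiling division: a partial core is still a whole core.
    return -(-target_channels // int(per_core))


# Reproducing the figures from the post for 96 channels:
print(cores_needed(96, 1.0))  # 8 x 1 GHz
print(cores_needed(96, 2.0))  # 4 x 2 GHz
print(cores_needed(96, 3.0))  # 3 x 3 GHz
```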
[asterisk-users] Scaling
Hi,

Quick question to the real world.

Approx what specs would I need on a server to handle 95 ZAP or DAHDI -> SIP
gateway channels using G729 on the SIP-to-carrier side (nothing else, just
media conversion)?

Does the latest Asterisk/DAHDI significantly improve these numbers over,
say, Asterisk 1.2.X?

Sure, there is plenty to read, but nothing I could find quickly on my exact
needs that was clear, and I want to be fairly sure before ordering a server.

Obviously load avg has something to do with it, but CPU and mem seem to be
the biggest factors.

--
Thanks,
Steve Totaro
+18887771888 (Toll Free)
+12409381212 (Cell)
+12024369784 (Skype)
[asterisk-users] scaling with SMP
Is there a way to cause Asterisk to benefit from running on a machine with
more than two cores? I only see two processes running, with one at a very
low priority and the other at a very high priority. I'm guessing one is
managing the other.

Thanks,
Mark

___
-- Bandwidth and Colocation provided by Easynews.com --

asterisk-users mailing list
To UNSUBSCRIBE or update options visit:
   http://lists.digium.com/mailman/listinfo/asterisk-users
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes
Matthew J. Roth wrote:
> In the meantime, I'm looking for insights as to what would cause
> Asterisk (or any other process) to idle at the same value, despite
> having similar workloads and twice as many CPUs available to it. I'll
> be working on benchmarking Asterisk from very low to very high call
> volumes so any suggestions or tips, such as how to generate a large
> number of calls or what statistics I should gather, would also be
> appreciated.

I am very curious if using this library on your system will help increase
the load you are able to put on the dual-core system.

http://www.hoard.org/

People that are running Asterisk on Solaris have noted that using the
mtmalloc library allows for much higher call density. I am hoping that hoard
will let the people running Asterisk on Linux see similar performance
improvements, but I have yet to convince anyone to give it a try and let me
know how it goes. :)

--
Russell Bryant
Software Engineer
Digium, Inc.
RE: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes
Hi,

The system seems to be I/O bound for some reason. Reading the older posts,
you mentioned that there is no significant disk activity, so it could be
Ethernet I/O and/or interrupts that are causing this (an old or insufficient
Ethernet driver, maybe?). Usually this kind of I/O wait is present on
machines that have run out of memory and need to swap to disk.

Also, with regard to the higher system usage on multicore systems, it's very
probable that it's due to task migration from core to core.

> Here is something we recently noticed that may explain why the dual-core
> server is under-performing at high call volumes. The following numbers
> were collected off both servers while they were in production. Note
> that while they have similar cumulative idle values, the ratio of system
> time to user time on the single-core server is roughly 2.3 to 1, but on
> the dual-core server it is roughly 19.6 to 1. I'm not quite sure what
> to make of this, but it seems to be very relevant to the problem.
>
> Mon Apr 2 12:15:01 EDT 2007
> Idle (sar -P ALL 60 14) (60 seconds, 14 slices)
> Linux 2.6.12-1.1376_FC3smp (4core.imminc.com)   04/02/07
>
> 12:24:01  CPU  %user  %nice  %system  %iowait  %idle
> 12:25:02  all  14.97   0.03    34.25     0.92  49.82
> 12:25:02    0   8.83   0.05    33.60     1.28  56.24
> 12:25:02    1  17.50   0.02    34.60     0.57  47.32
> 12:25:02    2  19.94   0.02    33.52     1.31  45.22
> 12:25:02    3  13.62   0.02    35.29     0.52  50.55
>
> Thu May 10 15:30:01 EDT 2007
> Idle (sar -P ALL 60 14) (60 seconds, 14 slices)
> Linux 2.6.12-1.1376_FC3smp (8core.imminc.com)   05/10/07
>
> 15:38:01  CPU  %user  %nice  %system  %iowait  %idle
> 15:39:01  all   2.47   0.01    48.29     0.00  49.23
> 15:39:01    0   2.92   0.00    53.17     0.00  43.91
> 15:39:01    1   2.98   0.00    48.68     0.02  48.33
> 15:39:01    2   2.47   0.02    48.61     0.00  48.91
> 15:39:01    3   2.27   0.00    48.35     0.00  49.38
> 15:39:01    4   2.38   0.02    47.38     0.00  50.22
> 15:39:01    5   2.37   0.02    46.94     0.00  50.67
> 15:39:01    6   2.23   0.02    46.63     0.00  51.12
> 15:39:01    7   2.17   0.02    46.54     0.00  51.27

Stelios S. Koroneos
Digital OPSiS - Embedded Intelligence
http://www.digital-opsis.com
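The 2.3:1 and 19.6:1 ratios quoted above come straight out of the "all" rows
of the two sar tables; a quick check, with the figures copied from the post:

```python
# system:user CPU time ratios from the "all" rows of the sar output above.
single_core = {"user": 14.97, "system": 34.25}  # 4-CPU single-core box
dual_core = {"user": 2.47, "system": 48.29}     # 4-CPU dual-core box

ratio_single = single_core["system"] / single_core["user"]
ratio_dual = dual_core["system"] / dual_core["user"]

print(f"single-core: {ratio_single:.1f} to 1")  # ~2.3 to 1
print(f"dual-core:   {ratio_dual:.1f} to 1")    # ~19.6 to 1
```

A system:user ratio this lopsided is the clue the later replies chase: the
extra cores are burning kernel time, not doing application work.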
Re: [asterisk-users] Scaling Asterisk: High volume benchmarks (0 to 450 calls)
Remco Post wrote:
> I guess that if I read these stats correctly, the bottleneck for * is not
> so much CPU power, it's the CPU cache. As I see it, the CPU cache becomes
> far less efficient for larger call volumes, e.g. the cache is unable to
> keep the most frequently used code and data in cache, due to the sheer
> amount of call data going through the CPU. I guess that you do have some
> gain from going from single-core to dual-core, but it is dwarfed by the
> very limited effect on the cache. But that is just a guess. Maybe for
> pure VoIP solutions, CPUs with a huge cache, like e.g. Power5+, would
> perform much better than ia32/x64 CPUs.

Remco,

Allow me to add some extra information about the processors on each server.

Dual-Core/8 CPU Server
----------------------
CPU Manufacturer: Intel
CPU Family:       Xeon MP
Codename:         Tulsa
Processor Number: 7120M
CPU Speed:        3.00 GHz
Bus Speed:        800 MHz
CPU Cores:        2
L1 Cache:         32 KB
L2 Cache:         2048 KB (1024 KB per core)
L3 Cache:         4096 KB (shared)
More Info:        http://processorfinder.intel.com/details.aspx?sSpec=SL9HC

Single-Core/4 CPU Server
------------------------
CPU Manufacturer: Intel
CPU Family:       Xeon MP
Codename:         Cranford
Processor Number: N/A
CPU Speed:        3.16 GHz
Bus Speed:        667 MHz
CPU Cores:        1
L1 Cache:         16 KB
L2 Cache:         1024 KB
L3 Cache:         0 KB
More Info:        http://processorfinder.intel.com/details.aspx?sSpec=SL84U

As you can see, the dual-core server not only has more processing power, it
also has much more CPU cache. While I don't doubt that CPU cache may
eventually limit Asterisk's scalability, I don't think it's the bottleneck
I'm hitting. Moving up to a 2.6.17 kernel with multi-core scheduling support
(IIRC this considers the shared cache when making scheduling decisions)
would likely gain me a small performance benefit, but nothing in the range
of what we were expecting when we went to dual-core.

One difference that really stands out between the performance of the
single-core and dual-core servers is the ratio of system to user time at
similar loads. Note that a load that would bring the single-core server to
50% idle would bring the dual-core server to roughly 74% idle. We used sar
to capture these loads on both types of servers when they were in production
(the output is shown below). The system to user time ratio on the
single-core server is 2.3 to 1, but it is 14.6 to 1 on the dual-core server.

Stephen Davies has suggested using oprofile to determine where the system
time is being spent, but I haven't gotten a chance to do so yet. I'm also
considering strace. Any other possible explanations or suggestions for
diagnosing this problem would be appreciated.

Mon Apr 2 12:15:01 EDT 2007
Idle (sar -P ALL 60 14) (60 seconds, 14 slices)
Linux 2.6.12-1.1376_FC3smp (4core.imminc.com)   04/02/07

12:24:01  CPU  %user  %nice  %system  %iowait  %idle
12:25:02  all  14.97   0.03    34.25     0.92  49.82
12:25:02    0   8.83   0.05    33.60     1.28  56.24
12:25:02    1  17.50   0.02    34.60     0.57  47.32
12:25:02    2  19.94   0.02    33.52     1.31  45.22
12:25:02    3  13.62   0.02    35.29     0.52  50.55

Fri May 11 12:00:01 EDT 2007
Idle (sar -P ALL 60 14) (60 seconds, 14 slices)
Linux 2.6.12-1.1376_FC3smp (8core.imminc.com)   05/11/07

12:08:02  CPU  %user  %nice  %system  %iowait  %idle
12:09:02  all   1.69   0.00    24.70     0.08  73.52
12:09:02    0   2.08   0.02    30.16     0.00  67.74
12:09:02    1   1.95   0.00    25.59     0.62  71.85
12:09:02    2   1.73   0.00    25.12     0.00  73.15
12:09:02    3   1.55   0.02    24.70     0.00  73.73
12:09:02    4   1.67   0.00    23.54     0.02  74.78
12:09:02    5   1.57   0.02    23.13     0.00  75.29
12:09:02    6   1.45   0.02    22.90     0.00  75.64
12:09:02    7   1.48   0.00    22.54     0.00  75.98

Thank you for your response,

Matthew Roth
InterMedia Marketing Solutions
Software Engineer and Systems Developer
Re: [asterisk-users] Scaling Asterisk: High volume benchmarks (0 to 450 calls)
Matthew J. Roth wrote:
> List users,
>
> This post contains the benchmarks for Asterisk at high call volumes on a
> 4 CPU, dual-core (8 cores total) server. It's a continuation of the
> posts titled "Scaling Asterisk: Dual-Core CPUs not yielding gains at
> high call volumes". They contain a fair amount of information, including
> details about our servers and the software on them. I'm happy to answer
> any questions you might have, but please take a moment to review those
> posts to make sure they don't contain the information you're seeking.

I guess that if I read these stats correctly, the bottleneck for * is not so
much CPU power, it's the CPU cache. As I see it, the CPU cache becomes far
less efficient for larger call volumes, e.g. the cache is unable to keep the
most frequently used code and data in cache, due to the sheer amount of call
data going through the CPU. I guess that you do have some gain from going
from single-core to dual-core, but it is dwarfed by the very limited effect
on the cache. But that is just a guess.

Maybe for pure VoIP solutions, CPUs with a huge cache, like e.g. Power5+,
would perform much better than ia32/x64 CPUs.

> [snip - remainder of Matthew's original post quoted in full]
[asterisk-users] Scaling Asterisk: High volume benchmarks (0 to 450 calls)
List users,

This post contains the benchmarks for Asterisk at high call volumes on a
4 CPU, dual-core (8 cores total) server. It's a continuation of the posts
titled "Scaling Asterisk: Dual-Core CPUs not yielding gains at high call
volumes". They contain a fair amount of information, including details about
our servers and the software on them. I'm happy to answer any questions you
might have, but please take a moment to review those posts to make sure they
don't contain the information you're seeking.

Thank you,

Matthew Roth
InterMedia Marketing Solutions
Software Engineer and Systems Developer


Conclusions
-----------
Once again, I'm presenting the conclusions first. Scroll down if you're more
interested in the raw data.

1. Asterisk scales quite well up to a certain number of calls. At this
point, the cost in CPU cycles per call starts to increase more drastically.
A graph of the Avg Used% values can be used to demonstrate this. It can be
described as consisting of two roughly linear segments. The first segment is
from 0 to 110 calls. The rest of the graph is a second, steeper segment.
This is not entirely true, as in fact each new call costs a little more than
the last, but it is a useful simplification.

2. Even at very high call volumes, Asterisk uses less than 512 KB of memory.
2 GB of RAM would probably avoid swapping and excessive disk activity on
most Asterisk installations.

3. Future benchmarks should be based on the number of active channels, not
active calls.

I'm relying on you to point out my mistakes and omissions, so please take a
look at the data and respond with your own analysis and conclusions.

Benchmarking Methodology
------------------------
The benchmarks are based on data I collected over the period of 5/12/2007 to
05/30/2007 from two production servers used in our inbound call center. The
servers are identical 8-core Dell PowerEdge 6850s, as documented in my prior
posts. They are meant to be used as a primary/backup pair, but both were
used in production in order to rule out a hardware failure as the cause of
our scaling issues.

The data was collected by a bash script executed from cron every 2 minutes.
This script utilizes some basic Linux tools (such as sar, free, df, and the
proc filesystem) to record information about the system, and
'asterisk -rx "show channels"' to record information about the number of
active calls and channels within Asterisk.

Unfortunately, the sample sizes this produced are relatively small for the
300-450 call range. This is due to two factors:

1. The majority of the time we don't operate at such high call volumes.
2. Asterisk intermittently fails to report call and channel statistics when
the CPU idle is low.

This means that the benchmarking results are somewhat erratic for the
300-450 call range. The good news is that they are pretty consistent for 0
to 300 calls, and I'd imagine that covers the range most people are
interested in.

Keep in mind that the impetus behind this benchmarking was the lack of a
performance boost on the dual-core server at high call volumes, so the high
call range may also be skewed by whatever bottleneck is being hit on the
8-core servers. In the near future, we will be adding one of our 4-core
PowerEdge 6850s to our production environment. I'll collect and analyze the
same data, which I believe will show similar performance (as defined by
cumulative idle CPU percentage) at around 200-300 calls.

In the end, I hope to understand this problem well enough to overcome it or
determine what the optimal point is for achieving the highest call volume
without over-dimensioning the hardware.

Call Types and a Note on Channels
---------------------------------
All of the calls are SIP-to-SIP using the uLaw codec. The vast majority of
the calls are either in queue or connected to an agent, but there are also a
small number of regular outbound calls and transfers. Every call that is
connected to an agent is recorded via the Monitor() application in PCM
format to a RAM disk. In short, there was no transcoding, protocol bridging,
or TDM hardware involved on the servers being benchmarked.

At any given time, the makeup of the calls varied (i.e. calls in queue vs.
calls connected to agents). The calls connected to agents involve bridging
two SIP channels, so they are more resource intensive. This means that the
number of active channels is probably a better base benchmarking unit than
the number of active calls. Fortunately, the ratio of calls to channels is
somewhat consistent, so this round of benchmarking still produced useful
results.

Future Tests
------------
I'm aware that using a live environment isn't ideal for testing. I have some
ideas for setting up more controlled tests using SIPp, VICIDIAL, or call
files. I think I have the necessary hardware, but I haven't had the time to
do much research, let alone implement anything. I
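The collection script described in the methodology above pulls its call and
channel counts from 'asterisk -rx "show channels"'. Here is a sketch of just
the parsing half in Python; the exact shape of the summary lines is an
assumption based on the 1.x-era output, not something taken from the post:

```python
import re


def parse_show_channels(output: str) -> dict:
    """Pull the summary counters out of `asterisk -rx "show channels"`.

    That command ends with summary lines such as "84 active channels"
    and "42 active calls"; any counter we can't find is reported as -1,
    which also flags the intermittent reporting failures the post
    mentions at low CPU idle.
    """
    counts = {}
    for key in ("channels", "calls"):
        m = re.search(rf"(\d+) active {key}", output)
        counts[key] = int(m.group(1)) if m else -1
    return counts


# Example with output mocked up in the assumed shape:
sample = "Channel  Location  State ...\n84 active channels\n42 active calls\n"
print(parse_show_channels(sample))  # {'channels': 84, 'calls': 42}
```

Logging that dict alongside sar/free/df samples every 2 minutes, as the post
describes, gives the per-call-volume series the benchmarks are built from.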
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes
On 01/06/07, Matthew J. Roth <[EMAIL PROTECTED]> wrote:

Mon Apr 2 12:15:01 EDT 2007 Idle (sar -P ALL 60 14) (60 seconds, 14 slices)
Linux 2.6.12-1.1376_FC3smp (4core.imminc.com) 04/02/07

12:24:01  CPU  %user  %nice  %system  %iowait  %idle
12:25:02  all  14.97   0.03    34.25     0.92  49.82
12:25:02    0   8.83   0.05    33.60     1.28  56.24
12:25:02    1  17.50   0.02    34.60     0.57  47.32
12:25:02    2  19.94   0.02    33.52     1.31  45.22
12:25:02    3  13.62   0.02    35.29     0.52  50.55

Thu May 10 15:30:01 EDT 2007 Idle (sar -P ALL 60 14) (60 seconds, 14 slices)
Linux 2.6.12-1.1376_FC3smp (8core.imminc.com) 05/10/07

15:38:01  CPU  %user  %nice  %system  %iowait  %idle
15:39:01  all   2.47   0.01    48.29     0.00  49.23
15:39:01    0   2.92   0.00    53.17     0.00  43.91
15:39:01    1   2.98   0.00    48.68     0.02  48.33
15:39:01    2   2.47   0.02    48.61     0.00  48.91
15:39:01    3   2.27   0.00    48.35     0.00  49.38
15:39:01    4   2.38   0.02    47.38     0.00  50.22
15:39:01    5   2.37   0.02    46.94     0.00  50.67
15:39:01    6   2.23   0.02    46.63     0.00  51.12
15:39:01    7   2.17   0.02    46.54     0.00  51.27

Have you got, or could you install, oprofile? That will give you a LOT of information as to where your CPUs are spending their time. One guess is that you could be hitting contention in the kernel, with all the cores contending for some scarce resource. So your cores can't execute because they are waiting on some kernel mutex for access to some resource. That would account for the increase in system time - oprofile would show where in the kernel they are spending time (where those 50%-ishes are going).

Steve Uhler at Sun has been studying this on his big multi-core Sparc boxes, so he can probably contribute some insight. Hope you don't mind a cc, Steve. We're talking about Asterisk/Linux running out of scaling on an 8-core box.

Steve
___
--Bandwidth and Colocation provided by Easynews.com --
asterisk-users mailing list
To UNSUBSCRIBE or update options visit: http://lists.digium.com/mailman/listinfo/asterisk-users
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes
John Hughes wrote:

Matthew J. Roth wrote:
> As far as Asterisk is concerned, at low call volumes the dual-core
> server outperforms the single-core server at a similar rate.

Outperforms in what sense?

At low call volumes the cumulative CPU utilization, expressed as a percentage of available processor, is lower on the dual-core server. This is the expected behavior. What I'm proposing (and hope to back up with numbers in the near future) is that as the number of calls rises to the 300-400 range, the cumulative CPU utilization starts to approach the same number on both servers.

Unfortunately, I wasn't collecting as much data when the single-core server was in production, so some of this is speculation based on my memory of the system's performance. The environment is also different, because we have added agents, so the ratio of calls connected vs. calls in queue has changed. Nonetheless, the dual-core server is not performing anywhere near our expectations.

Here is something we recently noticed that may explain why the dual-core server is under-performing at high call volumes. The following numbers were collected off both servers while they were in production. Note that while they have similar cumulative idle values, the ratio of system time to user time on the single-core server is roughly 2.3 to 1, but on the dual-core server it is roughly 19.6 to 1. I'm not quite sure what to make of this, but it seems to be very relevant to the problem.
Mon Apr 2 12:15:01 EDT 2007 Idle (sar -P ALL 60 14) (60 seconds, 14 slices)
Linux 2.6.12-1.1376_FC3smp (4core.imminc.com) 04/02/07

12:24:01  CPU  %user  %nice  %system  %iowait  %idle
12:25:02  all  14.97   0.03    34.25     0.92  49.82
12:25:02    0   8.83   0.05    33.60     1.28  56.24
12:25:02    1  17.50   0.02    34.60     0.57  47.32
12:25:02    2  19.94   0.02    33.52     1.31  45.22
12:25:02    3  13.62   0.02    35.29     0.52  50.55

Thu May 10 15:30:01 EDT 2007 Idle (sar -P ALL 60 14) (60 seconds, 14 slices)
Linux 2.6.12-1.1376_FC3smp (8core.imminc.com) 05/10/07

15:38:01  CPU  %user  %nice  %system  %iowait  %idle
15:39:01  all   2.47   0.01    48.29     0.00  49.23
15:39:01    0   2.92   0.00    53.17     0.00  43.91
15:39:01    1   2.98   0.00    48.68     0.02  48.33
15:39:01    2   2.47   0.02    48.61     0.00  48.91
15:39:01    3   2.27   0.00    48.35     0.00  49.38
15:39:01    4   2.38   0.02    47.38     0.00  50.22
15:39:01    5   2.37   0.02    46.94     0.00  50.67
15:39:01    6   2.23   0.02    46.63     0.00  51.12
15:39:01    7   2.17   0.02    46.54     0.00  51.27

I'm working on a follow-up post that will demonstrate this with some benchmarks for a small number of calls in various scenarios on each machine.

> However, to our surprise as the number of concurrent calls increases,
> the performance gains begin to flatten out. In fact, it seems that
> somewhere between 200 and 300 calls, the two servers start to exhibit
> similar idle times despite one of them having twice as many cores.

What do you mean by "idle" here?

Idle percentage as shown in top's or sar's cumulative view.

Matthew Roth
InterMedia Marketing Solutions
Software Engineer and Systems Developer
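The 2.3-to-1 and 19.6-to-1 system-to-user ratios quoted above can be reproduced directly from the "all" rows of the two sar samples:

```shell
# System-to-user CPU time ratio from the sar "all" rows above.
# 4-core box: %user 14.97, %system 34.25
# 8-core box: %user  2.47, %system 48.29
awk 'BEGIN {
    printf "4-core sys/user: %.1f to 1\n", 34.25 / 14.97
    printf "8-core sys/user: %.1f to 1\n", 48.29 / 2.47
}'
```

This prints roughly 2.3 and 19.6, matching the figures in the post.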
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes
On 01/06/07, Douglas Garstang <[EMAIL PROTECTED]> wrote: I previously worked for a company that did some heavy load testing with Asterisk on multiple core Sun systems. We saw that no matter how many cores you threw at Asterisk, it always used ONE core to process calls, even at very high loads. This is definitely not true in the general case. But using IAX2 prior to 1.4 does have a limit like that because all network traffic is handled in a single thread. Take a core dump of a working Asterisk box and count all the threads. There's no general lack of multi-threadedness, that's for sure. Steve ___ --Bandwidth and Colocation provided by Easynews.com -- asterisk-users mailing list To UNSUBSCRIBE or update options visit: http://lists.digium.com/mailman/listinfo/asterisk-users
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes
Hi Matthew,

Your environment sounds quite challenging and I'd be interested in the analysis of what is limiting the throughput. I agree that there's no easy way to distribute a single queue across multiple boxes. But here is a scaling idea for you. We've used it successfully to handle a large inbound call centre. It also provides resilience:

1) Incoming PRIs connect to multiple boxes that we'll call the "voice" gateways. Each box can have a proportion of your PRIs connected. Depending on the box power, up to 8 or so.

2) Agent registrations are spread across these same boxes.

3) Lastly, you define two or more additional boxes as your queue servers. Every queue server has defined on it all the queues you need, but for each queue one server is regarded as the primary and the other as the secondary. You mix things up so in the normal event about half your queueing calls are on each server (extend the idea for more than 2 queue servers).

Incoming calls on the voice gateways are sent to the queue server over IAX:

exten => 1234,1,Dial(IAX2/primary1234/${EXTEN})
exten => 1234,n,Dial(IAX2/sec1234/${EXTEN}) ; if we can't get to the primary

Now when an agent wants to log in, you have their agent gateway log in to both of the queue servers on their behalf, using an IAX2/.. channel to get back to the agent's voice gateway. So on the queue server we have the agents for the queue logged in as, say, IAX2/voicegw1/6001, IAX2/voicegw2/6002, etc.

The trick is to use transfer=yes (aka notransfer=no) on the various boxes. So as soon as the call gets connected to an agent, it disappears off the queue box completely. The net result is that the queue servers only have to handle customers who are still in the queue. As soon as they get connected to an agent, the call goes directly from the arriving voice gateway to the agent's voice gateway and on to the agent. In a proportion of cases that even turns out to be the same box.

You can scale up the number of voice gateways as required and handle 1000s of calls "connected" to agents without needing supercomputers. You still handle all the people queueing on a particular queue on the same queueing server, so you can tell them where they are in the queue and all that. But you can split up your queues across multiple boxes to help divide and conquer the load.

If you can reach the agent phone directly using IAX (use an IAX softphone or something) you can make a little optimisation and log IAX2/agentipaddress into the queue directly. Then the call gets optimised to go directly from the incoming voice gateway to the agent's PC.

Resilience? If a queue server is down, new callers will automatically start to queue on the backup box for the queues affected. The agents are known on both primary and backup queue boxes, so things keep going. If a voice gateway goes down, you lose just some of your PRIs, so you are still in business. If you need the capacity, use an ISDNguard to kick the PRIs onto one of the other voice gateways. Agents that were on the voice gateway that went down will need to reregister to a box still running. IP address takeover can make that happen.

For me this sort of design is much better than one giant box.

Regards,

Steve Davies
Technical Director
Connection Telecom (Pty) Ltd
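A sketch of the primary/secondary failover pattern described above. The peer names and host addresses here are illustrative, not from the original post; transfer=yes is the option Steve mentions for letting a bridged call leave the queue box:

```
; extensions.conf (sketch) -- try the primary queue server, then fall back
exten => 1234,1,Dial(IAX2/queue-primary/${EXTEN})
exten => 1234,n,Dial(IAX2/queue-secondary/${EXTEN})

; iax.conf (sketch) -- transfer=yes allows native IAX2 transfers, so the
; bridged call is re-routed away from the queue server once answered
[queue-primary]
type=peer
host=192.0.2.10
transfer=yes

[queue-secondary]
type=peer
host=192.0.2.11
transfer=yes
```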
RE: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes
I previously worked for a company that did some heavy load testing with Asterisk on multiple core Sun systems. We saw that no matter how many cores you threw at Asterisk, it always used ONE core to process calls, even at very high loads.

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Matthew J. Roth
Sent: Friday, June 01, 2007 9:07 AM
To: Asterisk Users Mailing List - Non-Commercial Discussion
Subject: Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes

John Hughes wrote:
> OpenSSI can't (at the moment) migrate threads between compute nodes. It
> can migrate separate processes, but doesn't Asterisk use threads?

John,

Asterisk uses 1 thread per call, plus about 10 to 15 background threads that persist throughout the life of the process. I'm curious if the 1 thread per call model is efficient as the number of calls increases. It's possible that in the 100+ call range there is a significant overhead to managing all of those threads without much gain, since most servers have 1 to 8 processors to actually schedule them on. Acquiring locks on shared resources between the threads could be pretty nasty at that point, too.

I wonder if pooling the calls in X threads, where X is a value determined at compile time by looking at the number of processors available, would be more efficient? This is probably just an academic question, because I'd imagine it would require an overhaul of the codebase to accomplish.

Matthew Roth
InterMedia Marketing Solutions
Software Engineer and Systems Developer
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes - Low volume benchmarks
John Hughes wrote: For me all these numbers look too small to be useful for benchmarking. John, They are small, and they are probably more useful as baseline numbers. I'm working on writing up some data I've collected off of our production switch. The call range is 0-450 at 10 call increments. Unfortunately, it's a live environment so it's less than ideal for benchmarking. The makeup of the calls varies, I'm relying on historical data (ie. I can't reproduce the scenarios), and my sample sizes are much bigger for 0-300 calls than they are for 300-450. Nonetheless, there is some knowledge to be gained by studying the numbers and I'm sure that 300 calls constitutes large scale for most people. In the future, I'd like to recreate these numbers using something like SIPp to give me more control. Until then, I'm working with what I have. Thank you for your replies, Matthew Roth InterMedia Marketing Solutions Software Engineer and Systems Developer ___ --Bandwidth and Colocation provided by Easynews.com -- asterisk-users mailing list To UNSUBSCRIBE or update options visit: http://lists.digium.com/mailman/listinfo/asterisk-users
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes
John Hughes wrote:
> OpenSSI can't (at the moment) migrate threads between compute nodes. It
> can migrate separate processes, but doesn't Asterisk use threads?

John,

Asterisk uses 1 thread per call, plus about 10 to 15 background threads that persist throughout the life of the process. I'm curious if the 1 thread per call model is efficient as the number of calls increases. It's possible that in the 100+ call range there is a significant overhead to managing all of those threads without much gain, since most servers have 1 to 8 processors to actually schedule them on. Acquiring locks on shared resources between the threads could be pretty nasty at that point, too.

I wonder if pooling the calls in X threads, where X is a value determined at compile time by looking at the number of processors available, would be more efficient? This is probably just an academic question, because I'd imagine it would require an overhaul of the codebase to accomplish.

Matthew Roth
InterMedia Marketing Solutions
Software Engineer and Systems Developer
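The thread-per-call behavior described above can be spot-checked on a live box. This is a sketch: it counts threads via /proc/<pid>/task (equivalent to what 'ps -eLf' shows), and falls back to the current shell's PID when no Asterisk process is running:

```shell
# Count the threads of a process via /proc/<pid>/task, as 'ps -eLf' would.
PID=$(pidof asterisk 2>/dev/null | awk '{print $1}')
PID=${PID:-$$}   # fall back to this shell so the sketch runs anywhere
THREADS=$(ls "/proc/$PID/task" | wc -l)
echo "pid $PID has $THREADS thread(s)"
```

On an idle Asterisk this should report the 10-15 persistent threads Matthew mentions, rising by one per active call.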
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes - Low volume benchmarks
Matthew J. Roth wrote: > This post contains the benchmarks for Asterisk at low call volumes on > similar single and dual-core servers. I'd appreciate it greatly if > you took the time to read and comment on it. For me all these numbers look too small to be useful for benchmarking. ___ --Bandwidth and Colocation provided by Easynews.com -- asterisk-users mailing list To UNSUBSCRIBE or update options visit: http://lists.digium.com/mailman/listinfo/asterisk-users
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes
Matthew J. Roth wrote: > Recently, we were pushing our server to almost full CPU utilization. > Since we've observed that Asterisk is CPU bound, we upgraded our > server from a PowerEdge 6850 with four single-core Intel Xeon CPUs > running at 3.16GHz, to a PowerEdge 6850 with 4 dual-core Intel Xeon > CPUs running at 3.00GHz. The software installed is identical and a > kernel build benchmark yielded promising results. The new dual-core > server ran roughly 80% faster, which is about what we expected. > > As far as Asterisk is concerned, at low call volumes the dual-core > server outperforms the single-core server at a similar rate. Outperforms in what sense? > I'm working on a follow-up post that will demonstrate this with some > benchmarks for a small number of calls in various scenarios on each > machine. However, to our surprise as the number of concurrent calls > increases, the performance gains begin to flatten out. In fact, it > seems that somewhere between 200 and 300 calls, the two servers start > to exhibit similar idle times despite one of them having twice as many > cores. What do you mean by "idle" here? ___ --Bandwidth and Colocation provided by Easynews.com -- asterisk-users mailing list To UNSUBSCRIBE or update options visit: http://lists.digium.com/mailman/listinfo/asterisk-users
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes
Sean M. Pappalardo wrote: > Hi there. > > Just curious if you've checked out Linux clustering software such as > OpenSSI ( http://www.openssi.org/ ) and run Asterisk on it? It > features a multi-threaded cluster-aware shell (and custom kernel) that > will automatically cluster-ize any regular Linux executable (such as > the main Asterisk process.) If it works as advertised, it should just > be a matter of adding boxes to the cluster to speed up processing. OpenSSI can't (at the moment) migrate threads between compute nodes. It can migrate separate processes, but doesn't Asterisk use threads? ___ --Bandwidth and Colocation provided by Easynews.com -- asterisk-users mailing list To UNSUBSCRIBE or update options visit: http://lists.digium.com/mailman/listinfo/asterisk-users
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes
FYI, http://www.voip-info.org/wiki/index.php?page=Asterisk+FAQ

*Can I install Asterisk on a Beowulf cluster?* A cluster can't migrate threads that use shared memory, and Asterisk uses that kind of thread. So no, Asterisk wouldn't work on a cluster. *(It might be helpful to know whether anyone has a working load-balanced Asterisk configuration where multiple systems can share the load of an Asterisk environment (IAX2, not SIP) and whether this environment would fail over nicely in the event of downtime!)*

On 5/25/07, Sean M. Pappalardo <[EMAIL PROTECTED]> wrote:
> Hi there. Just curious if you've checked out Linux clustering software
> such as OpenSSI ( http://www.openssi.org/ ) and run Asterisk on it? It
> features a multi-threaded cluster-aware shell (and custom kernel) that
> will automatically cluster-ize any regular Linux executable (such as
> the main Asterisk process.) If it works as advertised, it should just
> be a matter of adding boxes to the cluster to speed up processing. As
> for Asterisk itself, is it multi-threaded enough to take advantage of
> 4+ way systems?
>
> Sean Pappalardo
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes
Mark Coccimiglio wrote:
> Sounds like you are running into the hardware limitations of your
> system's PCI or "Front Side Bus" (FSB) and not necessarily an issue
> with Asterisk.

Mark,

That is a great theory and I'd like to follow up on it. Do you know if the PCI or FSB buses are instrumented by Linux? If not, are you aware of any way to gather statistics about their utilization? I'd like to see if the numbers support your idea and, if so, which bus is saturated.

Let me add a little bit of extra information to this discussion. The CPU utilization does not flatten out at 50%. In fact, as more calls are added, Asterisk will eventually drive the idle percentage down to single digits with surprisingly few problems. If PCI or FSB bandwidth were the limiting factor, wouldn't the CPU utilization top out at the point that the available bandwidth was used?

Thank you,

Matthew Roth
InterMedia Marketing Solutions
Software Engineer and Systems Developer
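A rough back-of-the-envelope check also casts doubt on the PCI theory. The figures here are my assumptions, not measurements from the thread: a 32-bit, 33 MHz PCI bus peaks at about 133 MB/s, and a G.711/uLaw RTP stream is roughly 87 kbit/s per direction at 20 ms packetization:

```shell
# Rough sanity check of the PCI-bandwidth theory (assumed figures):
# compare PCI peak throughput with the RTP load of 400 bridged calls.
awk 'BEGIN {
    pci_MBps   = 33e6 * 4 / 1e6    # 32-bit bus at 33 MHz, in MB/s
    per_stream = 87.2e3            # bits/s: G.711 payload + RTP/UDP/IP overhead
    streams    = 400 * 2 * 2       # 400 calls, 2 legs each, 2 directions
    rtp_MBps   = per_stream * streams / 8 / 1e6
    printf "PCI peak: %.0f MB/s, RTP load: %.1f MB/s\n", pci_MBps, rtp_MBps
}'
```

Even at peak volume the RTP traffic is an order of magnitude below the theoretical PCI ceiling, which is consistent with Matthew's skepticism (though it says nothing about FSB or lock contention).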
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes - Low volume benchmarks
William Moore wrote:
> Are you recording memory figures as well and have you checked the total
> used memory? Or did I miss it somewhere? Thanks for doing this,
> scalability testing is always good.

William,

This round of benchmarking is heavily focused on CPU utilization, because it is causing an immediate problem for me. However, I am tracking some other statistics on a daily basis, including memory utilization, swap utilization, load averages, and active channels and calls. One of my colleagues takes the text file I produce and creates graphs using Cacti and rrdtool. You'll be interested in these two (sorry for the format of the URLs, but otherwise the list was eating my posts):

- Percent CPU Used With No. Calls and No. Channels
- Asterisk Memory Used (KB)

Note that even with a peak call volume of approximately 400 active calls and 550 active SIP channels, the memory utilization never surpasses 600 KB. I'd estimate that most Asterisk installations would avoid swapping with 1 GB of RAM. A 2nd GB might be useful to provide plenty of room for file caching so that your hard disk doesn't become a bottleneck. We also record all of our calls to a 6 GB RAM disk, so our server has a total of 8 GB of RAM, but that isn't necessary in most circumstances.

Overall, Asterisk seems to be very efficiently coded as far as memory is concerned. Note that for other reasons we perform a nightly reboot, so I don't know if there are any memory leaks that would surface over time.

Thank you,

Matthew Roth
InterMedia Marketing Solutions
Software Engineer and Systems Developer
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes - Low volume benchmarks - Correction
Luki wrote:
> Perhaps a naive question, but how does 0.137% CPU utilization per call
> equal 1735 MHz per call? If 1735 MHz / 0.137% = 1735 MHz / 0.00137 =>
> 1266423 MHz at 100% utilization ??! Even with 4 CPUs, those would be
> 316 GHz CPUs. I think you meant:
> Average CPU utilization per call: 0.137% (~17 MHz)

Luki,

You are absolutely right. Thank you for pointing out and correcting my mistake. The corrected statistics are below. Note that the MHz per call statistic is calculated with the following formula:

MHzPerCall = (numCPUs * CPUspeed) * (avgCPUperCall * .01)

Thank you,

Matthew Roth
InterMedia Marketing Solutions
Software Engineer and Systems Developer

The Numbers (Corrected)
---

DC - Incoming SIP to the Playback() application
===
calls  %user  %system  %iowait  %idle
    0   0.00     0.01     0.01  99.98
    1   0.02     0.04     0.00  99.94
    2   0.02     0.06     0.00  99.92
    3   0.03     0.11     0.00  99.86
    4   0.04     0.13     0.00  99.83
    5   0.05     0.16     0.00  99.80
    6   0.05     0.20     0.00  99.75
    7   0.07     0.24     0.00  99.70
    8   0.07     0.25     0.00  99.67
    9   0.08     0.27     0.00  99.65
   10   0.09     0.33     0.00  99.58

Average CPU utilization per call: 0.040% (~9.60 MHz)

SC - Incoming SIP to the Playback() application
===
calls  %user  %system  %iowait  %idle
    0   0.01     0.02     0.00  99.98
    1   0.02     0.10     0.00  99.88
    2   0.03     0.17     0.00  99.80
    3   0.06     0.21     0.00  99.73
    4   0.08     0.28     0.00  99.63
    5   0.10     0.34     0.01  99.55
    6   0.11     0.48     0.00  99.41
    7   0.14     0.49     0.00  99.37
    8   0.16     0.57     0.00  99.28
    9   0.17     0.63     0.01  99.19
   10   0.18     0.75     0.00  99.07

Average CPU utilization per call: 0.091% (~11.52 MHz)

DC - Incoming SIP to the Queue() application - In queue
===
calls  %user  %system  %iowait  %idle
    0   0.00     0.01     0.00  99.99
    1   0.01     0.03     0.00  99.96
    2   0.01     0.05     0.00  99.94
    3   0.01     0.08     0.00  99.91
    4   0.02     0.10     0.00  99.88
    5   0.03     0.12     0.00  99.84
    6   0.04     0.16     0.00  99.80
    7   0.03     0.17     0.00  99.80
    8   0.04     0.20     0.00  99.76
    9   0.03     0.22     0.00  99.75
   10   0.05     0.27     0.00  99.68

Average CPU utilization per call: 0.031% (~7.44 MHz)

SC - Incoming SIP to the Queue() application - In queue
===
calls  %user  %system  %iowait  %idle
    0   0.02     0.02     0.00  99.96
    1   0.03     0.07     0.00  99.91
    2   0.03     0.13     0.00  99.83
    3   0.04     0.18     0.00  99.78
    4   0.05     0.23     0.00  99.72
    5   0.06     0.27     0.00  99.67
    6   0.07     0.33     0.00  99.60
    7   0.09     0.38     0.00  99.53
    8   0.09     0.40     0.00  99.51
    9   0.11     0.46     0.01  99.43
   10   0.11     0.48     0.00  99.41

Average CPU utilization per call: 0.055% (~6.97 MHz)

DC - Incoming SIP to the Queue() application - Bridged to an agent
==
calls  %user  %system  %iowait  %idle
    0   0.00     0.01     0.00  99.99
    1   0.01     0.06     0.00  99.93
    2   0.02     0.14     0.00  99.84
    3   0.03     0.16     0.00  99.81

Average CPU utilization per call: 0.060% (~14.40 MHz)

SC - Incoming SIP to the Queue() application - Bridged to an agent
==
calls  %user  %system  %iowait  %idle
    0   0.01     0.02     0.00  99.98
    1   0.02     0.16     0.00  99.82
    2   0.04     0.28     0.00  99.68
    3   0.07     0.36     0.00  99.57

Average CPU utilization per call: 0.137% (~17.35 MHz)
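The corrected formula can be checked against the DC figures (8 CPUs at 3000 MHz):

```shell
# MHzPerCall = (numCPUs * CPUspeed) * (avgCPUperCall * .01)
# Applied to the corrected DC (8 x 3000 MHz) per-call averages above.
awk 'BEGIN {
    mhz = 8 * 3000
    printf "Playback: %.2f MHz per call\n", mhz * 0.040 * 0.01
    printf "In queue: %.2f MHz per call\n", mhz * 0.031 * 0.01
    printf "Bridged:  %.2f MHz per call\n", mhz * 0.060 * 0.01
}'
```

This reproduces the ~9.60, ~7.44, and ~14.40 MHz figures quoted in the corrected tables.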
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes - Low volume benchmarks
Average CPU utilization per call: 0.137% (~1735 MHz) Perhaps a naive question, but how does 0.137% CPU utilization per call equal 1735 MHz per call? If 1735 MHz / 0.137% = 1735 MHz / 0.00137 => 1266423 MHz at 100% utilization ??! Even with 4 CPUs, those would be 316 GHz CPUs. I think you meant: Average CPU utilization per call: 0.137% (~17 MHz) --Luki ___ --Bandwidth and Colocation provided by Easynews.com -- asterisk-users mailing list To UNSUBSCRIBE or update options visit: http://lists.digium.com/mailman/listinfo/asterisk-users
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes - Low volume benchmarks
On Saturday 26 May 2007 1:21 am, Edgar Guadamuz wrote: > Very good... by the way, I'm studing electrical engineering and I've > chosen asterisk scalation as my final graduation project. I hope do a > similar work within and asterisk cluster. I've been working as an EE, and I've got to ask... what does software scalability have to do with electrical engineering? If you were in a CS prog I could see it, but I've been doing electronic design and power electronics work for over 10 years and I can't think of where these two intersect. -A. ___ --Bandwidth and Colocation provided by Easynews.com -- asterisk-users mailing list To UNSUBSCRIBE or update options visit: http://lists.digium.com/mailman/listinfo/asterisk-users
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes
Matthew J. Roth wrote:
> In fact, it seems that somewhere between 200 and 300 calls, the two
> servers start to exhibit similar idle times despite one of them having
> twice as many cores.

Sounds like you are running into the hardware limitations of your system's PCI or "Front Side Bus" (FSB) and not necessarily an issue with Asterisk. In short, there is a limited amount of bandwidth on the computer's PCI bus (33 MHz) and the FSB (100-800 MHz). One thing to remember is that ALL cores and data streams need to share the PCI bus and the FSB. Asterisk is very processor and memory intensive. At the extreme level of usage, more cores won't help if data is "stuck in the pipe". So the performance plateau you described would be expected.

Mark C.
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes - Low volume benchmarks
Very good... by the way, I'm studying electrical engineering and I've chosen Asterisk scaling as my final graduation project. I hope to do similar work with an Asterisk cluster.

On 5/25/07, William Moore <[EMAIL PROTECTED]> wrote:
On 5/25/07, Matthew J. Roth <[EMAIL PROTECTED]> wrote:
> List users,
>
> This post contains the benchmarks for Asterisk at low call volumes on
> similar single and dual-core servers. I'd appreciate it greatly if you
> took the time to read and comment on it.

Are you recording memory figures as well and have you checked the total used memory? Or did I miss it somewhere? Thanks for doing this, scalability testing is always good.
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes - Low volume benchmarks
On 5/25/07, Matthew J. Roth <[EMAIL PROTECTED]> wrote: List users, This post contains the benchmarks for Asterisk at low call volumes on similar single and dual-core servers. I'd appreciate it greatly if you took the time to read and comment on it. Are you recording memory figures as well and have you checked the total used memory? Or did I miss it somewhere? Thanks for doing this, scalability testing is always good. ___ --Bandwidth and Colocation provided by Easynews.com -- asterisk-users mailing list To UNSUBSCRIBE or update options visit: http://lists.digium.com/mailman/listinfo/asterisk-users
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes
Sean M. Pappalardo wrote: Just curious if you've checked out Linux clustering software such as OpenSSI ( http://www.openssi.org/ ) and run Asterisk on it? It features a multi-threaded cluster-aware shell (and custom kernel) that will automatically cluster-ize any regular Linux executable (such as the main Asterisk process.) If it works as advertised, it should just be a matter of adding boxes to the cluster to speed up processing. As for Asterisk itself, is it multi-threaded enough to take advantage of 4+ way systems? Sean, Thanks for your response. I'm going to take a look into OpenSSI. It'd be amazing if it ran Asterisk without any side effects. I've addressed the number of threads that Asterisk uses in my first follow-up post. In short, the answer is yes because it uses a 1 thread per call model. Thank you, Matthew Roth InterMedia Marketing Solutions Software Engineer and Systems Developer ___ --Bandwidth and Colocation provided by Easynews.com -- asterisk-users mailing list To UNSUBSCRIBE or update options visit: http://lists.digium.com/mailman/listinfo/asterisk-users
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes - Low volume benchmarks
List users, This post contains the benchmarks for Asterisk at low call volumes on similar single and dual-core servers. I'd appreciate it greatly if you took the time to read and comment on it. Thank you, Matthew Roth InterMedia Marketing Solutions Software Engineer and Systems Developer Conclusions --- I'm presenting the conclusions first, because they are the most important part of the benchmarking. If you like details and numbers, scroll down. I've drawn three conclusions from this set of benchmarks. 1. At low call volumes, the dual-core server outperforms the single-core server by the expected margin. 2. Calls bridged to an agent are more CPU intensive than calls listening to audio via the Playback() application or calls in queue. This is expected, because they involve more SIP channels and more work is done on the RTP frames (bridging, recording, etc.). 3. For all call types, the majority of the CPU time is spent in the kernel (servicing system calls, etc.). I've observed this to be true at all call volumes on our production server, with the ratio sometimes in the range of 20 to 1. This may suggest that the popular perception that Asterisk doesn't scale well because of its extensive use of linked lists doesn't tell the whole story. So far there are no surprises, but over the next week or so I'll be collecting data that I expect to reveal that at high call volumes (200-300 concurrent calls) the idle percentage on both machines starts to approach the same value. In the end, my goal is to break through (or, at the least, understand) this scaling issue, so I welcome all forms of critique. It's quite possible that the problem lies in my setup or that I'm missing something obvious, but I suspect it is deeper than that. Benchmarking Methodology I collected each type of data as follows. 
- Active channel and call counts: 'asterisk -rx "show channels"' and 'asterisk -rx "sip show channels"'
- Thread counts: 'ps -eLf' and 'ps axms'
- Idle time values: 'sar 30 1'
- Average CPU utilization per call: (startIdle - endIdle) / numCalls

The servers were rebooted between tests.

Call Types
--
I tested the following three call types.

- Incoming SIP to the Playback() application
  - 1 active SIP channel per call
  - From the originating Asterisk server to the Playback() application
- Incoming SIP to the Queue() application - In queue
  - 1 active SIP channel per call
  - From the originating Asterisk server to the Queue() application
- Incoming SIP to the Queue() application - Bridged to an agent
  - 2 active SIP channels per call
  - From the originating Asterisk server to the Queue() application
  - Bridged from the Queue() application to the agent

All calls were pure VOIP (SIP/RTP) and originated from another Asterisk server. Calls that were bridged to agents terminated at SIP hardphones (Snom 320s) and were recorded to a RAM disk via the Monitor() application. All calls were in the uLaw codec and all audio files (including the call recordings, the native MOH, and the periodic queue announcements, which played approximately every 60 seconds) were in the PCM file format. There was no transcoding, protocol bridging, or TDM hardware involved on the servers being benchmarked.

A Note on Asterisk and Threads
--
On both systems, a freshly started Asterisk process consisted of 10 threads. Some events, such as performing an 'asterisk -rx reload', triggered the creation of a new persistent thread. The benchmarking revealed that, in general, the Asterisk process will consist of 10-15 persistent background threads plus exactly 1 additional thread per active call. This means that at even modest call volumes, Asterisk will utilize all of the CPUs in most modern PC-based servers.

Server Profiles
---
The servers I performed the benchmarking on are described below.
Note that the CPUs support hyperthreading, but it is disabled. This is reflected in the CPU count, which is the number of physical processors available to the OS.

Short Name: DC
Manufacturer: Dell Computer Corporation
Product Name: PowerEdge 6850
Processors: Four Dual-Core Intel Xeon MP CPUs at 3.00GHz
CPU Count: 8
FSB Speed: 800 MHz
OS: Fedora Core 3 - 2.6.13-ztdummy SMP x86_64 Kernel
Asterisk Ver: ABE-B.1-3

Short Name: SC
Manufacturer: Dell Computer Corporation
Product Name: PowerEdge 6850
Processors: Four Single-Core Intel Xeon MP CPUs at 3.16GHz
CPU Count: 4
FSB Speed: 667 MHz
OS: Fedora Core 3 - 2.6.13-ztdummy SMP x86_64 Kernel
Asterisk Ver: ABE-B.1-3

The kernel is a vanilla 2.6.13 kernel with enhanced realtime clock support and a timer frequency of 1000 HZ (earning it the EXTRAVERSION of '-ztdummy'). I am aware that the 2.6.17 kernel introduced multi-core scheduler support, but it exhibited negligible gains in the kernel build benchmark. Nonetheless, I am open to
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes
Hi there.

Just curious if you've checked out Linux clustering software such as OpenSSI ( http://www.openssi.org/ ) and run Asterisk on it? It features a multi-threaded cluster-aware shell (and custom kernel) that will automatically cluster-ize any regular Linux executable (such as the main Asterisk process.) If it works as advertised, it should just be a matter of adding boxes to the cluster to speed up processing.

As for Asterisk itself, is it multi-threaded enough to take advantage of 4+ way systems?

Sean Pappalardo
[asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes
List users,

Using Asterisk in an inbound call center environment has led us to push the limits of vertical scaling. In order to treat each caller fairly and to utilize our agents as efficiently as possible, it is desirable to configure each client as a single queue. As far as I know, Asterisk's queues cannot be distributed across servers, so the size of the largest queue we service is our vertical scaling goal. In our case, that queue must be able to hold in excess of 300 calls regardless of their makeup (i.e., number of calls in queue vs. number of calls connected to an agent). In reality, we are servicing more than one client on our server, so on busy days the total number of calls we're handling is greater than 300.

Recently, we were pushing our server to almost full CPU utilization. Since we've observed that Asterisk is CPU bound, we upgraded our server from a PowerEdge 6850 with four single-core Intel Xeon CPUs running at 3.16GHz to a PowerEdge 6850 with four dual-core Intel Xeon CPUs running at 3.00GHz. The software installed is identical, and a kernel build benchmark yielded promising results: the new dual-core server ran roughly 80% faster, which is about what we expected.

As far as Asterisk is concerned, at low call volumes the dual-core server outperforms the single-core server at a similar rate. I'm working on a follow-up post that will demonstrate this with some benchmarks for a small number of calls in various scenarios on each machine. However, to our surprise, as the number of concurrent calls increases, the performance gains begin to flatten out. In fact, it seems that somewhere between 200 and 300 calls, the two servers start to exhibit similar idle times despite one of them having twice as many cores. Once I collect the data, I will add a second follow-up post with a performance curve tracking the full range of call volumes we experience.
Unfortunately, from day to day there are some variables that I'm sure affect performance, such as the number of agents logged in and the makeup of the calls. I'll do my best to choose a sample size that smooths out these bumps.

In the meantime, I'm looking for insights as to what would cause Asterisk (or any other process) to idle at the same value despite having similar workloads and twice as many CPUs available to it. I'll be working on benchmarking Asterisk from very low to very high call volumes, so any suggestions or tips, such as how to generate a large number of calls or what statistics I should gather, would also be appreciated.

Thank you,
Matthew Roth
InterMedia Marketing Solutions
Software Engineer and Systems Developer
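On generating a large number of calls: one low-effort option is Asterisk call files, which make the spooler originate one call per file dropped into the outgoing directory. A rough sketch, assuming a SIP peer named "loadtest-peer" and a "benchmark" dialplan context (both hypothetical names, not from this thread); files are staged and renamed into place so the spooler never picks up a half-written file:

```python
import os
import tempfile

def spool_test_calls(n, spool_dir="/var/spool/asterisk/outgoing",
                     channel_tpl="SIP/loadtest-peer/{i:04d}",
                     context="benchmark", exten="s"):
    """Write n call files; Asterisk's pbx_spool originates one call per
    file it finds in spool_dir. Each file is written to a staging
    subdirectory first, then renamed into the spool, so the spooler
    never reads a partially written file."""
    staging = os.path.join(spool_dir, ".staging")
    os.makedirs(staging, exist_ok=True)
    created = []
    for i in range(n):
        body = (f"Channel: {channel_tpl.format(i=i)}\n"
                f"Context: {context}\n"
                f"Extension: {exten}\n"
                "Priority: 1\n"
                "MaxRetries: 0\n")
        fd, tmp = tempfile.mkstemp(dir=staging)
        with os.fdopen(fd, "w") as f:
            f.write(body)
        dest = os.path.join(spool_dir, f"call-{i:04d}.call")
        os.rename(tmp, dest)  # atomic on the same filesystem
        created.append(dest)
    return created
```

Pacing the renames (rather than dropping all n files at once) also lets you ramp the call volume up gradually instead of spiking it.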
Re: [asterisk-users] Scaling/Loadbalancing a Call Center and Redundancy
Much higher, maybe double, but that is when the agents start to complain that their conversations start "cutting in and out". This is the main reason I am looking into building a re-invite solution that moves the call into the "recording/cdr" server's media path. Then we just cap these servers at 50-70 calls and keep track of it in a DB. I think my developers could re-write Asterisk to handle this. It will also allow the main "Queue" server(s) to go down; so long as the "recording/cdr" server is still up, no ongoing calls would get lost.

Thanks,
Steve Totaro

Matt Florell wrote:
> To maintain high recording quality with no audio skips we have found
> that you should not go over 50 conversations being recorded on a single
> server. What have you found is your limit while maintaining very good
> audio quality?
>
> MATT---
Re: [asterisk-users] Scaling/Loadbalancing a Call Center and Redundancy
To maintain high recording quality with no audio skips, we have found that you should not go over 50 conversations being recorded on a single server. What have you found is your limit while maintaining very good audio quality?

MATT---

On 9/16/06, Steve Totaro <[EMAIL PROTECTED]> wrote:
> Right now we are all inbound and every call is recorded.
Re: [asterisk-users] Scaling/Loadbalancing a Call Center and Redundancy
Right now we are all inbound and every call is recorded.

Matt Florell wrote:
> Will you be recording all calls in this setup?
>
> MATT---
Re: [asterisk-users] Scaling/Loadbalancing a Call Center and Redundancy
Hello,

At this point in time VICIDIAL is more focused on outbound features, but inbound and blended capabilities have been part of VICIDIAL for about two years. The most I have done inbound-only with it is 3 T1s with 60 agents. But for outbound and inbound agents together we have had up to 120 agents on one setup with 20 T1s (spread across 8 Asterisk servers, 2 web servers and 1 MySQL server) handling over 200,000 calls a day (mostly outbound of course).

The inbound portion of VICIDIAL does not have customized hold music or periodic announcements yet, but we plan on adding those features in a future version as we begin to focus more on inbound in the project.

We do use meetme rooms in VICIDIAL. This allows for easy third-party calls and multiple monitoring/manager intrusion into an agent session.

The load balancing works by having the database keep track of agents and calls for all of the servers and then use IAX to move the calls from where they originated to whoever the next available agent is, no matter what server they are on. So the calls can come from anywhere and the agents can be logged in anywhere.

If you receive inbound callerID on these calls you will also be able to have caller information appear on the agent's web interface. And if they are a customer that is already in the system it will bring up their existing record.

As for reliability, we have not had a total system failure in the last 2 years (aside from long power outages and hurricane interruptions). MySQL can handle a tremendous volume and it is the only total-system single point of failure in VICIDIAL; ours never crashes. The web servers can be load balanced (no need for session-awareness) and you can use any Apache/PHP webserver that may be on your system to serve the AJAX-based agent web interface. As for Asterisk, we have had servers crash periodically (a couple crashes a month across 8 servers), but that is to be expected when you push tens of thousands of calls through each one per day.

Will you be recording all calls in this setup?

MATT---

On 9/16/06, Steve Totaro <[EMAIL PROTECTED]> wrote:
> Matt,
>
> I am sure this is a RTFM and I am pretty sure you are using meetme
> rooms. Just not too sure how you do the magic.
Re: [asterisk-users] Scaling/Loadbalancing a Call Center and Redundancy
Matt,

I am sure this is a RTFM and I am pretty sure you are using meetme rooms. Just not too sure how you do the magic.

28 T1s with NFAS so 95 channels per trunk group, seven trunk groups = 665 lines. My client's call volume has shot from 5,000 to about 10,000 calls a day. Due to recent product offerings/advertising, I expect to be eating up 6 T1s (peak) by the end of October. They will eventually have every channel in use during peaks; whether that is in November or December, I am not sure. I just know it can't break at that point due to the sheer expense of revenue lost for downtime.

Thanks,
Steve

Matt Florell wrote:
> How many lines and agents are you looking at?
>
> What kind of call volume?
>
> Average expected hold time?
>
> VICIDIAL could be an option for you since it does not use Asterisk
> Queues and can already easily scale across many servers.
>
> MATT---
Re: [asterisk-users] Scaling/Loadbalancing a Call Center and Redundancy
How many lines and agents are you looking at? What kind of call volume? Average expected hold time? VICIDIAL could be an option for you since it does not use Asterisk Queues and can already easily scale across many servers. MATT--- On 9/15/06, Steve Totaro <[EMAIL PROTECTED]> wrote: I have been tossing around some ideas about scaling a call center with load balancing and redundancy and would like the comunities input, thoughts, criticism and anything anyone wants to toss in. The most evident thing is to start with beefy servers and only run procs that are required. All of the TDM boxes run stripped down versions of Linux and Asterisk, they just take the call from the PRIs and convert them to SIP, everything stays ulaw end to end. *Shared queues across multiple servers would be ideal*. I don't think it is possible in asterisk, as is. Maybe DUNDI could be useful but I am not up to speed on it enough to really know. I was toying with a concept of a DB server tracking the number of calls to queue(s), number of agents logged into the queue(s). Some agents will be logged into multiple queues and providing the logic to a series of Asterisk servers. Calls could be made to the db to determine which queue/server to route the call to. In this situation, duplicate queues would exist on several servers, so balancing would work somewhat if the DB made the selection on which box to route the call to and which box an agent should log into. FastAGI and the manager interface will provide the routing and DB updates. Another thought was to have one central server with all of the queues and agents, then somehow the central server would cause a "recording/CDR server" to send re-invites to the two SIP endpoints so that the call/RTP stream is moved to another asterisk server which would record the call and keep the CDR info. Again, this would be done with a DB to decide which asterisk (recording/CDR) box has the lightest load. 
It would take the burden of maintaining the call off the "Queue" server. I/O is the first bottleneck in scaling when you record each and every call.

Would it be difficult to have Asterisk send re-INVITEs to two SIP endpoints and then bridge the call? Then it is just a matter of the "Queue" server checking the DB for which recording/CDR server the call should go to, and sending it a message to re-INVITE and bridge the endpoints. A transfer to a MeetMe is another possibility, but I want the "Queue" server out of the media stream.

Has anybody else thought through the best way to scale something like this? I have a DS3 and will be using all of the channels in the semi-near future. I need to come up with a workable plan before then.

Thanks,
Steve

___
-- Bandwidth and Colocation provided by Easynews.com --
asterisk-users mailing list
To UNSUBSCRIBE or update options visit:
http://lists.digium.com/mailman/listinfo/asterisk-users
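The "send it a message" step above could go over the manager interface: AMI's Redirect action takes a second leg via ExtraChannel, so the Queue server could push both legs of a bridged call into a dialplan context on the recording box. Below is a minimal sketch that only builds the raw AMI action text; the channel names, context, and extension are hypothetical, and a real script would also log in and send this over the AMI TCP socket.

```python
# Build a two-legged AMI Redirect action as raw protocol text.
# Redirect/ExtraChannel are real AMI fields; everything else here
# (channel names, context "to-recorder", exten "record-bridge")
# is made up for illustration.
def ami_redirect(channel, extra_channel, context, exten, priority=1):
    """Return the wire format for redirecting both legs of a call
    into a dialplan context that bridges (and records) them."""
    lines = [
        "Action: Redirect",
        f"Channel: {channel}",
        f"ExtraChannel: {extra_channel}",
        f"Context: {context}",
        f"Exten: {exten}",
        f"Priority: {priority}",
    ]
    # AMI actions are CRLF-delimited and terminated by a blank line.
    return "\r\n".join(lines) + "\r\n\r\n"

action = ami_redirect("SIP/agent-00000001", "SIP/caller-00000002",
                      "to-recorder", "record-bridge")
print(action)
```

On the recording box, the target context would bridge the two legs and start something like MixMonitor, keeping the CDR and the disk I/O off the Queue server.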