Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes
Matthew J. Roth wrote:
> In the meantime, I'm looking for insights as to what would cause Asterisk (or any other process) to idle at the same value, despite having similar workloads and twice as many CPUs available to it. I'll be working on benchmarking Asterisk from very low to very high call volumes so any suggestions or tips, such as how to generate a large number of calls or what statistics I should gather, would also be appreciated.

I am very curious if using this library on your system will help increase the load you are able to put on the dual core system.

http://www.hoard.org/

People that are running Asterisk on Solaris have noted that using the mtmalloc library allows for much higher call density. I am hoping that hoard will let the people running Asterisk on Linux see similar performance improvements, but I have yet to convince anyone to give it a try and let me know how it goes. :)

--
Russell Bryant
Software Engineer
Digium, Inc.

___
--Bandwidth and Colocation provided by Easynews.com --
asterisk-users mailing list
To UNSUBSCRIBE or update options visit:
http://lists.digium.com/mailman/listinfo/asterisk-users
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes
Matthew J. Roth wrote:
> Recently, we were pushing our server to almost full CPU utilization. Since we've observed that Asterisk is CPU bound, we upgraded our server from a PowerEdge 6850 with four single-core Intel Xeon CPUs running at 3.16GHz, to a PowerEdge 6850 with 4 dual-core Intel Xeon CPUs running at 3.00GHz. The software installed is identical and a kernel build benchmark yielded promising results. The new dual-core server ran roughly 80% faster, which is about what we expected.
>
> As far as Asterisk is concerned, at low call volumes the dual-core server outperforms the single-core server at a similar rate.

Outperforms in what sense?

> I'm working on a follow-up post that will demonstrate this with some benchmarks for a small number of calls in various scenarios on each machine. However, to our surprise as the number of concurrent calls increases, the performance gains begin to flatten out. In fact, it seems that somewhere between 200 and 300 calls, the two servers start to exhibit similar idle times despite one of them having twice as many cores.

What do you mean by idle here?
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes
Sean M. Pappalardo wrote:
> Hi there. Just curious if you've checked out Linux clustering software such as OpenSSI ( http://www.openssi.org/ ) and run Asterisk on it? It features a multi-threaded cluster-aware shell (and custom kernel) that will automatically cluster-ize any regular Linux executable (such as the main Asterisk process.) If it works as advertised, it should just be a matter of adding boxes to the cluster to speed up processing.

OpenSSI can't (at the moment) migrate threads between compute nodes. It can migrate separate processes, but doesn't Asterisk use threads?
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes - Low volume benchmarks
Matthew J. Roth wrote:
> This post contains the benchmarks for Asterisk at low call volumes on similar single and dual-core servers. I'd appreciate it greatly if you took the time to read and comment on it.

For me all these numbers look too small to be useful for benchmarking.
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes - Low volume benchmarks
John Hughes wrote:
> For me all these numbers look too small to be useful for benchmarking.

John,

They are small, and they are probably more useful as baseline numbers.

I'm working on writing up some data I've collected off of our production switch. The call range is 0-450 at 10 call increments. Unfortunately, it's a live environment so it's less than ideal for benchmarking. The makeup of the calls varies, I'm relying on historical data (i.e. I can't reproduce the scenarios), and my sample sizes are much bigger for 0-300 calls than they are for 300-450. Nonetheless, there is some knowledge to be gained by studying the numbers and I'm sure that 300 calls constitutes large scale for most people.

In the future, I'd like to recreate these numbers using something like SIPp to give me more control. Until then, I'm working with what I have.

Thank you for your replies,

Matthew Roth
InterMedia Marketing Solutions
Software Engineer and Systems Developer
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes
John Hughes wrote:
> OpenSSI can't (at the moment) migrate threads between compute nodes. It can migrate separate processes, but doesn't Asterisk use threads?

John,

Asterisk uses 1 thread per call, plus about 10 to 15 background threads that persist throughout the life of the process.

I'm curious whether the 1 thread per call model stays efficient as the number of calls increases. It's possible that in the 100+ call range there is significant overhead in managing all of those threads without much gain, since most servers have only 1 to 8 processors to actually schedule them on. Acquiring locks on shared resources between the threads could be pretty nasty at that point, too.

I wonder if pooling the calls in X threads, where X is a value that is determined at compile time by looking at the number of processors available, would be more efficient? This is probably just an academic question, because I'd imagine it would require an overhaul of the codebase to accomplish.

Matthew Roth
InterMedia Marketing Solutions
Software Engineer and Systems Developer
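The pooling idea in this message can be pictured with a minimal Python sketch. It is not how Asterisk is structured; `run_pool` and `handle_call` are hypothetical names, and the worker count is picked at startup from `os.cpu_count()` rather than at compile time as the post suggests. Python threads also do not run CPU-bound work in parallel, so this only illustrates the shape of the design (a fixed number of workers draining a shared queue of calls), not its performance.

```python
import os
import queue
import threading


def run_pool(calls, handle_call, num_workers=None):
    """Process every item in `calls` on a fixed set of worker threads.

    Returns the number of workers used and the list of results
    (in completion order, which is not guaranteed to match input order).
    """
    # Assumption: one worker per CPU, decided at startup.
    num_workers = num_workers or os.cpu_count() or 1
    work = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            call = work.get()
            if call is None:  # sentinel: no more work for this worker
                return
            outcome = handle_call(call)
            with results_lock:  # shared state still needs a lock
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for call in calls:
        work.put(call)
    for _ in threads:  # one sentinel per worker
        work.put(None)
    for t in threads:
        t.join()
    return num_workers, results
```

With 300 calls and 8 workers, the process would hold 8 call-handling threads instead of 300, at the cost of each worker having to multiplex many calls.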
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes
Hi Matthew:

Your environment sounds quite challenging and I'd be interested in the analysis of what is limiting the throughput. I agree that there's no easy way to distribute a single queue across multiple boxes.

But here is a scaling idea for you. We've used it successfully to handle a large inbound call centre. It also provides resilience:

1) Incoming PRIs connect to multiple boxes that we'll call the voice gateways. Each box can have a proportion of your PRIs connected. Depending on the box power, up to 8 or so.

2) Agent registrations are spread across these same boxes.

3) Lastly you define two or more additional boxes as your queue servers. Every queue server has defined on it all the queues you need. But for each queue one server is regarded as the primary and the other as secondary. You mix things up so in the normal event about half your queueing calls are on each server (extend the idea for more than 2 queue servers).

Incoming calls on the voice gateways are sent to the queue server over IAX:

exten = 1234,1,Dial(IAX2/primary1234/${EXTEN})
exten = 1234,n,Dial(IAX2/sec1234/${EXTEN}) ; if we can't get to the primary

Now when an agent wants to log in, you have their agent gateway log in to both of the queue servers on their behalf, using an IAX2/.. channel to get back to the agent's voice gateway. So on the queue server we have the agents for the queue logged in as, say, IAX2/voicegw1/6001, IAX2/voicegw2/6002 etc.

The trick is to use transfer=yes aka notransfer=no on the various boxes. So as soon as the call gets connected to an agent it disappears off the queue box completely.

The nett result is that the queue servers only have to handle customers who are still in the queue. As soon as they get connected to an agent the call goes directly from the arriving voice gateway to the agent's voice gateway and on to the agent. In a proportion of cases that even turns out to be the same box.

You can scale up the number of voice gateways as required and handle 1000s of calls connected to agents without needing supercomputers.

You still handle all the people queueing on a particular queue on the same queueing server. So you can tell them where they are in the queue and all that. But you can split up your queues across multiple boxes to help divide and conquer the load.

If you can reach the agent phone directly using IAX (use an IAX softphone or something) you can make a little optimisation and log IAX2/agentipaddress into the queue directly. Then the call gets optimised to go directly from the incoming voice gateway to the agent's PC.

Resilience? If a queue server is down, new callers will automatically start to queue on the backup box for the queues affected. The agents are known on both primary and backup queue boxes so things keep going.

If a voice gateway goes down you lose just some of your PRIs, so you are still in business. If you need the capacity, use an ISDNguard to kick the PRIs onto one of the other voice gateways. Agents that were on the voice gateway that went down will need to reregister to a box still running. IP address takeover can make that happen.

For me this sort of design is much better than one giant box.

Regards,
Steve Davies
Technical Director
Connection Telecom (Pty) Ltd
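The primary/secondary arrangement described in step 3 can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not part of any Asterisk configuration; the function names, queue names and server names (`queue1`, `queue2`) are hypothetical. It rotates the primary role across the queue servers so that roughly an equal share of queues is primary on each, and mirrors the dialplan's "try the primary, then the secondary" order.

```python
def assign_queue_servers(queues, servers):
    """Map each queue to a (primary, secondary) pair of queue servers.

    The primary role rotates across `servers`, so with two servers about
    half the queues are primary on each; the next server in the rotation
    acts as that queue's secondary.
    """
    if len(servers) < 2:
        raise ValueError("need at least two queue servers for failover")
    plan = {}
    for i, name in enumerate(queues):
        primary = servers[i % len(servers)]
        secondary = servers[(i + 1) % len(servers)]
        plan[name] = (primary, secondary)
    return plan


def pick_server(plan, queue_name, down=()):
    """Return the server a new caller should queue on, or None if both are down.

    Mirrors the two Dial() priorities: primary first, secondary as fallback.
    """
    for server in plan[queue_name]:
        if server not in down:
            return server
    return None


plan = assign_queue_servers(
    ["sales", "support", "billing", "returns"], ["queue1", "queue2"]
)
print(plan["sales"])                               # ('queue1', 'queue2')
print(pick_server(plan, "sales", down={"queue1"}))  # queue2
```

Because every caller for a given queue lands on the same server while it is up, position-in-queue announcements keep working, which is the property the message relies on.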
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes
John Hughes wrote:
> Matthew J. Roth wrote:
>> As far as Asterisk is concerned, at low call volumes the dual-core server outperforms the single-core server at a similar rate.
>
> Outperforms in what sense?

At low call volumes the cumulative CPU utilization, expressed as a percentage of available processor, is lower on the dual-core server. This is the expected behavior. What I'm proposing (and hope to back up with numbers in the near future) is that as the number of calls rises to the 300-400 range, the cumulative CPU utilization starts to approach the same number on both servers.

Unfortunately, I wasn't collecting as much data when the single-core server was in production so some of this is speculation based on my memory of the system's performance. The environment is also different, because we have added agents so the ratio of calls connected vs. calls in queue has changed. Nonetheless, the dual-core server is not performing anywhere near our expectations.

Here is something we recently noticed that may explain why the dual-core server is under-performing at high call volumes. The following numbers were collected off both servers while they were in production. Note that while they have similar cumulative idle values, the ratio of system time to user time on the single-core server is roughly 2.3 to 1, but on the dual-core server it is roughly 19.6 to 1. I'm not quite sure what to make of this, but it seems to be very relevant to the problem.

Mon Apr 2 12:15:01 EDT 2007
Idle (sar -P ALL 60 14) (60 seconds 14 slices)
Linux 2.6.12-1.1376_FC3smp (4core.imminc.com) 04/02/07

12:24:01  CPU  %user  %nice  %system  %iowait  %idle
12:25:02  all  14.97   0.03    34.25     0.92  49.82
12:25:02    0   8.83   0.05    33.60     1.28  56.24
12:25:02    1  17.50   0.02    34.60     0.57  47.32
12:25:02    2  19.94   0.02    33.52     1.31  45.22
12:25:02    3  13.62   0.02    35.29     0.52  50.55

Thu May 10 15:30:01 EDT 2007
Idle (sar -P ALL 60 14) (60 seconds 14 slices)
Linux 2.6.12-1.1376_FC3smp (8core.imminc.com) 05/10/07

15:38:01  CPU  %user  %nice  %system  %iowait  %idle
15:39:01  all   2.47   0.01    48.29     0.00  49.23
15:39:01    0   2.92   0.00    53.17     0.00  43.91
15:39:01    1   2.98   0.00    48.68     0.02  48.33
15:39:01    2   2.47   0.02    48.61     0.00  48.91
15:39:01    3   2.27   0.00    48.35     0.00  49.38
15:39:01    4   2.38   0.02    47.38     0.00  50.22
15:39:01    5   2.37   0.02    46.94     0.00  50.67
15:39:01    6   2.23   0.02    46.63     0.00  51.12
15:39:01    7   2.17   0.02    46.54     0.00  51.27

>> I'm working on a follow-up post that will demonstrate this with some benchmarks for a small number of calls in various scenarios on each machine. However, to our surprise as the number of concurrent calls increases, the performance gains begin to flatten out. In fact, it seems that somewhere between 200 and 300 calls, the two servers start to exhibit similar idle times despite one of them having twice as many cores.
>
> What do you mean by idle here?

Idle percentage as shown in top's or sar's cumulative view.

Matthew Roth
InterMedia Marketing Solutions
Software Engineer and Systems Developer
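The system-to-user ratios quoted in this message can be re-derived from the "all" rows of the two sar samples. The short Python sketch below does only that arithmetic; the function name is hypothetical and the four input values are copied from the output above.

```python
def system_to_user_ratio(user_pct, system_pct):
    """Ratio of kernel (system) CPU time to user CPU time."""
    if user_pct <= 0:
        raise ValueError("user percentage must be positive")
    return system_pct / user_pct


# "all" rows from the sar output: (%user, %system)
single_core = system_to_user_ratio(14.97, 34.25)  # 4core.imminc.com
dual_core = system_to_user_ratio(2.47, 48.29)     # 8core.imminc.com

print(f"single-core: {single_core:.1f} to 1")  # single-core: 2.3 to 1
print(f"dual-core:   {dual_core:.1f} to 1")    # dual-core:   19.6 to 1
```

Both machines sit near 50% idle, so the difference is almost entirely in how the busy half is split: on the dual-core box nearly all of it is spent in the kernel.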
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes
On 01/06/07, Matthew J. Roth [EMAIL PROTECTED] wrote:
> Mon Apr 2 12:15:01 EDT 2007
> Idle (sar -P ALL 60 14) (60 seconds 14 slices)
> Linux 2.6.12-1.1376_FC3smp (4core.imminc.com) 04/02/07
>
> 12:24:01  CPU  %user  %nice  %system  %iowait  %idle
> 12:25:02  all  14.97   0.03    34.25     0.92  49.82
> 12:25:02    0   8.83   0.05    33.60     1.28  56.24
> 12:25:02    1  17.50   0.02    34.60     0.57  47.32
> 12:25:02    2  19.94   0.02    33.52     1.31  45.22
> 12:25:02    3  13.62   0.02    35.29     0.52  50.55
>
> Thu May 10 15:30:01 EDT 2007
> Idle (sar -P ALL 60 14) (60 seconds 14 slices)
> Linux 2.6.12-1.1376_FC3smp (8core.imminc.com) 05/10/07
>
> 15:38:01  CPU  %user  %nice  %system  %iowait  %idle
> 15:39:01  all   2.47   0.01    48.29     0.00  49.23
> 15:39:01    0   2.92   0.00    53.17     0.00  43.91
> 15:39:01    1   2.98   0.00    48.68     0.02  48.33
> 15:39:01    2   2.47   0.02    48.61     0.00  48.91
> 15:39:01    3   2.27   0.00    48.35     0.00  49.38
> 15:39:01    4   2.38   0.02    47.38     0.00  50.22
> 15:39:01    5   2.37   0.02    46.94     0.00  50.67
> 15:39:01    6   2.23   0.02    46.63     0.00  51.12
> 15:39:01    7   2.17   0.02    46.54     0.00  51.27

Have you got, or could you install, oprofile? That will give you a LOT of information as to where your CPUs are spending their time.

One guess is that you could be hitting contention in the kernel, with all the cores contending for some scarce resource. So your cores can't execute because they are waiting on some kernel mutex for access to some resource. That would account for the increase in system time. oprofile would show where in the kernel they are spending time (where those 50%ishes are going).

Steve Uhler at Sun has been studying this on his big multi-core Sparc boxes so he can probably contribute some insight. Hope you don't mind a cc, Steve. We're talking about Asterisk/Linux running out of scaling on an 8 core box.

Steve
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes
FYI, from http://www.voip-info.org/wiki/index.php?page=Asterisk+FAQ

*Can I install Asterisk on a beowulf cluster?*

A cluster can't migrate threads that use shared memory. Asterisk uses that kind of thread. So no, Asterisk wouldn't work on a cluster.

*(It might be helpful to know whether anyone has a working load-balanced Asterisk configuration where multiple systems can share the load of an Asterisk environment (IAX2, not SIP) and whether this environment would fail over nicely in the event of downtime!)*

On 5/25/07, Sean M. Pappalardo [EMAIL PROTECTED] wrote:
> Hi there. Just curious if you've checked out Linux clustering software such as OpenSSI ( http://www.openssi.org/ ) and run Asterisk on it? It features a multi-threaded cluster-aware shell (and custom kernel) that will automatically cluster-ize any regular Linux executable (such as the main Asterisk process.) If it works as advertised, it should just be a matter of adding boxes to the cluster to speed up processing.
>
> As for Asterisk itself, is it multi-threaded enough to take advantage of 4+ way systems?
>
> Sean Pappalardo
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes - Low volume benchmarks - Correction
Luki wrote:
> Perhaps a naive question, but how does 0.137% CPU utilization per call equal 1735 MHz per call?
>
> If 1735 MHz / 0.137% = 1735 MHz / 0.00137 = 1266423 MHz at 100% utilization ??! Even with 4 CPUs, those would be 316 GHz CPUs.
>
> I think you meant:
> Average CPU utilization per call: 0.137% (~17 MHz)

Luki,

You are absolutely right. Thank you for pointing out and correcting my mistake. The corrected statistics are below. Note that the MHz per call statistic is calculated with the following formula:

MHzPerCall = (numCPUs * CPUspeed) * (avgCPUperCall * .01)

Thank you,

Matthew Roth
InterMedia Marketing Solutions
Software Engineer and Systems Developer

The Numbers (Corrected)
-----------------------

DC - Incoming SIP to the Playback() application
===============================================
calls  %user  %system  %iowait  %idle
    0   0.00     0.01     0.01  99.98
    1   0.02     0.04     0.00  99.94
    2   0.02     0.06     0.00  99.92
    3   0.03     0.11     0.00  99.86
    4   0.04     0.13     0.00  99.83
    5   0.05     0.16     0.00  99.80
    6   0.05     0.20     0.00  99.75
    7   0.07     0.24     0.00  99.70
    8   0.07     0.25     0.00  99.67
    9   0.08     0.27     0.00  99.65
   10   0.09     0.33     0.00  99.58

Average CPU utilization per call: 0.040% (~9.60 MHz)

SC - Incoming SIP to the Playback() application
===============================================
calls  %user  %system  %iowait  %idle
    0   0.01     0.02     0.00  99.98
    1   0.02     0.10     0.00  99.88
    2   0.03     0.17     0.00  99.80
    3   0.06     0.21     0.00  99.73
    4   0.08     0.28     0.00  99.63
    5   0.10     0.34     0.01  99.55
    6   0.11     0.48     0.00  99.41
    7   0.14     0.49     0.00  99.37
    8   0.16     0.57     0.00  99.28
    9   0.17     0.63     0.01  99.19
   10   0.18     0.75     0.00  99.07

Average CPU utilization per call: 0.091% (~11.52 MHz)

DC - Incoming SIP to the Queue() application - In queue
=======================================================
calls  %user  %system  %iowait  %idle
    0   0.00     0.01     0.00  99.99
    1   0.01     0.03     0.00  99.96
    2   0.01     0.05     0.00  99.94
    3   0.01     0.08     0.00  99.91
    4   0.02     0.10     0.00  99.88
    5   0.03     0.12     0.00  99.84
    6   0.04     0.16     0.00  99.80
    7   0.03     0.17     0.00  99.80
    8   0.04     0.20     0.00  99.76
    9   0.03     0.22     0.00  99.75
   10   0.05     0.27     0.00  99.68

Average CPU utilization per call: 0.031% (~7.44 MHz)

SC - Incoming SIP to the Queue() application - In queue
=======================================================
calls  %user  %system  %iowait  %idle
    0   0.02     0.02     0.00  99.96
    1   0.03     0.07     0.00  99.91
    2   0.03     0.13     0.00  99.83
    3   0.04     0.18     0.00  99.78
    4   0.05     0.23     0.00  99.72
    5   0.06     0.27     0.00  99.67
    6   0.07     0.33     0.00  99.60
    7   0.09     0.38     0.00  99.53
    8   0.09     0.40     0.00  99.51
    9   0.11     0.46     0.01  99.43
   10   0.11     0.48     0.00  99.41

Average CPU utilization per call: 0.055% (~6.97 MHz)

DC - Incoming SIP to the Queue() application - Bridged to an agent
==================================================================
calls  %user  %system  %iowait  %idle
    0   0.00     0.01     0.00  99.99
    1   0.01     0.06     0.00  99.93
    2   0.02     0.14     0.00  99.84
    3   0.03     0.16     0.00  99.81

Average CPU utilization per call: 0.060% (~14.40 MHz)

SC - Incoming SIP to the Queue() application - Bridged to an agent
==================================================================
calls  %user  %system  %iowait  %idle
    0   0.01     0.02     0.00  99.98
    1   0.02     0.16     0.00  99.82
    2   0.04     0.28     0.00  99.68
    3   0.07     0.36     0.00  99.57

Average CPU utilization per call: 0.137% (~17.35 MHz)
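The corrected figures can be reproduced with a short Python sketch. It combines the MHzPerCall formula from this message with the per-call utilization definition given in the benchmark post, (startIdle - endIdle) / numCalls. The function names are hypothetical, and one input is an assumption: the SC figures only match if the "3.16GHz" clock is taken as 3166 MHz (3160 MHz would give 11.50 rather than 11.52).

```python
def avg_cpu_per_call(start_idle, end_idle, num_calls):
    """Average CPU utilization per call, in percent of the whole machine."""
    return (start_idle - end_idle) / num_calls


def mhz_per_call(num_cpus, cpu_mhz, avg_cpu_pct):
    """MHzPerCall = (numCPUs * CPUspeed) * (avgCPUperCall * .01)"""
    return (num_cpus * cpu_mhz) * (avg_cpu_pct * 0.01)


# DC, Playback(): idle fell from 99.98 to 99.58 over 10 calls.
dc_pct = round(avg_cpu_per_call(99.98, 99.58, 10), 3)
dc_mhz = mhz_per_call(8, 3000, dc_pct)

# SC, bridged to an agent: idle fell from 99.98 to 99.57 over 3 calls.
# Assumption: 3166 MHz per CPU, inferred from the posted results.
sc_pct = round(avg_cpu_per_call(99.98, 99.57, 3), 3)
sc_mhz = mhz_per_call(4, 3166, sc_pct)

print(f"DC Playback: {dc_pct:.3f}% (~{dc_mhz:.2f} MHz)")  # 0.040% (~9.60 MHz)
print(f"SC bridged:  {sc_pct:.3f}% (~{sc_mhz:.2f} MHz)")  # 0.137% (~17.35 MHz)
```

Luki's objection falls out of the same formula: the 4-CPU machine has about 12664 MHz in total, so 0.137% of it is roughly 17 MHz, not 1735 MHz.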
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes - Low volume benchmarks
William Moore wrote:
> Are you recording memory figures as well and have you checked the total used memory? Or did I miss it somewhere? Thanks for doing this, scalability testing is always good.

William,

This round of benchmarking is heavily focused on CPU utilization, because it is causing an immediate problem for me. However, I am tracking some other statistics on a daily basis including memory utilization, swap utilization, load averages, and active channels and calls.

One of my colleagues takes the text file I produce and creates graphs using Cacti and rrdtool. You'll be interested in these two (sorry for the format of the URLs, but otherwise the list was eating my posts):

- Percent CPU Used With No. Calls and No. Channels
  img509DOTimageshackDOTusSLASHimg509SLASH3927SLASHastcpuandcallsbf4DOTpng
- Asterisk Memory Used (KB)
  img47DOTimageshackDOTusSLASHimg47SLASH7615SLASHastmemusedgq9DOTpng

Note that even with a peak call volume of approximately 400 active calls and 550 active SIP channels, the memory utilization never surpasses 600 KB. I'd estimate that most Asterisk installations would avoid swapping with 1 GB of RAM. A 2nd GB might be useful to provide plenty of room for file caching so that your hard disk doesn't become a bottleneck. We also record all of our calls to a 6 GB RAM disk, so our server has a total of 8 GB of RAM but that isn't necessary in most circumstances.

Overall, Asterisk seems to be very efficiently coded as far as memory is concerned. Note that for other reasons we perform a nightly reboot, so I don't know if there are any memory leaks that would surface over time.

Thank you,

Matthew Roth
InterMedia Marketing Solutions
Software Engineer and Systems Developer
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes
Mark Coccimiglio wrote:
> Sounds like you are running into the hardware limitations of your system's PCI or Front Side Bus (FSB) and not necessarily an issue of Asterisk. In short there is a limited amount of bandwidth on the computer's PCI bus (33 MHz) and the FSB (100-800MHz). One thing to remember is that ALL cores and data streams need to share the PCI and FSB. Asterisk is very processor and memory intensive. At the extreme level of usage more cores won't help if data is stuck in the pipe. So the performance plateau you described would be expected.

Mark,

That is a great theory and I'd like to follow up on it. Do you know if the PCI bus or FSB is instrumented by Linux? If not, are you aware of any way to gather statistics about their utilization? I'd like to see if the numbers support your idea and, if so, which bus is saturated.

Let me add a little bit of extra information to this discussion. The CPU utilization does not flatten out at 50%. In fact, as more calls are added, Asterisk will eventually drive the idle percentage down to single digits with surprisingly few problems. If PCI or FSB bandwidth were the limiting factor, wouldn't the CPU utilization top out at the point that the available bandwidth was used?

Thank you,

Matthew Roth
InterMedia Marketing Solutions
Software Engineer and Systems Developer
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes
Matthew J. Roth wrote:
> In fact, it seems that somewhere between 200 and 300 calls, the two servers start to exhibit similar idle times despite one of them having twice as many cores.

Sounds like you are running into the hardware limitations of your system's PCI or Front Side Bus (FSB) and not necessarily an issue of Asterisk. In short there is a limited amount of bandwidth on the computer's PCI bus (33 MHz) and the FSB (100-800MHz). One thing to remember is that ALL cores and data streams need to share the PCI and FSB. Asterisk is very processor and memory intensive. At the extreme level of usage more cores won't help if data is stuck in the pipe. So the performance plateau you described would be expected.

Mark C.
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes - Low volume benchmarks
On Saturday 26 May 2007 1:21 am, Edgar Guadamuz wrote:
> Very good... by the way, I'm studying electrical engineering and I've chosen Asterisk scaling as my final graduation project. I hope to do similar work with an Asterisk cluster.

I've been working as an EE, and I've got to ask... what does software scalability have to do with electrical engineering? If you were in a CS program I could see it, but I've been doing electronic design and power electronics work for over 10 years and I can't think of where these two intersect.

-A.
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes - Low volume benchmarks
> Average CPU utilization per call: 0.137% (~1735 MHz)

Perhaps a naive question, but how does 0.137% CPU utilization per call equal 1735 MHz per call?

If 1735 MHz / 0.137% = 1735 MHz / 0.00137 = 1266423 MHz at 100% utilization ??! Even with 4 CPUs, those would be 316 GHz CPUs.

I think you meant:
Average CPU utilization per call: 0.137% (~17 MHz)

--Luki
[asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes
List users,

Using Asterisk in an inbound call center environment has led us to push the limits of vertical scaling. In order to treat each caller fairly and to utilize our agents as efficiently as possible, it is desirable to configure each client as a single queue. As far as I know, Asterisk's queues cannot be distributed across servers, so the size of the largest queue we service is our vertical scaling goal. In our case, that queue must be able to hold in excess of 300 calls regardless of their makeup (i.e. number of calls in queue vs. number of calls connected to an agent). In reality, we are servicing more than one client on our server, so on busy days the total number of calls we're handling is greater than 300.

Recently, we were pushing our server to almost full CPU utilization. Since we've observed that Asterisk is CPU bound, we upgraded our server from a PowerEdge 6850 with four single-core Intel Xeon CPUs running at 3.16GHz, to a PowerEdge 6850 with 4 dual-core Intel Xeon CPUs running at 3.00GHz. The software installed is identical and a kernel build benchmark yielded promising results. The new dual-core server ran roughly 80% faster, which is about what we expected.

As far as Asterisk is concerned, at low call volumes the dual-core server outperforms the single-core server at a similar rate. I'm working on a follow-up post that will demonstrate this with some benchmarks for a small number of calls in various scenarios on each machine. However, to our surprise as the number of concurrent calls increases, the performance gains begin to flatten out. In fact, it seems that somewhere between 200 and 300 calls, the two servers start to exhibit similar idle times despite one of them having twice as many cores.

Once I collect the data, I will add a second follow-up post with a performance curve tracking the full range of call volumes we experience.
Unfortunately, from day to day there are some variables that I'm sure affect performance, such as the number of agents logged in and the makeup of the calls. I'll do my best to choose a sample size that smooths out these bumps.

In the meantime, I'm looking for insights as to what would cause Asterisk (or any other process) to idle at the same value, despite having similar workloads and twice as many CPUs available to it. I'll be working on benchmarking Asterisk from very low to very high call volumes so any suggestions or tips, such as how to generate a large number of calls or what statistics I should gather, would also be appreciated.

Thank you,

Matthew Roth
InterMedia Marketing Solutions
Software Engineer and Systems Developer
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes
Hi there.

Just curious if you've checked out Linux clustering software such as OpenSSI ( http://www.openssi.org/ ) and run Asterisk on it? It features a multi-threaded cluster-aware shell (and custom kernel) that will automatically cluster-ize any regular Linux executable (such as the main Asterisk process.) If it works as advertised, it should just be a matter of adding boxes to the cluster to speed up processing.

As for Asterisk itself, is it multi-threaded enough to take advantage of 4+ way systems?

Sean Pappalardo
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes - Low volume benchmarks
List users, This post contains the benchmarks for Asterisk at low call volumes on similar single and dual-core servers. I'd appreciate it greatly if you took the time to read and comment on it. Thank you, Matthew Roth InterMedia Marketing Solutions Software Engineer and Systems Developer Conclusions --- I'm presenting the conclusions first, because they are the most important part of the benchmarking. If you like details and numbers, scroll down. I've drawn three conclusions from this set of benchmarks. 1. At low call volumes, the dual-core server outperforms the single-core server by the expected margin. 2. Calls bridged to an agent are more CPU intensive than calls listening to audio via the Playback() application or calls in queue. This is expected, because they involve more SIP channels and more work is done on the RTP frames (bridging, recording, etc.). 3. For all call types, the majority of the CPU time is spent in the kernel (servicing system calls, etc.). I've observed this to be true at all call volumes on our production server, with the ratio sometimes in the range of 20 to 1. This may suggest that the popular perception that Asterisk doesn't scale well because of its extensive use of linked lists doesn't tell the whole story. So far there are no surprises, but over the next week or so I'll be collecting data that I expect to reveal that at high call volumes (200-300 concurrent calls) the idle percentage on both machines starts to approach the same value. In the end, my goal is to break through (or, at the least, understand) this scaling issue, so I welcome all forms of critique. It's quite possible that the problem lies in my setup or that I'm missing something obvious, but I suspect it is deeper than that. Benchmarking Methodology I collected each type of data as follows. 
- Active channel and call counts: 'asterisk -rx show channels' and 'asterisk -rx sip show channels'
- Thread counts: 'ps -eLf' and 'ps axms'
- Idle time values: 'sar 30 1'
- Average CPU utilization per call: (startIdle - endIdle) / numCalls

The servers were rebooted between tests.

Call Types
----------
I tested the following three call types.

- Incoming SIP to the Playback() application
  - 1 active SIP channel per call
    - From the originating Asterisk server to the Playback() application

- Incoming SIP to the Queue() application - In queue
  - 1 active SIP channel per call
    - From the originating Asterisk server to the Queue() application

- Incoming SIP to the Queue() application - Bridged to an agent
  - 2 active SIP channels per call
    - From the originating Asterisk server to the Queue() application
    - Bridged from the Queue() application to the agent

All calls were pure VOIP (SIP/RTP) and originated from another Asterisk server. Calls that were bridged to agents terminated at SIP hardphones (Snom 320s) and were recorded to a RAM disk via the Monitor() application. All calls were in the uLaw codec and all audio files (including the call recordings, the native MOH, and the periodic queue announcements which played approximately every 60 seconds) were in the PCM file format. There was no transcoding, protocol bridging, or TDM hardware involved on the servers being benchmarked.

A Note on Asterisk and Threads
------------------------------
On both systems, a freshly started Asterisk process consisted of 10 threads. Some events, such as performing an 'asterisk -rx reload', triggered the creation of a new persistent thread. The benchmarking revealed that in general, the Asterisk process will consist of 10-15 persistent background threads plus exactly 1 additional thread per active call. This means that at even modest call volumes, Asterisk will utilize all of the CPUs in most modern PC-based servers.

Server Profiles
---------------
The servers I performed the benchmarking on are described below.
Note that the CPUs support hyperthreading, but it is disabled. This is reflected in the CPU count, which is the number of physical processors available to the OS.

Short Name:    DC
Manufacturer:  Dell Computer Corporation
Product Name:  PowerEdge 6850
Processors:    Four Dual-Core Intel Xeon MP CPUs at 3.00GHz
CPU Count:     8
FSB Speed:     800 MHz
OS:            Fedora Core 3 - 2.6.13-ztdummy SMP x86_64 Kernel
Asterisk Ver:  ABE-B.1-3

Short Name:    SC
Manufacturer:  Dell Computer Corporation
Product Name:  PowerEdge 6850
Processors:    Four Single-Core Intel Xeon MP CPUs at 3.16GHz
CPU Count:     4
FSB Speed:     667 MHz
OS:            Fedora Core 3 - 2.6.13-ztdummy SMP x86_64 Kernel
Asterisk Ver:  ABE-B.1-3

The kernel is a vanilla 2.6.13 kernel with enhanced realtime clock support and a timer frequency of 1000 HZ (earning it the EXTRAVERSION of '-ztdummy'). I am aware that the 2.6.17 kernel introduced multi-core scheduler support, but it exhibited negligible gains in the kernel build benchmark. Nonetheless, I am open to any
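
For anyone wanting to reproduce the per-call figure from the methodology, (startIdle - endIdle) / numCalls, here is a minimal sketch in Python. The function names and the sample readings are hypothetical and are not taken from the post. The second helper assumes the idle value is the all-CPU average that sar reports by default, in which case one percentage point represents twice as much processor time on the 8-CPU DC server as on the 4-CPU SC server.

```python
def cpu_percent_per_call(start_idle: float, end_idle: float, num_calls: int) -> float:
    """Average CPU utilization per call, in percentage points of idle time.

    Implements (startIdle - endIdle) / numCalls from the methodology.
    """
    if num_calls <= 0:
        raise ValueError("num_calls must be positive")
    return (start_idle - end_idle) / num_calls


def cpus_per_call(percent_per_call: float, cpu_count: int) -> float:
    """Convert percentage points of the all-CPU average into whole-CPU units.

    Assumption: the idle figure is averaged over all CPUs, so the same
    percentage means more processor time on a machine with more CPUs.
    """
    return percent_per_call / 100.0 * cpu_count


if __name__ == "__main__":
    # Hypothetical readings for illustration only, not measurements from the post.
    per_call = cpu_percent_per_call(start_idle=99.0, end_idle=94.0, num_calls=20)
    print(f"{per_call:.2f} percentage points per call")
    print(f"{cpus_per_call(per_call, cpu_count=4):.4f} CPUs per call on a 4-CPU server")
    print(f"{cpus_per_call(per_call, cpu_count=8):.4f} CPUs per call on an 8-CPU server")
```

Comparing the two servers in whole-CPU units rather than raw idle percentages may make it easier to see whether a call really costs the same amount of processor time on each machine.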
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes
Sean M. Pappalardo wrote: Just curious if you've checked out Linux clustering software such as OpenSSI ( http://www.openssi.org/ ) and run Asterisk on it? It features a multi-threaded cluster-aware shell (and custom kernel) that will automatically cluster-ize any regular Linux executable (such as the main Asterisk process.) If it works as advertised, it should just be a matter of adding boxes to the cluster to speed up processing. As for Asterisk itself, is it multi-threaded enough to take advantage of 4+ way systems?

Sean,

Thanks for your response. I'm going to take a look into OpenSSI. It'd be amazing if it ran Asterisk without any side effects.

I've addressed the number of threads that Asterisk uses in my first follow-up post. In short, the answer is yes because it uses a 1 thread per call model.

Thank you,

Matthew Roth
InterMedia Marketing Solutions
Software Engineer and Systems Developer
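
The thread model described in the benchmark post (10-15 persistent background threads plus one thread per active call) can be checked against captured 'ps -eLf' output. The following is a rough sketch in Python, not anything from the original posts: the function names, the sample listing, and the PIDs in it are made up for illustration, and the parser assumes the standard ten-column 'ps -eLf' layout (UID PID PPID LWP C NLWP STIME TTY TIME CMD).

```python
import os

# Hypothetical 'ps -eLf' capture: one row per thread (LWP). Not real data.
SAMPLE = """UID        PID  PPID   LWP  C NLWP STIME TTY          TIME CMD
root      2301     1  2301  0    3 10:01 ?        00:00:00 /usr/sbin/asterisk
root      2301     1  2302  0    3 10:01 ?        00:00:01 /usr/sbin/asterisk
root      2301     1  2303  0    3 10:01 ?        00:00:00 /usr/sbin/asterisk
root      2400     1  2400  0    1 10:02 ?        00:00:00 /usr/sbin/sshd
"""


def expected_thread_range(active_calls, background_min=10, background_max=15):
    """Thread count range implied by 10-15 background threads + 1 per call."""
    if active_calls < 0:
        raise ValueError("active_calls must not be negative")
    return (background_min + active_calls, background_max + active_calls)


def count_threads(ps_elf_output, command="asterisk"):
    """Count rows of 'ps -eLf' output whose executable name equals `command`."""
    count = 0
    for line in ps_elf_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 9)  # the tenth field is the full command line
        if len(fields) < 10:
            continue
        executable = fields[9].split()[0]
        if os.path.basename(executable) == command:
            count += 1
    return count


if __name__ == "__main__":
    low, high = expected_thread_range(active_calls=200)
    print(f"Expected threads at 200 calls: {low}-{high}")
    print(f"Threads in sample listing: {count_threads(SAMPLE)}")
```

If the observed count on a loaded server falls well outside the expected range, that would suggest the one-thread-per-call picture is incomplete for that call type.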
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes - Low volume benchmarks
On 5/25/07, Matthew J. Roth [EMAIL PROTECTED] wrote: List users, This post contains the benchmarks for Asterisk at low call volumes on similar single and dual-core servers. I'd appreciate it greatly if you took the time to read and comment on it.

Are you recording memory figures as well and have you checked the total used memory? Or did I miss it somewhere?

Thanks for doing this, scalability testing is always good.
Re: [asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes - Low volume benchmarks
Very good... By the way, I'm studying electrical engineering and I've chosen Asterisk scalability as my final graduation project. I hope to do similar work within an Asterisk cluster.

On 5/25/07, William Moore [EMAIL PROTECTED] wrote: On 5/25/07, Matthew J. Roth [EMAIL PROTECTED] wrote: List users, This post contains the benchmarks for Asterisk at low call volumes on similar single and dual-core servers. I'd appreciate it greatly if you took the time to read and comment on it. Are you recording memory figures as well and have you checked the total used memory? Or did I miss it somewhere? Thanks for doing this, scalability testing is always good.