I do 3000+ CPS with OpenSIPS and MySQL over unixODBC, no problem. My queries are for routing only: I read a 280-million-row table (RocksDB) for every call, so it's comparable, and it seems to work flawlessly. However, I only use $60,000 Dell servers, R920s, with 64 physical cores and 120 threads, plus 1.5 TB of RAM, and my OpenSIPS runs under VMware ESXi 6.x. Honestly, I paid an OpenSIPS guru to assemble the application for me; I designed the logic based on my previous Asterisk switch, which could not handle the pressure. Basically all routing is done in MariaDB. OpenSIPS just asks a question via a stored procedure: the brain is the database, and OpenSIPS executes its instructions. It just works.
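For what it's worth, that "database is the brain" pattern boils down to a single stored-procedure dip from the routing script. A minimal opensips.cfg sketch using avpops' avp_db_query (the get_route() procedure and $avp(gw) name are illustrative, not from my actual config; syntax per recent OpenSIPS versions):

```
# opensips.cfg fragment (sketch): all routing intelligence lives in the DB.
# get_route() and $avp(gw) are hypothetical names for illustration only.
route {
    # one round trip per call: MariaDB decides, OpenSIPS executes
    if (avp_db_query("CALL get_route('$rU')", "$avp(gw)")) {
        $ru = "sip:" + $rU + "@" + $avp(gw);
        t_relay();
    } else {
        send_reply(503, "No route");
        exit;
    }
}
```

The stored procedure keeps all routing logic changes on the database side, so the OpenSIPS script never needs a restart when the routing rules change.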
On Wed, Jun 10, 2020 at 5:15 PM Jon Abrams <[email protected]> wrote:

> I built a similar functioning platform back in 2015 on similar hardware
> (Westmere Xeons, hyperthreading enabled) running bare metal on CentOS 6.
> At some point we bumped it up to dual X5670s (a cheap upgrade nowadays),
> but it was handling 12,000 CPS peaks on one server, with 3,000-5,000 CPS
> sustained for large parts of the day. I don't think you are too far off
> in hardware.
>
> This was on version 1.9, so there was no async. IIRC it was either 32 or
> 64 children. Async requires the TM module, which adds additional overhead
> and memory allocation.
>
> The LRN database was stored in MySQL with a very simple table (TN, LRN)
> to keep memory usage down so that it could be pinned in memory (the
> server had 48 or 72 GB, I think). MySQL was set to dump the InnoDB buffer
> cache to disk on restart so that the whole database would be back in
> memory on restart. Doing a full table scan would initially populate the
> MySQL cache.
> Blacklists and other smaller datasets were stored in OpenSIPS using the
> userblacklist module; there are better ways to do that in version 2 and
> onwards. Bigger lists were stored in memcached. I prefer Redis for this
> purpose now.
>
> I would suggest simplifying testing by using a single MySQL server and
> bypassing the F5, to eliminate it as a source of connection problems or
> additional latency.
> In the OpenSIPS script, eliminate everything but one dip, probably just
> the LRN dip, to start.
> Performance test the stripped-down scenario with SIPp. Based on past
> experience, you should be able to hit or come close to your performance
> goal with only one dip in play.
> If you do hit your performance targets, keep adding more dips one by one
> until it breaks.
> If you can't reach your performance target with this stripped-down
> scenario, then I'd suggest testing without async and transactions
> enabled.
> I wouldn't think transactions would be a necessity in this scenario. I
> ran into CPS problems on that other open-source SIP server when using
> async under heavy load: the transaction creation was chewing up CPU and
> memory. I'm not sure how different the implementation is here.
>
> I seem to start having problems with SIPp when I hit a few thousand CPS,
> due to it being single-threaded. You will probably need to run multiple
> parallel SIPp processes for your load test, if you aren't already.
>
> If using an OS with systemd's journald for logging, that will be a big
> bottleneck in and of itself, even with small amounts of logging.
> In 1.9, I hacked together a module to create a timestamp string with
> milliseconds for logging query latencies for diagnostic purposes. There
> may be a better out-of-the-box way to do it now.
>
> For children sizing, I would suggest benchmarking with at least 16
> children and then doubling it to compare performance.
> Watch the logs for out-of-memory or defragment messages and bump up
> shared memory or package memory if necessary. Package memory is probably
> going to be your problem, but it doesn't sound like it is a problem yet.
>
> BR,
> Jon Abrams
>
> On Wed, Jun 10, 2020 at 3:20 PM Calvin Ellison <[email protected]> wrote:
>
>> We've checked our F5 BIG-IP configuration and added a second database
>> server to the pool. Both DBs have been checked for max connections,
>> open files, etc. Memcached has been moved to a dedicated server. Using
>> a SIPp scenario for load testing from a separate host, things seem to
>> fall apart on OpenSIPS around 3,000 CPS, with every CPU core at or near
>> 100% and no logs indicating fallback to sync/blocking mode. Both
>> databases barely noticed the few hundred connections. Does this seem
>> reasonable for a dual-CPU server with 8 cores and 16 threads?
>>
>> https://ark.intel.com/content/www/us/en/ark/products/47925/intel-xeon-processor-e5620-12m-cache-2-40-ghz-5-86-gt-s-intel-qpi.html
>>
>> What is the OpenSIPS opinion on Hyper-Threading?
>>
>> Is there a way to estimate max CPS based on SPECrate, BogoMIPS, or some
>> other metric?
>>
>> I would love to know if my opensips.cfg has any mistakes, omissions, or
>> inefficiencies. Is there a person or group who does sanity checks?
>>
>> What should I be looking at within OpenSIPS during a load test to
>> identify bottlenecks?
>>
>> I'm still looking for guidance on the things below, especially children
>> vs. timer_partitions:
>>
>> Is there an established method for fine-tuning these things?
>>> shared memory
>>> process memory
>>> children
>>> db_max_async_connections
>>> listen=... use_children
>>> modparam("tm", "timer_partitions", ?)
>>
>> What else is worth considering?
>>
>> Regards,
>>
>> Calvin Ellison
>> Senior Voice Operations Engineer
>> [email protected]
>>
>> On Thu, Jun 4, 2020 at 5:18 PM David Villasmil <[email protected]> wrote:
>>>
>>> Maybe you are hitting the max connections? How many connections are
>>> there when it starts to show those errors?
>>
>> _______________________________________________
>> Users mailing list
>> [email protected]
>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users
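On Jon's point about pinning the (TN, LRN) table in memory: the dump-and-restore behavior he describes maps onto two stock InnoDB settings available since MySQL 5.6 / MariaDB 10. A hedged my.cnf sketch; the 40G figure is illustrative and should be sized to the actual table:

```
# my.cnf fragment (sketch) - keep the (TN, LRN) lookup table resident in RAM.
[mysqld]
innodb_buffer_pool_size             = 40G  # large enough for the whole LRN table
innodb_buffer_pool_dump_at_shutdown = ON   # persist the buffer pool page list
innodb_buffer_pool_load_at_startup  = ON   # reload those pages after a restart
```

Jon's "full table scan to warm the cache" is then just one SELECT sweeping the table after a cold start, before live traffic is pointed at the server.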
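Jon's note that SIPp is single-threaded usually translates into splitting the target rate across several processes. A sketch that only prints the commands it would run, so it is safe to execute anywhere (drop the echo to launch real load; the scenario file, target, and ports are illustrative):

```shell
#!/bin/sh
# gen_sipp_cmds RATE INSTANCES TARGET
# Print one sipp invocation per instance, each on its own local port,
# so the aggregate load is RATE * INSTANCES calls per second.
gen_sipp_cmds() {
    rate=$1
    instances=$2
    target=$3
    i=0
    while [ "$i" -lt "$instances" ]; do
        # -sf: scenario file, -r: calls/sec, -p: local port, -bg: background
        echo "sipp -sf scenario.xml $target -r $rate -p $((5100 + i)) -bg"
        i=$((i + 1))
    done
}

# e.g. 3 processes at 3000 CPS each = 9000 CPS aggregate
gen_sipp_cmds 3000 3 10.0.0.10:5060
```

Running the printed commands from separate shells (or dropping the echo) spreads the generator load across cores, which keeps any one SIPp process from becoming the bottleneck before OpenSIPS does.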
