Calvin, feel free to log in to the container and change the MariaDB config; it all lives in a single file, /etc/my.cnf. That box runs TokuDB, a newer engine that I find better than InnoDB. Lately I have shifted to RocksDB, which was designed by Facebook and which I find even better. I have not updated that box because "if it ain't broke, don't fix it", but I am open to changing the engine in the second container if you so desire. MariaDB is better than MySQL here because it gives us a pool of threads, something that MySQL only gives to paying customers. So I think it should work.
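
In case it helps when you edit /etc/my.cnf, here is roughly what the thread pool part could look like. Treat it as a sketch only: thread_handling, thread_pool_size and thread_pool_max_threads are standard MariaDB server variables, but the numbers below are placeholders I am assuming for illustration, not what is currently set on that container, so compare against the running config before changing anything.

```ini
[mysqld]
# Use MariaDB's thread pool instead of one thread per connection.
thread_handling = pool-of-threads

# Number of thread groups. MariaDB defaults this to the number of CPUs;
# 16 is an assumed placeholder, not a measured value.
thread_pool_size = 16

# Upper bound on threads in the pool (assumed placeholder value).
thread_pool_max_threads = 1000
```

You can confirm what the server actually picked up with SHOW VARIABLES LIKE 'thread%'; after a restart.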

On Thu, Jun 4, 2020 at 3:44 PM Jon Abrams <[email protected]> wrote:

> A) Is the LRN database located locally on the OpenSIPS box or is it remote?
> B) Have you tried only doing sync database queries? Async introduces some
> overhead, and I'm not sure if it causes extra database connections to be
> created. When using sync there is a connection per child process that stays
> up.
> C) Does the database have enough memory to contain the LRN and DNC
> datasets fully in memory? The extra latency for the non-cache hits sent to
> the database may stack up if the database has to hit disk.
> D) How many child processes are you using now? If you are hitting 100% you
> may need to increase them.
> E) Are your memcached processes using heavy CPU? If you are caching
> multiple lists, I've found it helps to use a unique memcached instance per
> list.
> F) Look for memory-related log messages. If the memory starts getting
> exhausted you will see defrag messages. This will chew up available
> computation cycles.
>
> - Jon Abrams
>
>
> On Thu, Jun 4, 2020 at 2:17 PM Calvin Ellison <[email protected]>
> wrote:
>
>> The scenario is INVITE -> MySQL query -> non-200 final response. No
>> calls are connected here, only dipping things like LRN, Do Not Call,
>> and Wireless/Landline. A similar service runs on a second port,
>> specific to a different kind of traffic and dip. We're using async
>> avp_db_query and memcached, with about 3:1 cache hits.
>>
>> Our target is up to 10,000 CPS across two opensips servers, which are
>> dual-CPU Xeon E5620 with 48G RAM. Both run memcached, and both
>> servers use both memcached instances to share a distributed cache
>> thanks to this:
>>
>> 'modparam("cachedb_memcached","cachedb_url","memcached:lrn://lrn-d,lrn-e/")'.
>> At a glance there are over 200mil total cached items, distributed
>> nearly equally.
>>
>> The issue is that individual child processes start getting stuck at
>> 100% CPU.
>> Logs indicate connection failures to the MySQL database
>> causing children to run in sync mode, and there are warnings about
>> delayed timer jobs tm-timer and blcore-expire. Eventually, the service
>> becomes unresponsive. Restarting opensips restores service and the
>> children return to single-digit CPU utilization, but eventually,
>> children get stuck again.
>>
>> I'm not certain if the issue is on the database server, or if the
>> opensips servers are overloaded, or if the config is just not right
>> yet.
>>
>> Is there an established method for fine-tuning these things?
>> shared memory
>> process memory
>> children
>> db_max_async_connections
>> listen=... use_children
>> modparam("tm", "timer_partitions", ?)
>>
>> What else is worth considering?
>> Does a child ever return to async mode after running in sync mode?
>> How do I know when my servers have reached their limit?
>> opensips.cfg is available on request.
>>
>> version: opensips 2.4.7 (x86_64/linux)
>> flags: STATS: On, DISABLE_NAGLE, USE_MCAST, SHM_MMAP, PKG_MALLOC,
>> F_MALLOC, FAST_LOCK-ADAPTIVE_WAIT
>> ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16,
>> MAX_URI_SIZE 1024, BUF_SIZE 65535
>> poll method support: poll, epoll, sigio_rt, select.
>> git revision: 9e1fcc915
>> main.c compiled on with gcc 7
>>
>> *re-built using dpkg-buildpackage including the patch to support DB
>> floating point types:
>> https://opensips.org/pipermail/users/2020-March/042528.html
>>
>> $ lsb_release -d
>> Description: Ubuntu 18.04.4 LTS
>>
>> $ uname -a
>> Linux TC-521 4.15.0-91-generic #92-Ubuntu SMP Fri Feb 28 11:09:48 UTC
>> 2020 x86_64 x86_64 x86_64 GNU/Linux
>>
>> $ free -mw
>>        total   used   free   shared   buffers   cache   available
>> Mem:   48281   1085    337       87      1729   45128       46551
>>
>> $ lscpu
>> Architecture:        x86_64
>> CPU op-mode(s):      32-bit, 64-bit
>> Byte Order:          Little Endian
>> CPU(s):              16
>> On-line CPU(s) list: 0-15
>> Thread(s) per core:  2
>> Core(s) per socket:  4
>> Socket(s):           2
>> NUMA node(s):        2
>> Vendor ID:           GenuineIntel
>> CPU family:          6
>> Model:               44
>> Model name:          Intel(R) Xeon(R) CPU E5620 @ 2.40GHz
>> Stepping:            2
>> CPU MHz:             2527.029
>> BogoMIPS:            4788.05
>> Virtualization:      VT-x
>> L1d cache:           32K
>> L1i cache:           32K
>> L2 cache:            256K
>> L3 cache:            12288K
>> NUMA node0 CPU(s):   0,2,4,6,8,10,12,14
>> NUMA node1 CPU(s):   1,3,5,7,9,11,13,15
>> Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr
>> pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe
>> syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts
>> rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq
>> dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca
>> sse4_1 sse4_2 popcnt aes lahf_lm pti ssbd ibrs ibpb stibp tpr_shadow
>> vnmi flexpriority ept vpid dtherm ida arat flush_l1d
>>
>> Regards,
>>
>> Calvin Ellison
>> Senior Voice Operations Engineer
>> [email protected]
>>
>> _______________________________________________
>> Users mailing list
>> [email protected]
>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users
>>
