----- Original Message -----
> From: "David Favor" <da...@davidfavor.com>
> To: "lxc-users" <lxc-users@lists.linuxcontainers.org>
> Sent: Monday, 27 March 2017 12:55:09
> Subject: Re: [lxc-users] Experience with large number of LXC/LXD containers
> Serge E. Hallyn wrote:
>> On Tue, Mar 14, 2017 at 02:29:01AM +0100, Benoit GEORGELIN – Association
>> Web4all wrote:
>>> ----- Original Message -----
>>>> From: "Simos Xenitellis" <simos.li...@googlemail.com>
>>>> To: "lxc-users" <lxc-users@lists.linuxcontainers.org>
>>>> Sent: Monday, 13 March 2017 20:22:03
>>>> Subject: Re: [lxc-users] Experience with large number of LXC/LXD containers
>>>> On Sun, Mar 12, 2017 at 11:28 PM, Benoit GEORGELIN – Association Web4all
>>>> <benoit.george...@web4all.fr> wrote:
>>>>> Hi lxc-users, I would like to know if you have any experience with a
>>>>> large number of LXC/LXD containers, in terms of performance, stability
>>>>> and limitations. I'm wondering, for example, whether having 100
>>>>> containers behaves the same as having 1,000 or 10,000 with the same
>>>>> configuration, leaving container usage aside. I have been looking around
>>>>> for a couple of days for any user/admin feedback, but I'm not able to
>>>>> find large deployments. Are there any resource limits, or a maximum
>>>>> number that can be deployed on the same node? Besides the physical
>>>>> performance of the node, is there any specific behaviour that a large
>>>>> number of LXC/LXD containers can run into? I'm not aware of any test or
>>>>> limit beyond the number of processes, but I'm sure LXC/LXD has some
>>>>> technical constraints, maybe on namespace availability or any other
>>>>> technical layer used by LXC/LXD. I would be interested to hear about
>>>>> your experience, or any links/books/stories about such large deployments.
>>>> This would be interesting to hear if someone can talk publicly about
>>>> their large deployment. In any case, it should be possible to create,
>>>> for example, 1000 web servers and then try to access each one and check
>>>> any issues regarding the response time. Another test would be to install
>>>> 1000 WordPress installations and check again for the response time and
>>>> resource usage. Such scripts to create this massive number of containers
>>>> would also be helpful to replicate any issues in order to solve them.
>>>> Simos
> Been reading this + here's a bit of info.
> I've been running LXC since early deployment + now LXD.
> There are a few big performance killers related to WordPress. If you keep
> these issues in mind, you'll be good.
>
> 1) I run 100s of sites across many containers on many machines. My business
> is private, high-speed hosting, so I eat from my efforts. No theory here.
> I target WordPress site speed at 3000+ reqs/second, measured locally using
> ab (ApacheBench). This is a crude tool + sufficient, as I issue 1,000,000
> simultaneous 5-thread connections against a server for 30 seconds:
>
>    ab -k -t 30 -n 10000000 -c 5 $URL
>
> This will crash most machines, unless they're tuned well.
>
> 2) Memory + CPU: The big killer of performance anywhere is swap thrash. If
> top shows swapping for more than a few seconds, likely your system is
> heading toward a crash.
> Fix: I tend to deploy OVH machines with 128G of memory, as this is enough
> memory to handle huge spikes of memory usage across many sites during
> traffic spikes... then recover...
> For example, running 100s of sites across many LXD containers, I've had
> machines sustain 250,000+ reqs/hour every day for months.
> At these traffic levels, <1 core used sustained + 50%-ish memory use.
> Sites still show 3000+ reqs/sec using the ab test above.
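As a concrete way to repeat this kind of test across a whole fleet of
containers (along the lines Simos suggests above), a rough sketch could look
like the following; the hostnames and results file are placeholders, and the
ab flags simply mirror the command quoted above:

   #!/bin/sh
   # Run the same ApacheBench sweep against each container-hosted site and
   # keep only the headline numbers. Hostnames below are placeholders.
   for url in http://site1.example.test/ http://site2.example.test/; do
       echo "== $url ==" >> ab-results.txt
       ab -k -t 30 -n 10000000 -c 5 "$url" 2>&1 \
           | grep -E 'Requests per second|Failed requests' >> ab-results.txt
   done

Watching top (or vmstat) on the host while this runs is a quick way to see
whether the swap thrash described above is starting.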
>
> 3) Database: I run MariaDB rather than MySQL as it's smokin' fast.
> I also relocate /tmp to tmpfs, so temp file i/o runs at memory speed rather
> than disk speed. This ensures all MariaDB temp select-set files (for complex
> selects) are generated + accessed at memory speed. Also, PHP session /tmp
> files run at memory speed.
> This is important to me, as many of my clients run large membership sites,
> many with >40K members. These sites' performance would circle the drain if
> /tmp was on disk.
>
> 4) Disk Thrash: Becomes the killer as traffic increases.
>
> 5) Apache Logging: For several clients I'm currently retuning my Apache
> logging to skip logging of successful serves of images, css, js and fonts.
> I'll still log non-200s, as these need to be debugged.
> This can make a huge difference if memory pressure/use forces disk writes to
> actually go to disk, rather than staying in kernel filesystem i/o buffers.
> Once memory pressure forces physical disk writes, disk i/o starves Apache
> from quickly serving uncached content. Very ugly.
> Right now I'm doing extensive filesystem testing to reduce disk thrash
> during traffic spikes + related memory pressure.
>
> 6) Net Connection: If you're running 1000s of containers, best also check
> adapter saturation. I use 10Gig adapters + even at extreme traffic levels,
> they barely reach 10% saturation.
> This means 10Gig adapters are a must for me, as 10% is 1Gig, so using 1Gig
> adapters, site speed would begin to throttle based on adapter saturation,
> which would be a bear to debug.
>
> 7) Apache: I've taken to setting up Apache to kill off processes after
> anywhere from 10K to 100K requests served. This ensures the kernel can
> garbage collect (resource reclamation), which also helps escape swapping.
> If Apache processes keep running past 100,000s+ requests with no kill-off,
> then eventually they can potentially eat up a massive amount of memory,
> which takes a long time to reclaim, depending on other MPM config settings.
>
> So... General rule of thumb: tune your entire WAMPL stack to run out of
> memory (that is, serve everything from memory):
>
>    WAMPL - WordPress running on Apache + PHP + MariaDB + Linux
>
> If your sites run at memory speed, it makes no real difference how many
> containers you run. Possibly context switching might come into play if many
> of the sites running were high-traffic sites.
> If problems occur, just look at your Apache logs across all containers.
> Move the site with the highest traffic to another physical machine. Or, if
> top shows swapping, add more memory.

Hi David,

Interesting feedback, it's good to know about the details you gave
(memory/swap).

Happy hosting ;)
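For reference, a rough sketch of the /tmp-on-tmpfs relocation (point 3), the
static-asset logging exclusion (point 5) and the process recycling (point 7)
described above. The directives are standard Apache 2.4 (mod_setenvif,
mod_log_config, MPM settings), but the paths, tmpfs size and recycle
threshold are assumptions to adapt to your own stack; note that the simple
env= filter below drops matching requests regardless of status code, so
keeping non-200s as described would need a more elaborate conditional.

   # /etc/fstab entry so MariaDB temp files and PHP sessions in /tmp live
   # in RAM rather than on disk (the size is an assumption):
   tmpfs  /tmp  tmpfs  defaults,noatime,size=2g  0  0

   # Apache 2.4 logging snippet (e.g. in conf-available/ on a Debian-style
   # layout): tag static-asset requests with mod_setenvif, then leave them
   # out of the access log.
   SetEnvIf Request_URI "\.(gif|jpe?g|png|svg|css|js|woff2?|ttf)$" static_asset
   CustomLog ${APACHE_LOG_DIR}/access.log combined env=!static_asset

   # MPM setting: recycle each Apache child after serving this many requests
   # so its memory can be reclaimed (the thread mentions 10K to 100K):
   MaxConnectionsPerChild 10000

Whether the logging change actually reduces disk writes under load is easy to
check by re-running the ab sweep above while watching iostat on the host.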