----- Original Message -----
> From: "Benoit GEORGELIN, web4all" <benoit.george...@web4all.fr>
> To: "lxc-users" <lxc-users@lists.linuxcontainers.org>
> Sent: Tuesday, March 28, 2017 11:20:48
> Subject: Re: [lxc-users] Experience with large number of LXC/LXD containers
> ----- Original Message -----
> > From: "David Favor" <da...@davidfavor.com>
> > To: "lxc-users" <lxc-users@lists.linuxcontainers.org>
> > Sent: Monday, March 27, 2017 12:55:09
> > Subject: Re: [lxc-users] Experience with large number of LXC/LXD containers
> >
> > Serge E. Hallyn wrote:
> >> On Tue, Mar 14, 2017 at 02:29:01AM +0100, Benoit GEORGELIN – Association
> >> Web4all wrote:
> >>> ----- Original Message -----
> >>>> From: "Simos Xenitellis" <simos.li...@googlemail.com>
> >>>> To: "lxc-users" <lxc-users@lists.linuxcontainers.org>
> >>>> Sent: Monday, March 13, 2017 20:22:03
> >>>> Subject: Re: [lxc-users] Experience with large number of LXC/LXD
> >>>> containers
> >>>>
> >>>> On Sun, Mar 12, 2017 at 11:28 PM, Benoit GEORGELIN – Association Web4all
> >>>> <benoit.george...@web4all.fr> wrote:
> >>>>> Hi lxc-users,
> >>>>>
> >>>>> I would like to know if you have any experience with a large number
> >>>>> of LXC/LXD containers, in terms of performance, stability and
> >>>>> limitations.
> >>>>>
> >>>>> I'm wondering, for example, whether 100 containers behave the same
> >>>>> as 1,000 or 10,000 with the same configuration, leaving container
> >>>>> usage out of the picture.
> >>>>>
> >>>>> I have been looking around for a couple of days to find any
> >>>>> user/admin feedback, but I'm not able to find large deployments.
> >>>>>
> >>>>> Is there any resource limit or maximum number that can be deployed
> >>>>> on the same node? Besides the physical performance of the node, is
> >>>>> there any specific behavior that a large number of LXC/LXD
> >>>>> containers can experience? I'm not aware of any test or limits that
> >>>>> can occur besides the number of processes. But I'm sure that on the
> >>>>> LXC/LXD side there might be some technical constraints?
> >>>>> Maybe on namespace availability, or any other technical layer used
> >>>>> by LXC/LXD.
> >>>>>
> >>>>> I would be interested to hear about your experience, or any
> >>>>> links/books/stories about such large deployments.
> >>>>
> >>>> This would be interesting to hear if someone can talk publicly about
> >>>> their large deployment. In any case, it should be possible to create,
> >>>> for example, 1000 web servers and then try to access each one and
> >>>> check any issues regarding the response time. Another test would be
> >>>> to install 1000 WordPress installations and check again for the
> >>>> response time and resource usage. Such scripts to create this massive
> >>>> number of containers would also be helpful to replicate any issues in
> >>>> order to solve them.
> >>>>
> >>>> Simos
> >
> > Been reading this + here's a bit of info.
> >
> > I've been running LXC since early deployment + now LXD.
> >
> > There are a few big performance killers related to WordPress. If you
> > keep these issues in mind, you'll be good.
> >
> > 1) I run 100s of sites across many containers on many machines.
> >
> > My business is private, high speed hosting, so I eat from my efforts.
> > No theory here.
> >
> > I target WordPress site speed at 3000+ reqs/second, measured locally
> > using ab (ApacheBench). This is a crude tool + sufficient, as I issue
> > 1,000,000 simultaneous 5 thread connections against a server for 30
> > seconds.
> >
> >     ab -k -t 30 -n 10000000 -c 5 $URL
> >
> > This will crash most machines, unless they're tuned well.
> >
> > 2) Memory + CPU. The big killer of performance anywhere is swap thrash.
> > If top shows swapping for more than a few seconds, likely your system
> > is heading toward a crash.
> >
> > Fix: I tend to deploy OVH machines with 128G of memory, as this is
> > enough memory to handle huge spikes of memory usage across many sites,
> > during traffic spikes... then recover...
> > For example, running 100s of sites across many LXD containers, I've had
> > machines sustain 250,000+ reqs/hour every day for months.
> >
> > At these traffic levels, <1 core used sustained + 50%ish memory use.
> >
> > Sites still show 3000+ reqs/sec using the ab test above.
> >
> > 3) Database: I run MariaDB rather than MySQL as it's smokin' fast.
> >
> > I also relocate /tmp to tmpfs, so temp file i/o runs at memory speed,
> > rather than disk speed.
> >
> > This ensures all MariaDB temp select set files (for complex selects)
> > generate + access at memory speed.
> >
> > Also PHP session /tmp files run at memory speed.
> >
> > This is important to me, as many of my clients run large membership
> > sites. Many are >40K members. These sites' performance would circle
> > the drain if /tmp was on disk.
> >
> > 4) Disk Thrash: Becomes the killer as traffic increases.
> >
> > 5) Apache Logging: For several clients I'm currently retuning my Apache
> > logging to skip logging of successful serves of images, css, js, fonts.
> > I'll still log non-200s, as these need to be debugged.
> >
> > This can make a huge difference if memory pressure/use forces disk
> > writes to actually go to disk, rather than kernel filesystem i/o
> > buffers.
> >
> > Once memory pressure forces physical disk writes, disk i/o starves
> > Apache from quickly serving uncached content. Very ugly.
> >
> > Right now I'm doing extensive filesystem testing, to reduce disk thrash
> > during traffic spikes + related memory pressure.
> >
> > 6) Net Connection: If you're running 1000s of containers, best also
> > check adapter saturation. I use 10Gig adapters + even at extreme
> > traffic levels, they barely reach 10% saturation.
> >
> > This means 10Gig adapters are a must for me, as 10% is 1Gig, so using
> > 1Gig adapters, site speed would begin to throttle, based on adapter
> > saturation, which would be a bear to debug.
> > 7) Apache: I've taken to setting up Apache to kill off processes after
> > anywhere from 10K to 100K requests served. This ensures the kernel can
> > garbage collect (resource reclamation), which also helps escape
> > swapping.
> >
> > If you have 100,000s+ Apache processes running, with no kill off, then
> > eventually they can potentially eat up a massive amount of memory,
> > which takes a long time to reclaim, depending on other MPM config
> > settings.
> >
> > So… General rule of thumb. Tune your entire WAMPL stack to run from
> > memory:
> >
> > WAMPL - WordPress running on Apache + PHP + MariaDB + Linux
> >
> > If your sites run at memory speed, it makes no real difference how many
> > containers you run. Possibly context switching might come into play if
> > many of the sites running were high traffic sites.
> >
> > If problems occur, just look at your Apache logs across all containers.
> > Move the site with the highest traffic to another physical machine.
> >
> > Or, if top shows swapping, add more memory.
>
> Hi David,
>
> Interesting feedback, it's good to know about the details you gave
> (memory/swap).
>
> Happy hosting ;)

By any chance, were you in Montreal today for the event about security and
large LXD deployments? I missed @stgraber's tweet about it
( https://twitter.com/stgraber/status/849106252764520453 ).

It would be nice if someone could share what the large LXD deployment was
about :)

Thanks!

_______________________________________________
lxc-users mailing list
lxc-users@lists.linuxcontainers.org
http://lists.linuxcontainers.org/listinfo/lxc-users
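
Follow-up note on point 2 (swap thrash): watching top by hand gets tedious across many hosts, so here is a minimal Python sketch of the same check. It is not from the thread itself. It assumes a Linux host, where /proc/vmstat exposes the cumulative pswpin and pswpout page counters. The function names are hypothetical, and the 5 second window is only one reading of "more than a few seconds".

```python
import time
from pathlib import Path

# Standard Linux location; everything else in this sketch is illustrative.
VMSTAT = Path("/proc/vmstat")


def parse_swap_counters(vmstat_text):
    """Return (pages swapped in, pages swapped out) from /proc/vmstat text."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    # Missing counters are treated as zero (e.g. empty input).
    return counters.get("pswpin", 0), counters.get("pswpout", 0)


def swap_delta(before, after):
    """Pages swapped in and out between two (pswpin, pswpout) snapshots."""
    return after[0] - before[0], after[1] - before[1]


def swapping_for(seconds=5, interval=1.0):
    """True if swap traffic was seen in every interval of the window.

    Assumed threshold: any swap-in or swap-out activity in each
    one-second interval counts as "swapping".
    """
    previous = parse_swap_counters(VMSTAT.read_text())
    for _ in range(max(1, round(seconds / interval))):
        time.sleep(interval)
        current = parse_swap_counters(VMSTAT.read_text())
        if swap_delta(previous, current) == (0, 0):
            return False  # a quiet interval: not sustained swapping
        previous = current
    return True
```

Calling swapping_for(5) on the host (not inside a container, where /proc may be virtualised) returns True only if pages moved to or from swap during each of five consecutive seconds. A short burst of swap activity returns False, which seems closer to the "more than a few seconds" rule of thumb than alerting on any swap use at all.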