On Sat, Dec 10, 2016 at 8:52 AM, Willy Tarreau wrote:
> On Fri, Dec 09, 2016 at 08:18:45PM +0100, Pavlos Parissis wrote:
> > On 9 December 2016 at 20:07, Apollon Oikonomopoulos wrote:
> (...)
> > >> > I wonder if a `per-process' keyword would make sense here. I find
> > >> >
> > >> >     bind :443 ssl .... per-process
> > >> >
> > >> > more concise than 15 or 20 individual bind lines. This would have the
> > >> > same effect as N bind lines, one for each process in the bind-process
> > >> > list.
> (...)
> > Indeed, that would be nice. I guess it isn't a big issue as most people
> > use a configuration management tool, which does the expansion.
>
> I find that this is a very good idea. We need to be careful when implementing
> it because it will definitely come with problematic cases, but the idea is
> good. In fact, Manu suggested to me in private that using multiple bind
> lines is not convenient for him because he loads tons of certificates and
> it would require him to load them multiple times (takes more time, eats more
> memory). Something like the above, if properly designed, would solve that as
> well.
>
> I think we have to think a bit about a reusable implementation because we
> also need to implement something comparable for the DNS resolvers so that
> there's a per-process socket. In the end I suspect that we'll end up having
> a list of FDs instead of a single FD for each process.
>
> Also recently I noted that the cpu-map statement is boring when you deal
> with many processes, and even more when you want to experiment with
> different nbproc values, because often you have to comment out many lines
> and try again with many new ones. Most often we just want to have one
> CPU for one process and they have to follow a regular pattern, e.g. +1 for
> the process means +1 for the CPU. But sometimes due to hyperthreading or
> NUMA you may need to use +2 or +8. Thus I was thinking we could have an
> automatic cpu-set value by using something more or less like this :
>
>     cpu-map 1-10 2+1
>     cpu-map 11-20 16+1
>
> This would do the same as this :
>
>     cpu-map 1 2
>     cpu-map 2 3
>     cpu-map 3 4
>     ...
>     cpu-map 10 11
>     cpu-map 11 16
>     ...
>     cpu-map 20 25
>
> We could also have this :
>
>     cpu-map 1-10 2+2
>
> equivalent to :
>
>     cpu-map 1 2
>     cpu-map 2 4
>     cpu-map 3 6
>     cpu-map 4 8
>     ...
>
> And maybe we can add a "/X" statement to apply a modulo after the increment
> and limit the number of CPUs used in the loop :
>
>     cpu-map 1-7 2+8/14
>
> equivalent to :
>
>     cpu-map 1 2
>     cpu-map 2 10
>     cpu-map 3 4
>     cpu-map 4 12
>     cpu-map 5 6
>     cpu-map 6 14
>     cpu-map 7 8
>
> This can be useful to automatically enable use of some NUMA nodes or not
> depending on the nbproc value.
>
> Maybe others have other ideas, they're welcome.
>
> Cheers,
> Willy
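[For readers skimming the thread, a rough sketch of what the quoted proposal would save. The frontend name, certificate path and process count below are made up, and the per-process keyword is only the suggestion being discussed, not existing syntax; the "process" bind parameter and "bind-process" in the first example are the current way to do this.]

    # Today: one bind line per process (nbproc 4 assumed), each one
    # re-loading the certificates from the crt directory
    frontend fe_ssl
        bind-process 1-4
        bind :443 ssl crt /etc/haproxy/certs/ process 1
        bind :443 ssl crt /etc/haproxy/certs/ process 2
        bind :443 ssl crt /etc/haproxy/certs/ process 3
        bind :443 ssl crt /etc/haproxy/certs/ process 4

    # With the proposed keyword: one line, expanded to one socket per
    # process in the bind-process list (hypothetical syntax)
    frontend fe_ssl
        bind-process 1-4
        bind :443 ssl crt /etc/haproxy/certs/ per-process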
Hi,

How about nginx style: nbproc auto + cpu-map auto? +1 on a per-process
bind line (or auto), where auto would mean a good-enough default setup.

As for my multi-process SSL setup, in case anyone was wondering: I run an
SSL-offload listener on all cores except core 0 of each CPU and its HT
sibling, relaying via unix sockets to a frontend that runs on core 0 of
each CPU and its HT sibling, so (0,1,28,29) in my case. A sketch of what
this looks like is below.

I haven't configured any "dedicated" cores for network interrupts. Maybe
I should? My idea would be cores 0 and 1 on each CPU plus their HT
siblings, which would give 2 CPUs with 2 cores + 2 HT threads each.
According to Willy's rule of thumb that would be "enough" (80 Gbps or
40 Gbps, depending on how he counted HT virtual cores in that rule). I
would then exclude these network IRQ cores from the haproxy
frontend/listen binds.

Does that sound like a sane setup?

/Elias
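[A minimal, reduced sketch of the setup described above. Process counts, CPU numbers beyond the quoted (0,1,28,29), socket paths and section names are assumptions, not taken from the mail; defaults and backend sections are omitted for brevity.]

    global
        nbproc 6                         # assumed count, just for the sketch
        # frontend processes pinned to core 0 of each CPU and its HT sibling
        cpu-map 1 0
        cpu-map 2 1
        cpu-map 3 28
        cpu-map 4 29
        # remaining processes handle SSL offload on the other cores
        cpu-map 5 2
        cpu-map 6 3

    listen ssl-offload
        bind-process 5-6
        bind :443 ssl crt /etc/haproxy/certs/ process 5
        bind :443 ssl crt /etc/haproxy/certs/ process 6
        # per-bind-line repetition that the proposed per-process keyword
        # would remove; decrypted traffic is relayed over a unix socket
        server front unix@/var/run/haproxy-front.sock send-proxy

    frontend fe_main
        bind-process 1-4
        bind unix@/var/run/haproxy-front.sock accept-proxy
        default_backend be_app           # backend omitted for brevity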

