On Sat, Dec 10, 2016 at 8:52 AM, Willy Tarreau wrote:

> On Fri, Dec 09, 2016 at 08:18:45PM +0100, Pavlos Parissis wrote:
> > On 9 December 2016 at 20:07, Apollon Oikonomopoulos wrote:
> (...)
> > >> > I wonder if a `per-process' keyword would make sense here. I find
> > >> >
> > >> >   bind :443 ssl .... per-process
> > >> >
> > >> > more concise than 15 or 20 individual bind lines. This would have
> > >> > the same effect as N bind lines, one for each process in the
> > >> > bind-process list.
> (...)
> > Indeed, that would be nice. I guess it isn't a big issue, as most
> > people use a configuration management tool, which does the expansion.
>
> I find that this is a very good idea. We need to be careful when
> implementing it because it will definitely come with problematic cases,
> but the idea is good. In fact, Manu suggested to me in private that
> using multiple bind lines is not convenient for him because he loads
> tons of certificates and it would require him to load them multiple
> times (takes more time, eats more memory). Something like the above,
> if properly designed, would solve that as well.
>
> I think we have to give a bit of thought to a reusable implementation,
> because we also need to implement something comparable for the DNS
> resolvers so that there's a per-process socket. In the end I suspect
> we'll end up having a list of FDs instead of a single FD for each
> process.
>
> Also recently I noted that the cpu-map statement is tedious when you deal
> with many processes, and even more when you want to experiment with
> different nbproc values, because often you have to comment out many lines
> and try again with many new ones. Most often we just want to have one
> CPU per process, and they have to follow a regular pattern, e.g. +1 for
> the process means +1 for the CPU. But sometimes due to hyperthreading or
> NUMA you may need to use +2 or +8. Thus I was thinking we could have an
> automatic cpu-set value using something more or less like this:
>
>     cpu-map  1-10 2+1
>     cpu-map 11-20 16+1
>
> This would do the same as this:
>
>     cpu-map 1 2
>     cpu-map 2 3
>     cpu-map 3 4
>     ...
>     cpu-map 10 11
>     cpu-map 11 16
>     ...
>     cpu-map 20 25
>
> We could also have this:
>
>     cpu-map  1-10 2+2
>
>   equivalent to:
>     cpu-map 1 2
>     cpu-map 2 4
>     cpu-map 3 6
>     cpu-map 4 8
>     ...
>
> And maybe we can add a "/X" statement to apply a modulo after the increment
> and limit the number of CPUs used in the loop:
>
>     cpu-map  1-7 2+8/14
>
>   equivalent to:
>     cpu-map 1 2
>     cpu-map 2 10
>     cpu-map 3 4
>     cpu-map 4 12
>     cpu-map 5 6
>     cpu-map 6 14
>     cpu-map 7 8
>
> This can be useful to automatically enable or avoid the use of some NUMA
> nodes depending on the nbproc value.
>
> Maybe others have other ideas; they're welcome.
>
> Cheers,
> Willy


Hi,

How about the nginx style: nbproc auto + cpu-map auto?
+1 on a per-process bind line (or auto).
("auto" would mean a good-enough default setup.)
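
To illustrate what such a keyword would save, here is a minimal sketch of
today's syntax versus the proposed one (the per-process keyword is
hypothetical, and the cert path is made up):

    # Today: one bind line per process, so the certificates are
    # loaded (and stored in memory) once per line.
    frontend fe_ssl
        bind-process 1-4
        bind :443 ssl crt /etc/haproxy/certs/ process 1
        bind :443 ssl crt /etc/haproxy/certs/ process 2
        bind :443 ssl crt /etc/haproxy/certs/ process 3
        bind :443 ssl crt /etc/haproxy/certs/ process 4

    # Proposed: one line expanding to the above, with the certificates
    # loaded only once.
    frontend fe_ssl
        bind-process 1-4
        bind :443 ssl crt /etc/haproxy/certs/ per-process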



As for my multi-proc SSL setup, in case anyone was wondering:
I set up an ssl-offload listener that runs on all cores except core 0 of
each CPU and its HT sibling, relaying via unix sockets to a frontend that
runs on core 0 of each CPU and its HT sibling (0, 1, 28, 29 in my case).
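
In config form that layout looks roughly like this (a sketch only: 56
logical CPUs assumed, the socket path and names are made up, and most of
the cpu-map lines are omitted):

    global
        nbproc 56
        # frontend processes pinned to core 0 of each CPU + HT siblings
        cpu-map 1 0
        cpu-map 2 1
        cpu-map 3 28
        cpu-map 4 29
        # processes 5-56 pinned to the remaining cores for ssl offload,
        # one cpu-map line per process (omitted)

    listen ssl_offload
        bind-process 5-56
        bind :443 ssl crt /etc/haproxy/certs/
        server front unix@/var/run/haproxy-front.sock send-proxy-v2

    frontend fe_main
        bind-process 1-4
        bind unix@/var/run/haproxy-front.sock accept-proxy
        # ... normal rules and backends here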


I haven't configured any "dedicated" cores for network interrupts...
maybe I should? My idea would be to use cores 0 and 1 of each CPU plus
their HT siblings, which would give two CPUs with 2 cores + 2 HT threads
each. According to Willy's rule of thumb that would be "enough" (80 Gbps
or 40 Gbps, depending on how he counted HT virtual cores in his rule of
thumb). I would then exclude these network IRQ cores from the haproxy
frontend/listen binds, along the lines of the sketch below.
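
On the haproxy side the exclusion could look something like this (a
sketch only, with the same 56-logical-CPU assumption as above; the IRQ
affinity itself would be set outside haproxy):

    global
        nbproc 48
        # cores 0-3 and their HT siblings 28-31 are left free for IRQs;
        # processes map onto the remaining 48 logical CPUs
        cpu-map 1 4
        cpu-map 2 5
        # ... +1 per process up to CPU 27 ...
        cpu-map 24 27
        cpu-map 25 32
        # ... then continuing on the second range ...
        cpu-map 48 55
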
Does that sound like a sane setup?

/Elias
