On Fri, May 18, 2018 at 9:50 PM, Simo Sorce <s...@redhat.com> wrote:
> On Fri, 2018-05-18 at 16:11 +0200, Sumit Bose wrote:
>> On Fri, May 18, 2018 at 02:33:32PM +0200, Pavel Březina wrote:
>> > Hi folks,
>> > I sent a mail about the new sbus implementation (I'll refer to it as
>> > sbus2) [1].
>
> Sorry Pavel,
> but I need to ask, why a new bus instead of something like varlink?

For those who are not familiar with varlink: https://lwn.net/Articles/742675/

>
>> > Now, I'm integrating it into SSSD. The work is quite difficult since it
>> > touches all parts of SSSD and the changes are usually interconnected,
>> > but I'm slowly moving towards the goal [2].
>> >
>> > At this moment, I'm trying to take the "minimum changes" path so the code
>> > can be built and function with sbus2; however, taking full advantage of it
>> > will require further improvements (that will not be very difficult).
>> >
>> > There is one big change that I would like to make though, and it needs to
>> > be discussed. It is about how we currently handle sbus connections.
>> >
>> > In the current state, the monitor and each backend create a private sbus
>> > server. The current implementation of a private sbus server is not a
>> > message bus; it only serves as an address for creating point-to-point
>> > nameless connections. Thus each client must maintain several connections:
>> >  - each responder is connected to the monitor and to all backends
>> >  - each backend is connected to the monitor
>> >  - there is one private server for the monitor plus one per backend
>> >  - each private server maintains about 10 active connections
>> >
>> > This has several disadvantages: there are many connections, we cannot
>> > broadcast signals, and if a process wants to talk to another process it
>> > needs to connect to that process's server and maintain the connection.
>> > Since responders do not currently provide a server, they cannot talk to
>> > each other.
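For readers less familiar with the current code, the point-to-point model
Pavel describes amounts to each process dialing every private server it
needs directly. A minimal sketch with plain libdbus; the socket path is
illustrative, not necessarily the exact one SSSD uses:

    /* Each responder opens a private, nameless connection straight to a
     * backend's (or the monitor's) private server. There is no bus daemon
     * behind the address, so there is no dbus_bus_register() and no
     * well-known names. The path below is made up for illustration. */
    #include <dbus/dbus.h>
    #include <stdio.h>

    #define PRIVATE_ADDRESS "unix:path=/var/lib/sss/pipes/private/sbus-example"

    int main(void)
    {
        DBusError error;
        DBusConnection *conn;

        dbus_error_init(&error);

        conn = dbus_connection_open_private(PRIVATE_ADDRESS, &error);
        if (conn == NULL) {
            fprintf(stderr, "unable to connect: %s\n", error.message);
            dbus_error_free(&error);
            return 1;
        }

        /* ... hook the connection into the tevent loop and send method
         * calls; repeat this for every other private server (monitor +
         * each backend) this process needs to talk to ... */

        dbus_connection_close(conn);
        dbus_connection_unref(conn);
        return 0;
    }

Every responder repeats this for the monitor and for each backend, which is
where the large number of connections comes from.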
>
> This design has a key advantage: a single process going down does not
> affect the communication between all the other processes. How do you
> recover if the "switch-board" goes down during message processing with
> sbus?
>
>> > sbus2 implements a proper private message bus, so it can work in the same
>> > way as the session or system bus. It is a server that maintains the
>> > connections, keeps track of their names and routes messages from one
>> > connection to another.
>> >
>> > My idea is to have only one sbus server, managed by the monitor.
>
> This conflicts with the idea of getting rid of the monitor process. I do
> not know if this is still being pursued, but it was brought up over and
> over many times that we might want to use systemd as the "monitor" and
> let socket activation deal with the rest.
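To make the socket-activation alternative concrete: with a .socket unit
owning the listening socket, a service would just pick up the passed file
descriptor at startup instead of being spawned and watched by the monitor.
A rough sketch assuming libsystemd's sd_listen_fds(3); the unit files are
not shown and the whole setup is hypothetical:

    /* Sketch of a socket-activated SSSD service picking up its listening
     * socket from systemd instead of from a monitor process (hypothetical
     * setup; the matching .socket unit is not shown). */
    #include <systemd/sd-daemon.h>
    #include <stdio.h>

    int main(void)
    {
        int n = sd_listen_fds(0);       /* fds start at SD_LISTEN_FDS_START */

        if (n < 1) {
            fprintf(stderr, "no socket passed in; not socket activated?\n");
            return 1;
        }

        int fd = SD_LISTEN_FDS_START;   /* first (and only expected) fd */

        /* ... accept() connections on fd and feed them into the sbus/tevent
         * loop; systemd's Restart= handling takes over the monitor's
         * "keep it running" job ... */
        (void)fd;
        return 0;
    }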
>
>> > Other processes will connect to this server with a named connection
>> > (e.g. sssd.nss, sssd.backend.dom1, sssd.backend.dom2). We can then send a
>> > message to this bus (over a single connection) and set the destination to
>> > a name (e.g. sssd.nss to invalidate the memory cache). We can also send
>> > signals to this bus and it will broadcast them to all connections that
>> > listen to these signals. So it is the proper way to do it. It will
>> > simplify things, allow us to send signals, and give us better IPC in
>> > general.
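As a rough sketch of what the per-process side could look like against a
single private bus, using plain libdbus calls; the names, object path and
interface below are made up for illustration, not an actual SSSD API:

    /* Assumes 'bus' is a connection to the single private message bus that
     * has already completed the usual bus registration. */
    #include <dbus/dbus.h>

    /* Responder side: claim a well-known name so other processes can
     * address this one as "sssd.nss" without a direct socket to it. */
    static int claim_name(DBusConnection *bus)
    {
        DBusError error;

        dbus_error_init(&error);
        if (dbus_bus_request_name(bus, "sssd.nss",
                                  DBUS_NAME_FLAG_DO_NOT_QUEUE, &error) < 0) {
            dbus_error_free(&error);
            return -1;
        }
        return 0;
    }

    /* Any other process: target the responder by name only; the bus routes
     * the message, there is no direct connection between the two. */
    static void invalidate_memcache(DBusConnection *bus)
    {
        DBusMessage *call = dbus_message_new_method_call(
            "sssd.nss", "/sssd/memcache", "sssd.memcache", "Invalidate");

        if (call != NULL) {
            dbus_connection_send(bus, call, NULL);
            dbus_message_unref(call);
        }
    }

    /* Signals carry no destination at all: the bus broadcasts them to every
     * connection that installed a matching rule. */
    static void announce_domains_changed(DBusConnection *bus)
    {
        DBusMessage *sig = dbus_message_new_signal(
            "/sssd/monitor", "sssd.monitor", "DomainsChanged");

        if (sig != NULL) {
            dbus_connection_send(bus, sig, NULL);
            dbus_message_unref(sig);
        }
    }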
>> >
>> > I know we want to eventually get rid of the monitor; the process would
>> > stay only as an sbus server. It would become a single point of failure,
>> > but it can be restarted automatically by systemd in case of a crash.
>> >
>> > Also, here is a bonus question: do any of you remember why we use a
>> > private server at all?
>>
>> In the very original design there was a "switch-board" process which
>> received a request from one component and forwarded it to the right
>> target. I guess at that time we didn't know enough about DBus to
>> implement this properly. In the end we thought it was useless overhead
>> and removed it. I think we didn't think about signals to all components
>> or about the backend sending requests to the frontends.
>>
>> > Why don't we connect to system message bus?
>>
>> Mainly because we do not trust it to handle plain text passwords and
>> other credentials with the needed care.
>
> That, and because at some point there was a potential chicken-and-egg
> issue at startup, and also because we didn't want to handle additional
> error recovery if the system message bus was restarted.
>
> Fundamentally the system message bus is useful only for services
> offering a "public" service; otherwise it is just overhead, and it has
> security implications.
>
>> > I do not see any benefit in having a private server.
>
> There is no way to break into sssd via a bug in the system message bus.
> This is one good reason, aside from the others above.
>
> Fundamentally we needed a private structured messaging system we could
> easily integrate with tevent. The only usable option back then was dbus,
> and given that we already had ideas about offering some plugin interface
> over the message bus, we went that way so we could later reuse the
> integration.
>
> Today we'd probably go with something a lot more lightweight like
> varlink.
>
>> If I understood you correctly we not only have 'a' private server but
>> four of them in a typical minimal setup (monitor, pam, nss, backend).
>>
>> Given your arguments above I think using a private message bus would
>> have benefits. Currently two questions come to my mind. First, what
>> happens to ongoing requests if the monitor dies and is restarted? E.g.
>> if the backend is processing a user lookup request and the monitor is
>> restarted, can the backend just send the reply to the freshly started
>> instance so that the nss responder will finally get it? Or is some state
>> lost which would force the nss responder to resend the request?
>
> How would the responder even know the other side died? Is there a way
> for clients to know that services died and that all requests in flight
> need to be resent?
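On the reference bus daemon this is what the org.freedesktop.DBus.NameOwnerChanged
signal is for: the bus owns the names, so when a connection drops it tells
everyone who asked to be told. Whether sbus2 will emit the same signal is an
assumption on my part, but if it does, detecting a dead peer and failing or
resending in-flight requests could look roughly like this:

    /* Sketch of watching for a peer's name disappearing from the bus.
     * Assumes the private bus emits the standard NameOwnerChanged signal
     * the way dbus-daemon does; "sssd.backend.dom1" is one of the example
     * names from the mail. */
    #include <dbus/dbus.h>
    #include <stdio.h>
    #include <string.h>

    static DBusHandlerResult name_owner_filter(DBusConnection *conn,
                                               DBusMessage *msg, void *data)
    {
        const char *name, *old_owner, *new_owner;

        (void)conn; (void)data;

        if (dbus_message_is_signal(msg, DBUS_INTERFACE_DBUS,
                                   "NameOwnerChanged") &&
            dbus_message_get_args(msg, NULL,
                                  DBUS_TYPE_STRING, &name,
                                  DBUS_TYPE_STRING, &old_owner,
                                  DBUS_TYPE_STRING, &new_owner,
                                  DBUS_TYPE_INVALID)) {
            if (strcmp(name, "sssd.backend.dom1") == 0 &&
                new_owner[0] == '\0') {
                /* The backend is gone: fail or re-queue the requests that
                 * are still waiting for a reply from it. */
                fprintf(stderr, "peer %s disappeared\n", name);
            }
        }

        return DBUS_HANDLER_RESULT_NOT_YET_HANDLED;
    }

    static void watch_backend(DBusConnection *bus)
    {
        dbus_bus_add_match(bus,
            "type='signal',interface='org.freedesktop.DBus',"
            "member='NameOwnerChanged',arg0='sssd.backend.dom1'",
            NULL);
        dbus_connection_add_filter(bus, name_owner_filter, NULL, NULL);
    }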
>
>> The second is about the overhead. Do you have any numbers on how much
>> longer e.g. the nss responder has to wait, e.g. for a backend's offline
>> reply? I would expect that we lose more time at other places;
>> nevertheless it would be good to have some basic understanding of the
>> overhead.
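I do not have numbers either, but one cheap way to get a first-order idea
would be to time the same round trip over the bus and over a direct
connection. A sketch; the destination, object path and method are again
illustrative:

    /* Time one blocking round trip from a responder to a backend over the
     * bus; comparing with the same call over today's direct connection
     * would show the cost of the extra hop. All names are made up. */
    #include <dbus/dbus.h>
    #include <stdio.h>
    #include <time.h>

    static void time_round_trip(DBusConnection *bus)
    {
        struct timespec start, end;
        DBusMessage *call, *reply;
        DBusError error;

        dbus_error_init(&error);

        call = dbus_message_new_method_call("sssd.backend.dom1",
                                            "/sssd/dp", "sssd.dp", "Ping");
        if (call == NULL) {
            return;
        }

        clock_gettime(CLOCK_MONOTONIC, &start);
        reply = dbus_connection_send_with_reply_and_block(bus, call,
                                                          1000 /* ms */,
                                                          &error);
        clock_gettime(CLOCK_MONOTONIC, &end);

        if (reply != NULL) {
            double us = (end.tv_sec - start.tv_sec) * 1e6 +
                        (end.tv_nsec - start.tv_nsec) / 1e3;
            printf("round trip via bus: %.1f us\n", us);
            dbus_message_unref(reply);
        } else {
            dbus_error_free(&error);
        }

        dbus_message_unref(call);
    }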
>
> Latency is what we should be worried about. One other reason to go with
> direct connections is that you do not have to wait for 3 processes to be
> awake and scheduled (client/monitor/server) but only 2 (client/server).
> On busy machines the latency can be (relatively) quite high if an
> additional process needs to be scheduled just to pass along a message.
>
> Simo.
>
>> Thank you for your hard work on sbus2.
>>
>> bye,
>> Sumit
>> >
>> > [1] https://github.com/pbrezina/sbus
>> > [2] https://github.com/pbrezina/sssd/tree/sbus
>>
>
> --
> Simo Sorce
> Sr. Principal Software Engineer
> Red Hat, Inc
