On Fri, May 18, 2018 at 9:50 PM, Simo Sorce <s...@redhat.com> wrote:
> On Fri, 2018-05-18 at 16:11 +0200, Sumit Bose wrote:
>> On Fri, May 18, 2018 at 02:33:32PM +0200, Pavel Březina wrote:
>> > Hi folks,
>> > I sent a mail about the new sbus implementation (I'll refer to it as sbus2) [1].
>
> Sorry Pavel,
> but I need to ask, why a new bus instead of something like varlink?
For those who are not familiar with varlink: https://lwn.net/Articles/742675/

>
>> > Now, I'm integrating it into SSSD. The work is quite difficult since it touches all parts of SSSD and the changes are usually interconnected, but I'm slowly moving towards the goal [2].
>> >
>> > At this moment, I'm trying to take the "minimum changes" path so the code can be built and function with sbus2; however, taking full advantage of it will require further improvements (which will not be very difficult).
>> >
>> > There is one big change that I would like to make, though, that needs to be discussed. It is about how we currently handle sbus connections.
>> >
>> > In the current state, the monitor and each backend create a private sbus server. The current implementation of a private sbus server is not a message bus; it only serves as an address for creating point-to-point nameless connections. Thus each client must maintain several connections:
>> > - each responder is connected to the monitor and to all backends
>> > - each backend is connected to the monitor
>> > - we have (monitor + number of backends) private servers
>> > - each private server maintains about 10 active connections
>> >
>> > This has several disadvantages: there are many connections, we cannot broadcast signals, and if a process wants to talk to another process it needs to connect to that process's server and maintain the connection. Since responders do not currently provide a server, they cannot talk to each other.
>
> This design has a key advantage: a single process going down does not affect all the other processes' communication. How do you recover if the "switch-board" goes down during message processing with sbus?
>
>> > sbus2 implements a proper private message bus, so it can work in the same way as the session or system bus. It is a server that maintains the connections, keeps track of their names and routes messages from one connection to another.
>> >
>> > My idea is to have only one sbus server, managed by the monitor.
>
> This conflicts with the idea of getting rid of the monitor process. I do not know if this is still being pursued, but it was brought up over and over that we might want to use systemd as the "monitor" and let socket activation deal with the rest.
>
>> > Other processes will connect to this server with a named connection (e.g. sssd.nss, sssd.backend.dom1, sssd.backend.dom2). We can then send a message to this message bus (only one connection) and set the destination to a name (e.g. sssd.nss to invalidate memcache). We can also send signals to this bus and it will broadcast them to all connections that listen to these signals. So, it is the proper way to do it. It will simplify things and allow us to send signals and have better IPC in general.
>> >
>> > I know we want to eventually get rid of the monitor; the process would stay as an sbus server. It would become a single point of failure, but the process can be restarted automatically by systemd in case of a crash.
>> >
>> > Also, here is a bonus question - do any of you remember why we use a private server at all?
>>
>> In the very original design there was a "switch-board" process which received a request from one component and forwarded it to the right target. I guess at that time we didn't know enough about DBus to implement this properly. In the end we thought it was a useless overhead and removed it. I think we didn't think about signals to all components or the backend sending requests to the frontends.
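
To make the single-bus idea above a bit more concrete: below is a rough sketch of what a named connection on such a private message bus could look like with plain libdbus. This is only an illustration, not sbus2's actual API; the socket path, object path, interface and method names are made up, and sssd.nss / sssd.backend.dom1 are just the example names Pavel used.

    /* Sketch only: connect to a hypothetical private bus, claim a name,
     * send a targeted message and broadcast a signal. */
    #include <stdio.h>
    #include <dbus/dbus.h>

    int main(void)
    {
        DBusError error;
        DBusConnection *conn;
        DBusMessage *msg;

        dbus_error_init(&error);

        /* Connect to the private bus run by the single server process.
         * The socket path is a placeholder. */
        conn = dbus_connection_open_private(
                   "unix:path=/var/lib/sss/pipes/private/sbus", &error);
        if (conn == NULL) {
            fprintf(stderr, "connect failed: %s\n", error.message);
            return 1;
        }

        /* Register with the bus and claim a well-known name so that other
         * components can address this process by name. */
        dbus_bus_register(conn, &error);
        dbus_bus_request_name(conn, "sssd.nss", 0, &error);

        /* Point-to-point: address a method call to a named peer instead of
         * keeping a dedicated connection to it. */
        msg = dbus_message_new_method_call("sssd.backend.dom1",
                                           "/sssd/example",
                                           "sssd.example.DataProvider",
                                           "GetAccountInfo");
        dbus_message_set_no_reply(msg, TRUE);
        dbus_connection_send(conn, msg, NULL);
        dbus_message_unref(msg);

        /* Broadcast: a signal with no destination is routed by the bus to
         * every connection that subscribed to it with a match rule. */
        msg = dbus_message_new_signal("/sssd/example",
                                      "sssd.example.Responder",
                                      "InvalidateMemcache");
        dbus_connection_send(conn, msg, NULL);
        dbus_message_unref(msg);

        dbus_connection_flush(conn);
        dbus_connection_close(conn);
        dbus_connection_unref(conn);
        return 0;
    }

The last part is the interesting bit: with a real bus in the middle, a single send with no destination reaches every connection that subscribed to that signal, which is what would make things like memcache invalidation cheap compared to N point-to-point messages.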
>>
>> > Why don't we connect to the system message bus?
>>
>> Mainly because we do not trust it to handle plain text passwords and other credentials with the needed care.
>
> That, and because at some point there was a potential chicken-and-egg issue at startup, and also because we didn't want to handle additional error recovery if the system message bus was restarted.
>
> Fundamentally, the system message bus is useful only for services offering a "public" service; otherwise it is just overhead, and it has security implications.
>
>> > I do not see any benefit in having a private server.
>
> There is no way to break into sssd via a bug in the system message bus. This is one good reason, aside from the others above.
>
> Fundamentally, we needed a private structured messaging system that we could easily integrate with tevent. The only usable option back then was dbus, and given that we already had ideas about offering some plugin interface over the message bus, we went that way so we could later reuse the integration.
>
> Today we'd probably go with something a lot more lightweight like varlink.
>
>> If I understood you correctly, we not only have 'a' private server but 4 for a typical minimal setup (monitor, pam, nss, backend).
>>
>> Given your arguments above, I think using a private message bus would have benefits. Two questions come to my mind. First, what happens to ongoing requests if the monitor dies and is restarted? E.g. if the backend is processing a user lookup request and the monitor is restarted, can the backend just send the reply to the freshly started instance and the nss responder will still get it? Or is some state lost which would force the nss responder to resend the request?
>
> How would the responder even know the other side died? Is there a way for clients to know that services died and that all requests in flight need to be resent?
>
>> The second is about the overhead. Do you have any numbers on how much longer e.g. the nss responder has to wait for e.g. a backend "offline" reply? I would expect that we lose more time at other places; nevertheless, it would be good to have some basic understanding of the overhead.
>
> Latency is what we should be worried about. One other reason to go with direct connections is that you do not have to wait for 3 processes to be awake and scheduled (client/monitor/server) but only 2 (client/server). On busy machines the latency can be (relatively) quite high if an additional process needs to be scheduled just to pass along a message.
>
> Simo.
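
As an aside on the tevent integration Simo mentions above: whichever messaging library is used, the glue needed is roughly "turn the library's I/O watches into tevent fd events". The sketch below shows the idea for libdbus; it is illustrative only, not SSSD's actual sbus code, and it leaves out watch toggling, timeouts, dispatching and error handling.

    #include <stdint.h>
    #include <talloc.h>
    #include <tevent.h>
    #include <dbus/dbus.h>

    struct watch_state {
        DBusWatch *watch;
        struct tevent_fd *fde;
    };

    /* tevent says the D-Bus socket is ready; let libdbus read/write it.
     * (Real code would also run dbus_connection_dispatch() afterwards.) */
    static void watch_handler(struct tevent_context *ev, struct tevent_fd *fde,
                              uint16_t flags, void *data)
    {
        struct watch_state *ws = talloc_get_type(data, struct watch_state);
        unsigned int dbus_flags = 0;

        if (flags & TEVENT_FD_READ)  dbus_flags |= DBUS_WATCH_READABLE;
        if (flags & TEVENT_FD_WRITE) dbus_flags |= DBUS_WATCH_WRITABLE;

        dbus_watch_handle(ws->watch, dbus_flags);
    }

    /* libdbus asks us to monitor a file descriptor; register it with tevent. */
    static dbus_bool_t add_watch(DBusWatch *watch, void *data)
    {
        struct tevent_context *ev = data;
        struct watch_state *ws = talloc_zero(ev, struct watch_state);
        unsigned int dbus_flags = dbus_watch_get_flags(watch);
        uint16_t flags = 0;

        if (dbus_flags & DBUS_WATCH_READABLE) flags |= TEVENT_FD_READ;
        if (dbus_flags & DBUS_WATCH_WRITABLE) flags |= TEVENT_FD_WRITE;

        ws->watch = watch;
        ws->fde = tevent_add_fd(ev, ws, dbus_watch_get_unix_fd(watch),
                                dbus_watch_get_enabled(watch) ? flags : 0,
                                watch_handler, ws);

        dbus_watch_set_data(watch, ws, NULL);
        return TRUE;
    }

    /* libdbus no longer needs the fd; drop the tevent registration. */
    static void remove_watch(DBusWatch *watch, void *data)
    {
        talloc_free(dbus_watch_get_data(watch));
    }

    /* Hooked up once per connection with:
     *   dbus_connection_set_watch_functions(conn, add_watch, remove_watch,
     *                                       NULL, ev, NULL);
     */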
>> Thank you for your hard work on sbus2.
>>
>> bye,
>> Sumit
>> >
>> > [1] https://github.com/pbrezina/sbus
>> > [2] https://github.com/pbrezina/sssd/tree/sbus
>
> --
> Simo Sorce
> Sr. Principal Software Engineer
> Red Hat, Inc
_______________________________________________
sssd-devel mailing list -- sssd-devel@lists.fedorahosted.org
To unsubscribe send an email to sssd-devel-le...@lists.fedorahosted.org
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/sssd-devel@lists.fedorahosted.org/message/QT4Z2QXJEYNYIQARGDVDCODFTU27K3V4/