Re: kdbus: add documentation
On 01/20/2015 09:25 AM, Daniel Mack wrote:
> Hi Michael,
>
> On 01/20/2015 09:09 AM, Michael Kerrisk (man-pages) wrote:
>> On 11/30/2014 06:23 PM, Florian Weimer wrote:
>>> * David Herrmann:
>>>
>>>> On Sun, Nov 30, 2014 at 10:02 AM, Florian Weimer wrote:
>>>>> * Greg Kroah-Hartman:
>>>>>
>>>>>> +7.4 Receiving messages
>>>>>
>>>>> What happens if this is not possible because the file descriptor limit
>>>>> of the processes would be exceeded? EMFILE, and the message will not
>>>>> be received?
>>>>
>>>> The message is returned without installing the FDs. This is signaled
>>>> by EMFILE, but a valid pool offset.
>>>
>>> Oh. This is really surprising, so it needs documentation. But it's
>>> probably better than the alternative (return EMFILE and leave the
>>> message stuck, so that you receive it immediately again—this behavior
>>> makes non-blocking accept rather difficult to use correctly).
>>
>> So, was this point in the end explicitly documented? It's not
>> obvious that it is documented in the revised kdbus.txt that
>> Greg K-H sent out 4 days ago.
>
> No, we've revisited this point and changed the kernel behavior again in
> v3. We're no longer returning -EMFILE in this case, but rather set
> KDBUS_RECV_RETURN_INCOMPLETE_FDS in a new field in the receive ioctl
> struct called 'return_flags'. We believe that's a nicer way of signaling
> specific errors. The message will carry -1 for all FDs that failed to
> get installed, so the user can actually see which one is missing.
>
> That's also documented in kdbus.txt, but we missed putting it into the
> Changelog - sorry for that.

Thanks for the info, Daniel.

Cheers,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: kdbus: add documentation
Hi Michael,

On 01/20/2015 09:09 AM, Michael Kerrisk (man-pages) wrote:
> On 11/30/2014 06:23 PM, Florian Weimer wrote:
>> * David Herrmann:
>>
>>> On Sun, Nov 30, 2014 at 10:02 AM, Florian Weimer wrote:
>>>> * Greg Kroah-Hartman:
>>>>
>>>>> +7.4 Receiving messages
>>>>
>>>> What happens if this is not possible because the file descriptor limit
>>>> of the processes would be exceeded? EMFILE, and the message will not
>>>> be received?
>>>
>>> The message is returned without installing the FDs. This is signaled
>>> by EMFILE, but a valid pool offset.
>>
>> Oh. This is really surprising, so it needs documentation. But it's
>> probably better than the alternative (return EMFILE and leave the
>> message stuck, so that you receive it immediately again—this behavior
>> makes non-blocking accept rather difficult to use correctly).
>
> So, was this point in the end explicitly documented? It's not
> obvious that it is documented in the revised kdbus.txt that
> Greg K-H sent out 4 days ago.

No, we've revisited this point and changed the kernel behavior again in
v3. We're no longer returning -EMFILE in this case, but rather set
KDBUS_RECV_RETURN_INCOMPLETE_FDS in a new field in the receive ioctl
struct called 'return_flags'. We believe that's a nicer way of signaling
specific errors. The message will carry -1 for all FDs that failed to
get installed, so the user can actually see which one is missing.

That's also documented in kdbus.txt, but we missed putting it into the
Changelog - sorry for that.

Hope this helps,
Daniel
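[Editor's note] The v3 scheme Daniel describes — a KDBUS_RECV_RETURN_INCOMPLETE_FDS bit in `return_flags`, with -1 placeholders for every descriptor that could not be installed — can be sketched on the receiving side as follows. The flag value and struct layout here are invented for illustration (the authoritative definitions live in the kdbus uapi headers); only the scanning logic reflects the behavior described above:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag value and result layout, for illustration only. */
#define KDBUS_RECV_RETURN_INCOMPLETE_FDS (1ULL << 0)

struct recv_result {
	unsigned long long return_flags; /* filled in by KDBUS_CMD_MSG_RECV */
	int fds[8];                      /* -1 marks an fd that was not installed */
	size_t n_fds;
};

/* Count how many fds failed to be installed; per the description above,
 * each failed slot carries -1 so the receiver can see which is missing. */
static size_t count_missing_fds(const struct recv_result *res)
{
	size_t missing = 0;

	if (!(res->return_flags & KDBUS_RECV_RETURN_INCOMPLETE_FDS))
		return 0;
	for (size_t i = 0; i < res->n_fds; i++)
		if (res->fds[i] == -1)
			missing++;
	return missing;
}
```

The point of the design is that the receiver can still consume the message payload and knows exactly which descriptor slots to treat as absent, rather than getting an all-or-nothing -EMFILE.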
Re: kdbus: add documentation
Daniel, David,

On 11/30/2014 06:23 PM, Florian Weimer wrote:
> * David Herrmann:
>
>> On Sun, Nov 30, 2014 at 10:02 AM, Florian Weimer wrote:
>>> * Greg Kroah-Hartman:
>>>
>>>> +7.4 Receiving messages
>>>
>>> What happens if this is not possible because the file descriptor limit
>>> of the processes would be exceeded? EMFILE, and the message will not
>>> be received?
>>
>> The message is returned without installing the FDs. This is signaled
>> by EMFILE, but a valid pool offset.
>
> Oh. This is really surprising, so it needs documentation. But it's
> probably better than the alternative (return EMFILE and leave the
> message stuck, so that you receive it immediately again—this behavior
> makes non-blocking accept rather difficult to use correctly).

So, was this point in the end explicitly documented? It's not
obvious that it is documented in the revised kdbus.txt that
Greg K-H sent out 4 days ago.

Thanks,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
Re: kdbus: add documentation
* David Herrmann:

> On Sun, Nov 30, 2014 at 10:02 AM, Florian Weimer wrote:
>> * Greg Kroah-Hartman:
>>
>>> +7.4 Receiving messages
>>
>> What happens if this is not possible because the file descriptor limit
>> of the processes would be exceeded? EMFILE, and the message will not
>> be received?
>
> The message is returned without installing the FDs. This is signaled
> by EMFILE, but a valid pool offset.

Oh. This is really surprising, so it needs documentation. But it's
probably better than the alternative (return EMFILE and leave the
message stuck, so that you receive it immediately again—this behavior
makes non-blocking accept rather difficult to use correctly).
Re: kdbus: add documentation
* David Herrmann:

> poll(2) and friends cannot return data for changed descriptors. I
> think a single trap for each KDBUS_CMD_MSG_RECV is acceptable. If this
> turns out to be a bottleneck, we can provide bulk-operations in the
> future. Anyway, I don't see how a _shared_ pool would change any of
> this?

I responded to Andy's messages because it seemed to be about
generalizing the pool functionality.

>> kernel could also queue the data for one specific recipient,
>> addressing the same issue that SO_REUSEPORT tries to solve (on poll
>> notification, the kernel does not know which recipient will eventually
>> retrieve the data, so it has to notify and wake up all of them).
>
> We already queue data only for the addressed recipients. We *do* know
> all recipients of a message at poll-notification time. We only wake up
> recipients that actually got a message queued.

Exactly, but poll on, say, UDP sockets, does not work this way. What
I'm trying to say is that this functionality is interesting for much
more than kdbus.

> Not sure how this is related to SO_REUSEPORT. Can you elaborate on
> your optimizations?

Without something like SO_REUSEPORT, it is a bad idea to have multiple
threads polling the same socket. The semantics are such that the
kernel has to wake *all* the waiting threads, and one of them will
eventually pick up the datagram with a separate system call. But the
kernel does not know which thread this will be.

With SO_REUSEPORT and a separately created socket for each polling
thread, the kernel will only signal one poll operation because it
assumes that any of the waiting threads will process the datagram, so
it's sufficient just to notify one of them.

kdbus behaves like the latter, but also saves the need to separately
obtain the datagram and related data from the kernel.
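[Editor's note] Florian's SO_REUSEPORT point can be demonstrated directly: each polling thread creates and binds its *own* socket with the option set before bind, so the kernel can wake just one waiter instead of all of them. A minimal sketch of the per-thread socket setup (Linux >= 3.9, where SO_REUSEPORT was introduced; all sockets must share the same effective UID):

```c
#define _GNU_SOURCE
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/socket.h>

/* Bind a UDP socket to 127.0.0.1:port with SO_REUSEPORT set.
 * Several such sockets may bind the same port; the kernel then
 * distributes incoming datagrams and wakes only one poller.
 * Returns the fd, or -1 on failure. */
static int bound_reuseport_socket(unsigned short port)
{
	int fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;

	int one = 1;
	if (setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one)) < 0) {
		close(fd);
		return -1;
	}

	struct sockaddr_in addr;
	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_port = htons(port);
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

Without SO_REUSEPORT, a second bind to the same address/port would fail with EADDRINUSE, which is why the shared-socket design forces the thundering-herd wakeup Florian describes.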
Re: kdbus: add documentation
Hi

On Sun, Nov 30, 2014 at 9:56 AM, Florian Weimer wrote:
> * Greg Kroah-Hartman:
>
>> +The focus of this document is an overview of the low-level, native kernel D-Bus
>> +transport called kdbus. Kdbus exposes its functionality via files in a
>> +filesystem called 'kdbusfs'. All communication between processes takes place
>> +via ioctls on files exposed through the mount point of a kdbusfs. The default
>> +mount point of kdbusfs is /sys/fs/kdbus.
>
> Does this mean the bus does not enforce the correctness of the D-Bus
> introspection metadata? That's really unfortunate. Classic D-Bus
> does not do this, either, and combined with the variety of approaches
> used to implement D-Bus endpoints, it makes it really difficult to
> figure out what D-Bus services, exactly, a process provides.

kdbus operates on the transport-level only. We never touch or look at
transferred data. As such, DBus introspection data as defined by
org.freedesktop.DBus.Introspectable is not verified by the transport
layer.

Thanks
David
Re: kdbus: add documentation
Hi

On Sun, Nov 30, 2014 at 10:02 AM, Florian Weimer wrote:
> * Greg Kroah-Hartman:
>
>> +7.4 Receiving messages
>
>> +Also, if the connection allowed for file descriptor to be passed
>> +(KDBUS_HELLO_ACCEPT_FD), and if the message contained any, they will be
>> +installed into the receiving process after the KDBUS_CMD_MSG_RECV ioctl
>> +returns. The receiving task is obliged to close all of them appropriately.
>
> What happens if this is not possible because the file descriptor limit
> of the processes would be exceeded? EMFILE, and the message will not
> be received?

The message is returned without installing the FDs. This is signaled
by EMFILE, but a valid pool offset.

Thanks
David
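[Editor's note] Under the (pre-v3) behavior David describes, a receive loop has to treat EMFILE specially: the ioctl fails, yet the returned pool offset is valid, so the message was delivered and must still be consumed and released — only its file descriptors were not installed. The sentinel value and helper below are illustrative assumptions, not kdbus API; they capture just the decision logic:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical "no offset" sentinel, for illustration only. */
#define INVALID_OFFSET UINT64_MAX

/* Decide whether a KDBUS_CMD_MSG_RECV-style call actually delivered a
 * message, given the syscall return value, errno, and the pool offset.
 * Per the behavior discussed above, EMFILE together with a valid pool
 * offset means: message delivered, but no fds installed. */
static bool message_delivered(int ioctl_ret, int err, uint64_t offset)
{
	if (ioctl_ret == 0)
		return true;                 /* normal delivery */
	if (err == EMFILE && offset != INVALID_OFFSET)
		return true;                 /* delivered, fds not installed */
	return false;                        /* genuine failure */
}
```

Florian's follow-up explains why this shape is the lesser evil: if EMFILE instead left the message queued, a non-blocking receiver would spin on the same stuck message forever.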
Re: kdbus: add documentation
Hi

On Sun, Nov 30, 2014 at 10:08 AM, Florian Weimer wrote:
> * Andy Lutomirski:
>
>> At the risk of opening a can of worms, wouldn't this be much more
>> useful if you could share a pool between multiple connections?
>
> They would also be useful to reduce context switches when receiving
> data from all kinds of descriptors. At present, when polling, you
> receive notification, and then you have to call into the kernel,
> again, to actually fetch the data and associated information.

poll(2) and friends cannot return data for changed descriptors. I
think a single trap for each KDBUS_CMD_MSG_RECV is acceptable. If this
turns out to be a bottleneck, we can provide bulk-operations in the
future. Anyway, I don't see how a _shared_ pool would change any of
this?

> kernel could also queue the data for one specific recipient,
> addressing the same issue that SO_REUSEPORT tries to solve (on poll
> notification, the kernel does not know which recipient will eventually
> retrieve the data, so it has to notify and wake up all of them).

We already queue data only for the addressed recipients. We *do* know
all recipients of a message at poll-notification time. We only wake up
recipients that actually got a message queued.

Not sure how this is related to SO_REUSEPORT. Can you elaborate on
your optimizations?

Thanks
David
Re: kdbus: add documentation
* Andy Lutomirski:

> At the risk of opening a can of worms, wouldn't this be much more
> useful if you could share a pool between multiple connections?

They would also be useful to reduce context switches when receiving
data from all kinds of descriptors. At present, when polling, you
receive notification, and then you have to call into the kernel,
again, to actually fetch the data and associated information. The
kernel could also queue the data for one specific recipient,
addressing the same issue that SO_REUSEPORT tries to solve (on poll
notification, the kernel does not know which recipient will eventually
retrieve the data, so it has to notify and wake up all of them).
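[Editor's note] The two-step cost Florian is pointing at — a readiness wakeup followed by a *separate* system call to actually fetch the payload — is the standard poll(2) pattern, shown here on a plain pipe; kdbus's pool design aims to avoid the second trap by mapping the payload into the receiver's pool directly:

```c
#include <poll.h>
#include <unistd.h>

/* Classic two-step receive path: poll(2) only reports readiness
 * (step 1), so a second system call is needed to fetch the data
 * (step 2). Returns bytes read, or -1 on timeout/error. */
static ssize_t poll_then_read(int fd, char *buf, size_t len)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };

	if (poll(&pfd, 1, 1000) <= 0 || !(pfd.revents & POLLIN))
		return -1;             /* step 1: notification only */
	return read(fd, buf, len);     /* step 2: fetch the payload */
}
```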
Re: kdbus: add documentation
* Greg Kroah-Hartman:

> +7.4 Receiving messages

> +Also, if the connection allowed for file descriptor to be passed
> +(KDBUS_HELLO_ACCEPT_FD), and if the message contained any, they will be
> +installed into the receiving process after the KDBUS_CMD_MSG_RECV ioctl
> +returns. The receiving task is obliged to close all of them appropriately.

What happens if this is not possible because the file descriptor limit
of the processes would be exceeded? EMFILE, and the message will not
be received?
Re: kdbus: add documentation
* Greg Kroah-Hartman:

> +The focus of this document is an overview of the low-level, native kernel D-Bus
> +transport called kdbus. Kdbus exposes its functionality via files in a
> +filesystem called 'kdbusfs'. All communication between processes takes place
> +via ioctls on files exposed through the mount point of a kdbusfs. The default
> +mount point of kdbusfs is /sys/fs/kdbus.

Does this mean the bus does not enforce the correctness of the D-Bus
introspection metadata? That's really unfortunate. Classic D-Bus
does not do this, either, and combined with the variety of approaches
used to implement D-Bus endpoints, it makes it really difficult to
figure out what D-Bus services, exactly, a process provides.
On Wed, Nov 26, 2014 at 7:30 AM, Andy Lutomirski wrote:
> Then find a clean way that's gated on having the right /proc access,
> which is not guaranteed to exist on all of your eventual users'
> systems, and, if that access doesn't exist because the admin or
> sandbox designer has sensibly revoked it, then kdbus shouldn't
> override them.

One idea: add a sysctl that defaults to off that enables these
metadata items, and keep it disabled on production systems. Then you
get your debugging and everyone else gets unsurprising behavior.

--Andy

--
Andy Lutomirski
AMA Capital Management, LLC
Re: kdbus: add documentation
On Wed, Nov 26, 2014 at 3:55 AM, David Herrmann wrote:
> Hi
>
> On Mon, Nov 24, 2014 at 9:57 PM, Andy Lutomirski wrote:
>> On Mon, Nov 24, 2014 at 12:16 PM, David Herrmann wrote:
>>> [snip]
>>>>> +6.5 Getting information about a connection's bus creator
>>>>> +
>>>>> +The KDBUS_CMD_BUS_CREATOR_INFO ioctl takes the same struct as
>>>>> +KDBUS_CMD_CONN_INFO but is used to retrieve information about the
>>>>> +creator of the bus the connection is attached to. The metadata
>>>>> +returned by this call is collected during the creation of the bus
>>>>> +and is never altered afterwards, so it provides pristine information
>>>>> +on the task that created the bus, at the moment when it did so.
>>>>
>>>> What's this for? I understand the need for the creator of busses to
>>>> be authenticated, but doing it like this means that anyone who will
>>>> *fail* authentication can DoS the authentic creator.
>>>
>>> This returns information on a bus owner, to determine whether a
>>> connection is connected to a system, user or session bus. Note that
>>> the bus-creator itself is not a valid peer on the bus, so you cannot
>>> send messages to them. Which kind of DoS do you have in mind?
>>
>> I assume that the logic is something like:
>>
>> connect to bus
>> request bus metadata
>> if (bus metadata matches expectations) {
>>     great, trust the bus!
>> } else {
>>     oh crap!
>> }
>
> Uh, no, this is really not the logic that should be assumed. It's more
> for code where you want to simply pass a bus fd, and the code knows
> nothing about it. Now, the code can derive some information from the
> bus fd, like for example who owns it. Then, depending on some of the
> creds returned, it can determine whether to read configuration file set
> A or B and so on. This is particularly useful for all kinds of
> unprivileged bus services that end up running on any kind of bus and
> need to be able to figure out what they are actually operating on.

The logic you've described is more or less the same thing that I
described, with a process transition mixed in. It's:

connect to bus
send kdbus fd to another process, which does the rest:
    request bus metadata
    if (bus metadata matches expectations) {
        great, trust the bus!
    } else {
        oh crap! (or malfunction or whatever)
    }

ISTM you should have an API to get the *name* of the bus and check that.
Except that, if the service you pass that fd to is privileged, then
you're completely screwed, because none of this checks that the *domain*
is correct.

[snip]
>>>>> +Also, if the connection allowed for file descriptors to be passed
>>>>> +(KDBUS_HELLO_ACCEPT_FD), and if the message contained any, they
>>>>> +will be installed into the receiving process after the
>>>>> +KDBUS_CMD_MSG_RECV ioctl returns. The receiving task is obliged to
>>>>> +close all of them appropriately.
>>>>
>>>> This makes it sound like fds are installed at receive time. What
>>>> prevents resource exhaustion due to having excessive numbers of fds
>>>> in transit (that are presumably not accounted to anyone)?
>>>
>>> We have per-user message accounting for undelivered messages, as well
>>> as a maximum number of pending messages per connection on the
>>> receiving end. These limits are accounted on a "user<->user" basis,
>>> so the limit of a user A will not affect two other users (B and C)
>>> talking.
>>
>> But you can shove tons of fds in a message, and you can have lots of
>> messages, and some of the fds can be fds of unix sockets that have fds
>> queued up in them, and one of those fds could be the fd to the kdbus
>> connection that sent the fd...
>
> You cannot send kdbus-fds or unix-fds over kdbus, right now. We have
> people working on the AF_UNIX gc to make it more generic and include
> external types. Until then, we simply prevent recursive fd passing.

OK, fair enough.

>> This is not advice as to what to do about it, but I think that it will
>> be a problem at some point.

>>>>> +11. Policy
>>>>> +===
>>>>> +
>>>>> +A policy database restricts the possibilities of connections to
>>>>> +own, see and talk to well-known names. It can be associated with a
>>>>> +bus (through a policy holder connection) or a custom endpoint.
>>>>
>>>> ISTM metadata items on bus names should be replaced with policy that
>>>> applies to the domain as a whole and governs bus creation.
>>>
>>> No, well-known names are bound to buses, so a bus is really the right
>>> place to hold policy about which process is allowed to claim them.
>>> Every user is allowed to create a bus of its own, there's no policy
>>> for that, and there shouldn't be.
>>>
>>> It has nothing to do with metadata items.
>>
>> But it does -- the creator of the bus binds metadata to that bus at
>> creation time.
>>
>> I think that a better solution would be to have a global policy that
>> says, for example, to create the bus called 'system', the creator must
>> have selinux label xyz, or to create a user bus called
>> uid-1000-privileged-ui-bus the creator must have some cgroup or
>> whatever.
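[Editor's note] The claim-time policy Andy proposes can be sketched as a
small userspace model. Everything below (the registry class, the
credential dict, the rules) is invented for illustration and is not a
kdbus interface, but it shows the property being argued for: a creator
that fails the policy is rejected when it tries to *claim* the name, so
clients can trust a bus by its name alone.

```python
# Toy model: enforce bus-creation policy at name-claim time, so that
# a confused or malicious task can neither squat on a well-known bus
# name nor force consumers to inspect creator credentials afterwards.
# All names and rules here are illustrative, not actual kdbus API.

class PolicyError(Exception):
    pass

# Hypothetical global policy: bus name -> predicate over creator creds.
BUS_CREATION_POLICY = {
    "system": lambda creds: creds.get("uid") == 0,
}

class BusRegistry:
    def __init__(self):
        self._buses = {}

    def create_bus(self, name, creds):
        rule = BUS_CREATION_POLICY.get(name)
        if rule is not None and not rule(creds):
            # Rejected at claim time: the legitimate owner can still
            # create the bus later, so no DoS window opens up.
            raise PolicyError("creds %r may not claim bus %r" % (creds, name))
        if name in self._buses:
            raise PolicyError("bus %r already exists" % name)
        self._buses[name] = dict(creds)
        return name  # consumers later check only this name
```

Under this model, the "oh crap!" branch in the pseudocode above
disappears for consumers: by the time a bus named "system" exists, the
policy has already vouched for its creator.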
Re: kdbus: add documentation
Hi

On Mon, Nov 24, 2014 at 9:57 PM, Andy Lutomirski wrote:
> On Mon, Nov 24, 2014 at 12:16 PM, David Herrmann wrote:
>> [snip]
>>>> +6.5 Getting information about a connection's bus creator
>>>> +
>>>> +The KDBUS_CMD_BUS_CREATOR_INFO ioctl takes the same struct as
>>>> +KDBUS_CMD_CONN_INFO but is used to retrieve information about the
>>>> +creator of the bus the connection is attached to. The metadata
>>>> +returned by this call is collected during the creation of the bus
>>>> +and is never altered afterwards, so it provides pristine information
>>>> +on the task that created the bus, at the moment when it did so.
>>>
>>> What's this for? I understand the need for the creator of busses to
>>> be authenticated, but doing it like this means that anyone who will
>>> *fail* authentication can DoS the authentic creator.
>>
>> This returns information on a bus owner, to determine whether a
>> connection is connected to a system, user or session bus. Note that
>> the bus-creator itself is not a valid peer on the bus, so you cannot
>> send messages to them. Which kind of DoS do you have in mind?
>
> I assume that the logic is something like:
>
> connect to bus
> request bus metadata
> if (bus metadata matches expectations) {
>     great, trust the bus!
> } else {
>     oh crap!
> }

Uh, no, this is really not the logic that should be assumed. It's more
for code where you want to simply pass a bus fd, and the code knows
nothing about it. Now, the code can derive some information from the
bus fd, like for example who owns it. Then, depending on some of the
creds returned, it can determine whether to read configuration file set
A or B and so on. This is particularly useful for all kinds of
unprivileged bus services that end up running on any kind of bus and
need to be able to figure out what they are actually operating on.

> If I'm understanding it right, then user code only really has two
> outcomes: the good case and the "oh crap!" case. The problem is that
> "oh crap!" isn't a clean failure -- if it happens, then the
> application has just been DoSed, because in that case, one of two
> things happened:
>
> 1. Some policy mismatch means that the legitimate bus owner did create
> the bus, but the user application is confused. This will result in
> difficult-to-diagnose failures.
>
> 2. A malicious or confused program created the bus. This is a DoS --
> even the legitimate bus creator can't actually create the bus now.
>
> So I think that the policy should be applied at the time that the bus
> name is claimed, not at the time that someone else tries to use the
> bus. IOW, the way that you verify you're talking to the system bus
> should be by checking that the bus is called "system", not by checking
> that UID 0 created the bus.

[snip]
>>>> +Also, if the connection allowed for file descriptors to be passed
>>>> +(KDBUS_HELLO_ACCEPT_FD), and if the message contained any, they
>>>> +will be installed into the receiving process after the
>>>> +KDBUS_CMD_MSG_RECV ioctl returns. The receiving task is obliged to
>>>> +close all of them appropriately.
>>>
>>> This makes it sound like fds are installed at receive time. What
>>> prevents resource exhaustion due to having excessive numbers of fds
>>> in transit (that are presumably not accounted to anyone)?
>>
>> We have per-user message accounting for undelivered messages, as well
>> as a maximum number of pending messages per connection on the
>> receiving end. These limits are accounted on a "user<->user" basis,
>> so the limit of a user A will not affect two other users (B and C)
>> talking.
>
> But you can shove tons of fds in a message, and you can have lots of
> messages, and some of the fds can be fds of unix sockets that have fds
> queued up in them, and one of those fds could be the fd to the kdbus
> connection that sent the fd...

You cannot send kdbus-fds or unix-fds over kdbus, right now. We have
people working on the AF_UNIX gc to make it more generic and include
external types. Until then, we simply prevent recursive fd passing.

> This is not advice as to what to do about it, but I think that it will
> be a problem at some point.

>>>> +11. Policy
>>>> +===
>>>> +
>>>> +A policy database restricts the possibilities of connections to
>>>> +own, see and talk to well-known names. It can be associated with a
>>>> +bus (through a policy holder connection) or a custom endpoint.
>>>
>>> ISTM metadata items on bus names should be replaced with policy that
>>> applies to the domain as a whole and governs bus creation.
>>
>> No, well-known names are bound to buses, so a bus is really the right
>> place to hold policy about which process is allowed to claim them.
>> Every user is allowed to create a bus of its own, there's no policy
>> for that, and there shouldn't be.
>>
>> It has nothing to do with metadata items.
>
> But it does -- the creator of the bus binds metadata to that bus at
> creation time.
>
> I think that a better solution would be to have a global policy that
> says, for example, to create the bus called 'system', the creator must
> have selinux label xyz, or to create a user bus called
> uid-1000-privileged-ui-bus the creator must have some cgroup or
> whatever.
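[Editor's note] David's "user&lt;-&gt;user" accounting above — undelivered
messages are accounted per sender/receiver user pair, so one user
exhausting its budget cannot block two other users talking — can be
sketched as a toy model. The class, API, and the limit value below are
invented for illustration; this is not kdbus code:

```python
# Toy model of kdbus-style per-user accounting for undelivered
# messages: a receiving connection tracks queued messages per sender
# uid, so user A hitting its limit cannot stop users B and C talking.
# The limit value is invented for illustration, not the real one.
from collections import defaultdict

MAX_PENDING_PER_SENDER = 16  # illustrative limit

class Connection:
    def __init__(self, uid):
        self.uid = uid
        self._pending = defaultdict(int)  # sender uid -> queued count

    def enqueue(self, sender_uid):
        """Return True if the message is accepted into the queue."""
        if self._pending[sender_uid] >= MAX_PENDING_PER_SENDER:
            return False  # this sender's budget to us is exhausted
        self._pending[sender_uid] += 1
        return True

    def dequeue(self, sender_uid):
        """Receiving a message frees the sender's budget slot."""
        assert self._pending[sender_uid] > 0
        self._pending[sender_uid] -= 1
```

The key property: the quota key includes the sender, so a flood from
one uid leaves every other sender's budget to the same receiver intact.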
Re: kdbus: add documentation
On Mon, Nov 24, 2014 at 12:16 PM, David Herrmann wrote:
> Hi Andy!
>
> On Fri, Nov 21, 2014 at 6:12 PM, Andy Lutomirski wrote:
>> On Thu, Nov 20, 2014 at 9:02 PM, Greg Kroah-Hartman wrote:
>>> From: Daniel Mack
>>>
>>> kdbus is a system for low-latency, low-overhead, easy to use
>>> interprocess communication (IPC).
>>>
>>> The interface to all functions in this driver is implemented through
>>> ioctls on files exposed through the mount point of a kdbusfs. This
>>> patch adds detailed documentation about the kernel level API design.
>>>
>>> Signed-off-by: Daniel Mack
>>> Signed-off-by: David Herrmann
>>> Signed-off-by: Djalal Harouni
>>> Signed-off-by: Greg Kroah-Hartman
>>> ---
>>>
>>> + Pool:
>>> +Each connection allocates a piece of shmem-backed memory that is
>>> +used to receive messages and answers to ioctl commands from the
>>> +kernel. It is never used to send anything to the kernel. In order
>>> +to access that memory, userspace must mmap() it into its task.
>>> +See section 12 for more details.
>>
>> At the risk of opening a can of worms, wouldn't this be much more
>> useful if you could share a pool between multiple connections?
>
> Within a process it could theoretically be possible to share the same
> memory pool between multiple connections made by the process. However,
> note that normally a process only has a single connection to the bus
> open (possibly two, if it opens a connection to both the system and
> the user bus). Now, sharing the receiver buffer could certainly be
> considered an optimization, but it would have no effect on
> "usefulness": just allocating space from a single shared per-process
> receiver won't give you any new possibilities...
>
> We have thought about this, but decided to delay it for now. Shared
> pools can easily be added as an extension later on.
>
> [snip]
>>> +KDBUS_CMD_BUS_MAKE, and KDBUS_CMD_ENDPOINT_MAKE take a
>>> +struct kdbus_cmd_make argument.
>>> +
>>> +struct kdbus_cmd_make {
>>> +  __u64 size;
>>> +The overall size of the struct, including its items.
>>> +
>>> +  __u64 flags;
>>> +The flags for creation.
>>> +
>>> +KDBUS_MAKE_ACCESS_GROUP
>>> +  Make the device file group-accessible
>>> +
>>> +KDBUS_MAKE_ACCESS_WORLD
>>> +  Make the device file world-accessible
>>
>> This thing is a file. What's wrong with using a normal POSIX mode?
>> (And what do the read, write, and exec modes do?)
>
> Domains and buses are directories, endpoints are files. Domains also
> create control-files implicitly.
>
> For kdbus clients there is just access or no access, but no
> distinction between read, write and execute access. Due to that we
> just break this down to per-group and world access bits, since doing
> more is pointless, and we shouldn't allow shoehorning more stuff into
> the access mode.
>
> [snip]
>>> +  KDBUS_ITEM_CREDS
>>> +  KDBUS_ITEM_SECLABEL
>>> +Privileged bus users may submit these types in order to create
>>> +connections with faked credentials. The only real use case for this
>>> +is a proxy service which acts on behalf of some other tasks. For a
>>> +connection that runs in that mode, the message's metadata items
>>> +will be limited to what's specified here. See section 13 for more
>>> +information.
>>
>> This is still confusing. There are multiple places in which metadata
>> is attached. Which does this apply to? And why are only creds and
>> seclabel listed?
>
> Yes, and there are multiple places where metadata is *gathered*. This
> ioctl creates connections, so only the items that are actually
> *gathered* by that ioctl are documented here. These items are not part
> of any messages, but are used as identification of the connection
> owner (and in this particular case, to allow privileged proxies to
> overwrite the items so they can properly proxy a legacy-dbus peer
> connection).

But don't proxies need to override the per-message metadata, too?
This is why I'm confused (and my confusion about what's happening goes
down into the code, too). IMO it would be great if all the variables
were named things like message_metadata, conn_metadata, bus_metadata,
etc.

> [snip]
>>> +6.5 Getting information about a connection's bus creator
>>> +
>>> +The KDBUS_CMD_BUS_CREATOR_INFO ioctl takes the same struct as
>>> +KDBUS_CMD_CONN_INFO but is used to retrieve information about the
>>> +creator of the bus the connection is attached to. The metadata
>>> +returned by this call is collected during the creation of the bus
>>> +and is never altered afterwards, so it provides pristine information
>>> +on the task that created the bus, at the moment when it did so.
>>
>> What's this for? I understand the need for the creator of busses to
>> be authenticated, but doing it like this means that anyone who will
>> *fail* authentication can DoS the authentic creator.
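[Editor's note] The KDBUS_MAKE_ACCESS_GROUP/WORLD scheme discussed
above ("just access or no access", broken down into group and world
bits) maps naturally onto POSIX mode bits. The sketch below is an
assumed mapping for illustration — the flag values and resulting octal
modes are guesses, not taken from the kdbus headers:

```python
# Sketch: how KDBUS_MAKE_ACCESS_GROUP/WORLD could translate into a
# POSIX file mode for an endpoint file. Owner always gets rw; GROUP
# adds group rw; WORLD adds rw for everyone. Flag values and the
# resulting modes are assumptions for illustration only.
import stat

KDBUS_MAKE_ACCESS_GROUP = 1 << 0  # assumed value
KDBUS_MAKE_ACCESS_WORLD = 1 << 1  # assumed value

def endpoint_mode(flags):
    mode = stat.S_IRUSR | stat.S_IWUSR              # owner rw, always
    if flags & KDBUS_MAKE_ACCESS_GROUP:
        mode |= stat.S_IRGRP | stat.S_IWGRP         # group rw
    if flags & KDBUS_MAKE_ACCESS_WORLD:
        mode |= (stat.S_IRGRP | stat.S_IWGRP |
                 stat.S_IROTH | stat.S_IWOTH)       # everyone rw
    return mode
```

This also illustrates David's point: with only "access or no access",
read/write/execute distinctions collapse, so two flags suffice where a
full nine-bit mode would invite shoehorning in extra semantics.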
Re: kdbus: add documentation
Hi Andy!

On Fri, Nov 21, 2014 at 6:12 PM, Andy Lutomirski wrote:
> On Thu, Nov 20, 2014 at 9:02 PM, Greg Kroah-Hartman wrote:
>> From: Daniel Mack
>>
>> kdbus is a system for low-latency, low-overhead, easy to use
>> interprocess communication (IPC).
>>
>> The interface to all functions in this driver is implemented through
>> ioctls on files exposed through the mount point of a kdbusfs. This
>> patch adds detailed documentation about the kernel level API design.
>>
>> Signed-off-by: Daniel Mack
>> Signed-off-by: David Herrmann
>> Signed-off-by: Djalal Harouni
>> Signed-off-by: Greg Kroah-Hartman
>> ---
>>
>> + Pool:
>> +Each connection allocates a piece of shmem-backed memory that is used
>> +to receive messages and answers to ioctl commands from the kernel. It
>> +is never used to send anything to the kernel. In order to access that
>> +memory, userspace must mmap() it into its task.
>> +See section 12 for more details.
>
> At the risk of opening a can of worms, wouldn't this be much more
> useful if you could share a pool between multiple connections?

Within a process it could theoretically be possible to share the same
memory pool between multiple connections made by the process. However,
note that normally a process only has a single connection to the bus
open (possibly two, if it opens a connection to both the system and the
user bus). Now, sharing the receiver buffer could certainly be
considered an optimization, but it would have no effect on
"usefulness": just allocating space from a single shared per-process
receiver won't give you any new possibilities...

We have thought about this, but decided to delay it for now. Shared
pools can easily be added as an extension later on.

[snip]
>> +KDBUS_CMD_BUS_MAKE, and KDBUS_CMD_ENDPOINT_MAKE take a
>> +struct kdbus_cmd_make argument.
>> +
>> +struct kdbus_cmd_make {
>> +  __u64 size;
>> +The overall size of the struct, including its items.
>> +
>> +  __u64 flags;
>> +The flags for creation.
>> +
>> +KDBUS_MAKE_ACCESS_GROUP
>> +  Make the device file group-accessible
>> +
>> +KDBUS_MAKE_ACCESS_WORLD
>> +  Make the device file world-accessible
>
> This thing is a file. What's wrong with using a normal POSIX mode?
> (And what do the read, write, and exec modes do?)

Domains and buses are directories, endpoints are files. Domains also
create control-files implicitly.

For kdbus clients there is just access or no access, but no
distinction between read, write and execute access. Due to that we
just break this down to per-group and world access bits, since doing
more is pointless, and we shouldn't allow shoehorning more stuff into
the access mode.

[snip]
>> +  KDBUS_ITEM_CREDS
>> +  KDBUS_ITEM_SECLABEL
>> +Privileged bus users may submit these types in order to create
>> +connections with faked credentials. The only real use case for this
>> +is a proxy service which acts on behalf of some other tasks. For a
>> +connection that runs in that mode, the message's metadata items will
>> +be limited to what's specified here. See section 13 for more
>> +information.
>
> This is still confusing. There are multiple places in which metadata
> is attached. Which does this apply to? And why are only creds and
> seclabel listed?

Yes, and there are multiple places where metadata is *gathered*. This
ioctl creates connections, so only the items that are actually
*gathered* by that ioctl are documented here. These items are not part
of any messages, but are used as identification of the connection
owner (and in this particular case, to allow privileged proxies to
overwrite the items so they can properly proxy a legacy-dbus peer
connection).

[snip]
>> +6.5 Getting information about a connection's bus creator
>> +
>> +The KDBUS_CMD_BUS_CREATOR_INFO ioctl takes the same struct as
>> +KDBUS_CMD_CONN_INFO but is used to retrieve information about the
>> +creator of the bus the connection is attached to. The metadata
>> +returned by this call is collected during the creation of the bus and
>> +is never altered afterwards, so it provides pristine information on
>> +the task that created the bus, at the moment when it did so.
>
> What's this for? I understand the need for the creator of busses to
> be authenticated, but doing it like this means that anyone who will
> *fail* authentication can DoS the authentic creator.

This returns information on a bus owner, to determine whether a
connection is connected to a system, user or session bus. Note that
the bus-creator itself is not a valid peer on the bus, so you cannot
send messages to them. Which kind of DoS do you have in mind?

>> +
>> +7.3 Passing of Payload Data
>> +---
>> +
>> +When connecting to the bus, receivers request a memory pool of a
>> +given size, large enough to carry all backlog of data enqueued for
Re: kdbus: add documentation
On Mon, Nov 24, 2014 at 12:16 PM, David Herrmann dh.herrm...@gmail.com wrote: Hi Andy! On Fri, Nov 21, 2014 at 6:12 PM, Andy Lutomirski l...@amacapital.net wrote: On Thu, Nov 20, 2014 at 9:02 PM, Greg Kroah-Hartman gre...@linuxfoundation.org wrote: From: Daniel Mack dan...@zonque.org kdbus is a system for low-latency, low-overhead, easy to use interprocess communication (IPC). The interface to all functions in this driver is implemented through ioctls on files exposed through the mount point of a kdbusfs. This patch adds detailed documentation about the kernel level API design. Signed-off-by: Daniel Mack dan...@zonque.org Signed-off-by: David Herrmann dh.herrm...@gmail.com Signed-off-by: Djalal Harouni tix...@opendz.org Signed-off-by: Greg Kroah-Hartman gre...@linuxfoundation.org --- + Pool: +Each connection allocates a piece of shmem-backed memory that is used +to receive messages and answers to ioctl commands from the kernel. It is +never used to send anything to the kernel. In order to access that memory, +userspace must mmap() it into its task. +See section 12 for more details. At the risk of opening a can of worms, wouldn't this be much more useful if you could share a pool between multiple connections? Within a process it could theoretically be possible to share the same memory pool between multiple connections made by the process. However, note that normally a process only has a single connection to the bus open (possibly two, if it opens a connection to both the system and the user bus). Now, sharing the receiver buffer could certainly be considered an optimization, but it would have no effect on usefulness, as just allocating space from a single shared per-process receiver won't give you any new possibilities... We have thought about this, but decided to delay it for now. Shared pools can easily be added as an extension later on. [snip] +KDBUS_CMD_BUS_MAKE, and KDBUS_CMD_ENDPOINT_MAKE take a +struct kdbus_cmd_make argument. + +struct kdbus_cmd_make { + __u64 size; +The overall size of the struct, including its items. + + __u64 flags; +The flags for creation. + +KDBUS_MAKE_ACCESS_GROUP + Make the device file group-accessible + +KDBUS_MAKE_ACCESS_WORLD + Make the device file world-accessible This thing is a file. What's wrong with using a normal POSIX mode? (And what do the read, write, and exec modes do?) Domains and buses are directories, endpoints are files. Domains also create control-files implicitly. For kdbus clients there is just access or no access, but no distinction between read, write, and execute access. Because of that, we just break this down to per-group and world access bits, since doing more is pointless, and we shouldn't allow shoehorning more stuff into the access mode. [snip] + KDBUS_ITEM_CREDS + KDBUS_ITEM_SECLABEL +Privileged bus users may submit these types in order to create +connections with faked credentials. The only real use case for this +is a proxy service which acts on behalf of some other tasks. For a +connection that runs in that mode, the message's metadata items will +be limited to what's specified here. See section 13 for more +information. This is still confusing. There are multiple places in which metadata is attached. Which does this apply to? And why are only creds and seclabel listed? Yes, and there are multiple places where metadata is *gathered*. This ioctl creates connections, so only the items that are actually *gathered* by that ioctl are documented here. These items are not part of any messages, but are used as identification of the connection owner (and in this particular case, to allow privileged proxies to overwrite the items so they can properly proxy a legacy-dbus peer connection). But don't proxies need to override the per-message metadata, too? This is why I'm confused (and my confusion about what's happening goes down into the code, too). IMO it would be great if all the variables were named things like message_metadata, conn_metadata, bus_metadata, etc. [snip] +6.5 Getting information about a connection's bus creator + + +The KDBUS_CMD_BUS_CREATOR_INFO ioctl takes the same struct as +KDBUS_CMD_CONN_INFO but is used to retrieve information about the creator of +the bus the connection is attached to. The metadata returned by this call is +collected during the creation of the bus and is never altered afterwards, so +it provides pristine information on the task that created the bus, at the +moment when it did so. What's this for? I understand the need for the creator of buses to be authenticated, but doing it like this means that anyone who will *fail* authentication can DoS the authentic creator. This returns information on a bus owner, to determine whether
Re: kdbus: add documentation
On Thu, Nov 20, 2014 at 9:02 PM, Greg Kroah-Hartman wrote: > From: Daniel Mack > > kdbus is a system for low-latency, low-overhead, easy to use > interprocess communication (IPC). > > The interface to all functions in this driver is implemented through > ioctls on files exposed through the mount point of a kdbusfs. This > patch adds detailed documentation about the kernel level API design. > > Signed-off-by: Daniel Mack > Signed-off-by: David Herrmann > Signed-off-by: Djalal Harouni > Signed-off-by: Greg Kroah-Hartman > --- > + Pool: > +Each connection allocates a piece of shmem-backed memory that is used > +to receive messages and answers to ioctl commands from the kernel. It is > +never used to send anything to the kernel. In order to access that > memory, > +userspace must mmap() it into its task. > +See section 12 for more details. At the risk of opening a can of worms, wouldn't this be much more useful if you could share a pool between multiple connections? > + > + > +4. Items > +=== > + > +To flexibly augment transport structures used by kdbus, data blobs of type > +struct kdbus_item are used. An item has a fixed-size header that only stores > +the type of the item and the overall size. The total size is variable and is > +in some cases defined by the item type, in other cases, they can be of > +arbitrary length (for instance, a string). > + > +In the external kernel API, items are used for many ioctls to transport > +optional information from userspace to kernelspace. They are also used for > +information stored in a connection's pool, such as messages, name lists or > +requested connection information. > + > +In all such occasions where items are used as part of the kdbus kernel API, > +they are embedded in structs that have an overall size of their own, so there > +can be many of them. > + > +The kernel expects all items to be aligned to 8-byte boundaries. 
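The size-prefixed, 8-byte-aligned item layout quoted above can be sketched with a tiny userspace iterator. This is an illustration only: the struct below is a simplified stand-in for the real struct kdbus_item from the kdbus uapi header, and count_items is a made-up helper name (the in-tree example lives in tools/testing/selftests/kdbus/kdbus-util.h).

```c
#include <stdint.h>
#include <stddef.h>

/* Simplified stand-in for the real struct kdbus_item (uapi header):
 * a fixed-size header followed by a variable-length payload. */
struct kdbus_item {
	uint64_t size;   /* overall item size, including this header */
	uint64_t type;   /* a KDBUS_ITEM_* constant */
	uint8_t  data[]; /* (size - 16) bytes of payload */
};

/* Round a size up to the next 8-byte boundary, as the kernel expects. */
#define KDBUS_ALIGN8(s) (((s) + 7ULL) & ~7ULL)

/* Walk the items embedded in an enclosing struct of 'total' bytes and
 * count them; a real consumer would switch on item->type instead. */
static unsigned int count_items(const struct kdbus_item *first, uint64_t total)
{
	const uint8_t *end = (const uint8_t *)first + total;
	const struct kdbus_item *item = first;
	unsigned int n = 0;

	while ((const uint8_t *)item + sizeof(*item) <= end &&
	       item->size >= sizeof(*item)) {
		n++;
		/* advance to the next item, keeping 8-byte alignment */
		item = (const struct kdbus_item *)
		       ((const uint8_t *)item + KDBUS_ALIGN8(item->size));
	}
	return n;
}
```

With this scheme an item carrying a 17-byte payload occupies 33 bytes, but the next item starts 40 bytes in, because of the alignment rule.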
> + > +A simple iterator in userspace would iterate over the items until the items > +have reached the embedding structure's overall size. An example > implementation > +of such an iterator can be found in > tools/testing/selftests/kdbus/kdbus-util.h. It looks like many (all?) item consumers ignore unknown items. This seems like a compatibility problem. Would it be better to have a bit in each item that toggles between "ignore me if you don't recognize me" and "error out if you don't recognize me"? > +KDBUS_CMD_BUS_MAKE, and KDBUS_CMD_ENDPOINT_MAKE take a > +struct kdbus_cmd_make argument. > + > +struct kdbus_cmd_make { > + __u64 size; > +The overall size of the struct, including its items. > + > + __u64 flags; > +The flags for creation. > + > +KDBUS_MAKE_ACCESS_GROUP > + Make the device file group-accessible > + > +KDBUS_MAKE_ACCESS_WORLD > + Make the device file world-accessible This thing is a file. What's wrong with using a normal POSIX mode? (And what do the read, write, and exec modes do?) > + > + > +6.2 Creating connections > + > + > +A connection to a bus is created by opening an endpoint file of a bus and > +becoming an active client with the KDBUS_CMD_HELLO ioctl. Every connected > client > +connection has a unique identifier on the bus and can address messages to > every > +other connection on the same bus by using the peer's connection id as the > +destination. > + > +The KDBUS_CMD_HELLO ioctl takes the following struct as argument. > + > +struct kdbus_cmd_hello { > + __u64 size; > +The overall size of the struct, including all attached items. > + > + __u64 conn_flags; > +Flags to apply to this connection: > + > +KDBUS_HELLO_ACCEPT_FD > + When this flag is set, the connection can be sent file descriptors > + as message payload. If it's not set, any attempt of doing so will > + result in -ECOMM on the sender's side. > + > +KDBUS_HELLO_ACTIVATOR > + Make this connection an activator (see below). 
With this bit set, > + an item of type KDBUS_ITEM_NAME has to be attached which describes > + the well-known name this connection should be an activator for. > + > +KDBUS_HELLO_POLICY_HOLDER > + Make this connection a policy holder (see below). With this bit set, > + an item of type KDBUS_ITEM_NAME has to be attached which describes > + the well-known name this connection should hold a policy for. > + > +KDBUS_HELLO_MONITOR > + Make this connection an eaves-dropping connection that receives all > + unicast messages sent on the bus. To also receive broadcast messages, > + the connection has to upload appropriate matches as well. > + This flag is only valid for privileged bus connections. > + > + __u64 attach_flags_send; > + Set the bits for metadata this connection permits to be sent to the > + receiving peer. Only metadata items that are both allowed to be sent by > +
Re: kdbus: add documentation
On 21.11.2014 06:02, Greg Kroah-Hartman wrote: > From: Daniel Mack > … > +5.4 Creating buses and endpoints > + > + > +KDBUS_CMD_BUS_MAKE, and KDBUS_CMD_ENDPOINT_MAKE take a > +struct kdbus_cmd_make argument. > + > +struct kdbus_cmd_make { > + __u64 size; > +The overall size of the struct, including its items. > + > + __u64 flags; > +The flags for creation. > + > +KDBUS_MAKE_ACCESS_GROUP > + Make the device file group-accessible device? > + > +KDBUS_MAKE_ACCESS_WORLD > + Make the device file world-accessible device? > + > + __u64 kernel_flags; > +Valid flags for this command, returned by the kernel upon each call. > + > + struct kdbus_item items[0]; > +A list of items, only used for creating custom endpoints. Has specific > +meanings for KDBUS_CMD_BUS_MAKE and KDBUS_CMD_ENDPOINT_MAKE (see above). > +}; > … -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kdbus: add documentation
On Thu, Oct 30, 2014 at 01:20:23PM +0100, Peter Meerwald wrote: > > > kdbus is a system for low-latency, low-overhead, easy to use > > interprocess communication (IPC). > > > > The interface to all functions in this driver is implemented through ioctls > > on /dev nodes. This patch adds detailed documentation about the kernel > > level API design. > > just some typos below Many thanks for the fixes, I've made them all to the file now, it will show up in the next version we send out. greg k-h
Re: kdbus: add documentation
> kdbus is a system for low-latency, low-overhead, easy to use > interprocess communication (IPC). > > The interface to all functions in this driver is implemented through ioctls > on /dev nodes. This patch adds detailed documentation about the kernel > level API design. just some typos below > Signed-off-by: Daniel Mack > Signed-off-by: Greg Kroah-Hartman > --- > Documentation/kdbus.txt | 1815 > +++ > 1 file changed, 1815 insertions(+) > create mode 100644 Documentation/kdbus.txt > > diff --git a/Documentation/kdbus.txt b/Documentation/kdbus.txt > new file mode 100644 > index ..ac1a18908976 > --- /dev/null > +++ b/Documentation/kdbus.txt > @@ -0,0 +1,1815 @@ > +D-Bus is a system for powerful, easy to use interprocess communication (IPC). > + > +The focus of this document is an overview of the low-level, native kernel > D-Bus > +transport called kdbus. Kdbus in the kernel acts similar to a device driver, > +all communication between processes take place over special character device

takes

> +nodes in /dev/kdbus/. > + > +For the general D-Bus protocol specification, the payload format, the > +marshaling, and the communication semantics, please refer to: > + http://dbus.freedesktop.org/doc/dbus-specification.html > + > +For a kdbus specific userspace library implementation please refer to: > + http://cgit.freedesktop.org/systemd/systemd/tree/src/systemd/sd-bus.h > + > +Articles about D-Bus and kdbus: > + http://lwn.net/Articles/580194/ > + > + > +1. Terminology > +=== > + > + Domain: > +A domain is a named object containing a number of buses. A system > +container that contains its own init system and users usually also > +runs in its own kdbus domain. The /dev/kdbus/domain/<container-name>/ > +directory shows up inside the domain as /dev/kdbus/. Every domain offers > +its own "control" device node to create new buses or new sub-domains. > +Domains have no connection to each other and cannot see nor talk to > +each other. See section 5 for more details. > + > + Bus: > +A bus is a named object inside a domain. Clients exchange messages > +over a bus. Multiple buses themselves have no connection to each other; > +messages can only be exchanged on the same bus. The default entry point > to > +a bus, where clients establish the connection to, is the "bus" device > node > +/dev/kdbus/<bus name>/bus. > +Common operating system setups create one "system bus" per system, and > one > +"user bus" for every logged-in user. Applications or services may create > +their own private named buses. See section 5 for more details. > + > + Endpoint: > +An endpoint provides the device node to talk to a bus. Opening an > +endpoint creates a new connection to the bus to which the endpoint > belongs. > +Every bus has a default endpoint called "bus". > +A bus can optionally offer additional endpoints with custom names to > +provide a restricted access to the same bus. Custom endpoints carry > +additional policy which can be used to give sandboxed processes only > +a locked-down, limited, filtered access to the same bus. > +See section 5 for more details. > + > + Connection: > +A connection to a bus is created by opening an endpoint device node of > +a bus and becoming an active client with the HELLO exchange. Every > +connected client connection has a unique identifier on the bus and can > +address messages to every other connection on the same bus by using > +the peer's connection id as the destination. > +See section 6 for more details. > + > + Pool: > +Each connection allocates a piece of shmem-backed memory that is used > +to receive messages and answers to ioctl commands from the kernel. It is > +never used to send anything to the kernel. In order to access that > memory, > +userspace must mmap() it into its task. > +See section 12 for more details. > + > + Well-known Name: > +A connection can, in addition to its implicit unique connection id, > request > +the ownership of a textual well-known name. Well-known names are noted in > +reverse-domain notation, such as com.example.service1. Connections > offering > +a service on a bus are usually reached by their well-known name. The > analogy > +of connection id and well-known name is an IP address and a DNS name > +associated with that address. > + > + Message: > +Connections can exchange messages with other connections by addressing > +the peers with their connection id or well-known name. A message consists > +of a message header with kernel-specific information on how to route the > +message, and the message payload, which is a logical byte stream of > +arbitrary size. Messages can carry additional file descriptors to be > passed > +from one connection to