Re: [RFC v1 02/14] bus1: provide stub cdev /dev/bus1

2016-10-29 Thread Arnd Bergmann
On Thursday 27 October 2016, Tom Gundersen wrote:
> On Thu, Oct 27, 2016 at 11:11 AM, Arnd Bergmann wrote:
> > On Thursday, October 27, 2016 1:54:05 AM CEST Tom Gundersen wrote:
> >> On Thu, Oct 27, 2016 at 1:19 AM, Andy Lutomirski wrote:
> >> > This may have been covered elsewhere, but could this use syscalls instead?
> >>
> >> Yes, syscalls would work essentially the same. For now, we are using a
> >> cdev as it makes it a lot more convenient to develop and test as an
> >> out-of-tree module, but that could be changed easily before the final
> >> submission, if that's what we want.
> >
> >
> > Generally speaking, I think syscalls would be appropriate here, and put
> > bus1 into a similar category as the other ipc interfaces (shm, msg, sem,
> > mqueue, ...).
> 
> Could you elaborate on why you think syscalls would be more
> appropriate than ioctls?

Linus already answered this, but I'd also add that core kernel
features just make sense to be syscalls, rather than stuffing
them in a random device driver.

> > - Have a mountable file system, and use open() on that to create
> >   connections. Advantages are that it's fairly easy to have one
> >   instance per fs-namespace, and you can have user-defined naming
> >   of objects in the file system.
> 
> Note that currently we only have one object (/dev/bus1) and each fd is
> disconnected from anything else on creation, so not sure what benefits
> a filesystem (or several instances of it) would give?

I have not tried to understand some of the main concepts of bus1,
so I simply assumed that there was some way of looking up handles
of other instances. Using a file system gives you a natural way
to look up resources by name the way we do e.g. for mq_open(),
and it lets you easily decide whether containers should share
a view of the same namespace by mounting the same instance of
the file system into them or having separate instances.

If you don't ever need to look up a handle by name in bus1, using
a mountable file system would not help you.

Arnd


Re: [RFC v1 02/14] bus1: provide stub cdev /dev/bus1

2016-10-27 Thread Tom Gundersen
On Thu, Oct 27, 2016 at 6:37 PM, Linus Torvalds wrote:
> On Thu, Oct 27, 2016 at 8:25 AM, Tom Gundersen wrote:
>>
>> Could you elaborate on why you think syscalls would be more
>> appropriate than ioctls?
>
> ioctl's tend to be a horrid mess both for things like compat, but also
> for things like system call tracing and filtering (i.e. BPF).
>
> The compat mess is fixable by making sure you always use 64-bit fields
> rather than pointers everywhere and everything is aligned.

This we do.

> The
> tracing and filtering one not so much.

Got it. Thanks.

Tom


Re: [RFC v1 02/14] bus1: provide stub cdev /dev/bus1

2016-10-27 Thread Linus Torvalds
On Thu, Oct 27, 2016 at 8:25 AM, Tom Gundersen  wrote:
>
> Could you elaborate on why you think syscalls would be more
> appropriate than ioctls?

ioctl's tend to be a horrid mess both for things like compat, but also
for things like system call tracing and filtering (i.e. BPF).

The compat mess is fixable by making sure you always use 64-bit fields
rather than pointers everywhere and everything is aligned.  The
tracing and filtering one not so much.

Linus
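
The compat point above can be illustrated with a small sketch. The struct
and helper names below are hypothetical and are not taken from the bus1
patch. It assumes the usual ILP32/LP64 ABIs, where a struct embedding a
pointer has a different size and layout for 32-bit and 64-bit callers,
while one built only from aligned 64-bit fields does not.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical argument struct that embeds a pointer. With typical ABIs it
 * is 8 bytes for a 32-bit caller and 16 bytes for a 64-bit caller, so a
 * 64-bit kernel would need a separate compat translation for it.
 */
struct example_args_ptr {
	void *buf;
	uint32_t len;
};

/*
 * Same information using only fixed-width, 8-byte-aligned fields. The
 * layout is identical for 32-bit and 64-bit callers.
 */
struct example_args_fixed {
	uint64_t ptr_buf;
	uint64_t len;
} __attribute__((__aligned__(8)));

_Static_assert(sizeof(struct example_args_fixed) == 16,
	       "layout must not depend on the word size");
_Static_assert(offsetof(struct example_args_fixed, len) == 8,
	       "no implicit padding between fields");

/* Userspace stores pointers as 64-bit integers, going through uintptr_t. */
static inline uint64_t example_ptr_to_u64(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}

static inline void *example_u64_to_ptr(uint64_t v)
{
	return (void *)(uintptr_t)v;
}
```

This is the same convention the bus1 uapi structs in this thread follow
(ptr_* fields declared as __u64), which is why a single ioctl handler can
serve both .unlocked_ioctl and .compat_ioctl.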


Re: [RFC v1 02/14] bus1: provide stub cdev /dev/bus1

2016-10-27 Thread Tom Gundersen
On Thu, Oct 27, 2016 at 11:11 AM, Arnd Bergmann  wrote:
> On Thursday, October 27, 2016 1:54:05 AM CEST Tom Gundersen wrote:
>> On Thu, Oct 27, 2016 at 1:19 AM, Andy Lutomirski  wrote:
>> > This may have been covered elsewhere, but could this use syscalls instead?
>>
>> Yes, syscalls would work essentially the same. For now, we are using a
>> cdev as it makes it a lot more convenient to develop and test as an
>> out-of-tree module, but that could be changed easily before the final
>> submission, if that's what we want.
>
>
> Generally speaking, I think syscalls would be appropriate here, and put
> bus1 into a similar category as the other ipc interfaces (shm, msg, sem,
> mqueue, ...).

Could you elaborate on why you think syscalls would be more
appropriate than ioctls?

> However, syscall API design is nontrivial, and will require a bit of
> work to come to a set of syscalls that is fairly compact but also
> extensible enough. I think it makes sense to go through the exercise
> of working out what the syscall interface would end up looking like,
> and then make a decision.
>
> There is currently a set of file operations:
>
> @@ -48,7 +90,11 @@ const struct file_operations bus1_fops = {
>     .owner          = THIS_MODULE,
>     .open           = bus1_fop_open,
>     .release        = bus1_fop_release,
> +   .poll           = bus1_fop_poll,
>     .llseek         = noop_llseek,
> +   .mmap           = bus1_fop_mmap,
> +   .unlocked_ioctl = bus1_peer_ioctl,
> +   .compat_ioctl   = bus1_peer_ioctl,
>     .show_fdinfo    = bus1_fop_show_fdinfo,
>  };
>
> and then another set of ioctls:
>
> +enum {
> +   BUS1_CMD_PEER_DISCONNECT= _IOWR(BUS1_IOCTL_MAGIC, 0x00,
> +   __u64),
> +   BUS1_CMD_PEER_QUERY = _IOWR(BUS1_IOCTL_MAGIC, 0x01,
> +   struct bus1_cmd_peer_reset),
> +   BUS1_CMD_PEER_RESET = _IOWR(BUS1_IOCTL_MAGIC, 0x02,
> +   struct bus1_cmd_peer_reset),
> +   BUS1_CMD_HANDLE_RELEASE = _IOWR(BUS1_IOCTL_MAGIC, 0x10,
> +   __u64),
> +   BUS1_CMD_HANDLE_TRANSFER= _IOWR(BUS1_IOCTL_MAGIC, 0x11,
> +   struct bus1_cmd_handle_transfer),
> +   BUS1_CMD_NODES_DESTROY  = _IOWR(BUS1_IOCTL_MAGIC, 0x20,
> +   struct bus1_cmd_nodes_destroy),
> +   BUS1_CMD_SLICE_RELEASE  = _IOWR(BUS1_IOCTL_MAGIC, 0x30,
> +   __u64),
> +   BUS1_CMD_SEND   = _IOWR(BUS1_IOCTL_MAGIC, 0x40,
> +   struct bus1_cmd_send),
> +   BUS1_CMD_RECV   = _IOWR(BUS1_IOCTL_MAGIC, 0x50,
> +   struct bus1_cmd_recv),
> +};
>
> I think there is no alternative to having some sort of file descriptor
> with the basic operations you have above, but there is a question of
> how to get that file descriptor if the ioctls get changed to a syscall,
> the basic options being:

I could see the point of wanting a syscall to get the fd (your second
option below), but as I said, not sure I see why we would want to use
syscalls instead of ioctls.

> - Keep using a chardev. This works, but feels a little odd to me,
>   and I can't think of any other interfaces combining syscalls with
>   a chardev.
>
> - Have one syscall that returns an open file descriptor, replacing
>   the fops->open() function. One advantage is that you can pass
>   additional arguments in that you can't have with open.
>   An example for this would be mq_open().

If we are going to change it, this might make sense to me. It would
allow you to get the fd without having to have access to some
character device.

> - Have a mountable file system, and use open() on that to create
>   connections. Advantages are that it's fairly easy to have one
>   instance per fs-namespace, and you can have user-defined naming
>   of objects in the file system.

Note that currently we only have one object (/dev/bus1) and each fd is
disconnected from anything else on creation, so not sure what benefits
a filesystem (or several instances of it) would give?

> For the other operations, the obvious translation would be to
> turn each ioctl command into one syscall, but that may not always
> be the best representation. One limitation is that you cannot
> generally have more than six 'long' arguments on a lot of
> architectures, and passing 'u64' arguments to syscalls is awkward.
>
> For some of the commands, the transformation would be straightforward
> if we assume that the 'u64' arguments can actually be 'long',
> I guess like this:
>
> +struct bus1_cmd_handle_transfer {
> +   __u64 flags;
> +   __u64 src_handle;
> +   __u64 dst_fd;
> +   __u64 dst_handle;
> 

Re: [RFC v1 02/14] bus1: provide stub cdev /dev/bus1

2016-10-27 Thread Arnd Bergmann
On Thursday, October 27, 2016 1:54:05 AM CEST Tom Gundersen wrote:
> On Thu, Oct 27, 2016 at 1:19 AM, Andy Lutomirski  wrote:
> > This may have been covered elsewhere, but could this use syscalls instead?
> 
> Yes, syscalls would work essentially the same. For now, we are using a
> cdev as it makes it a lot more convenient to develop and test as an
> out-of-tree module, but that could be changed easily before the final
> submission, if that's what we want.


Generally speaking, I think syscalls would be appropriate here, and put
bus1 into a similar category as the other ipc interfaces (shm, msg, sem,
mqueue, ...).

However, syscall API design is nontrivial, and will require a bit of
work to come to a set of syscalls that is fairly compact but also
extensible enough. I think it makes sense to go through the exercise
of working out what the syscall interface would end up looking like,
and then make a decision.

There is currently a set of file operations:

@@ -48,7 +90,11 @@ const struct file_operations bus1_fops = {
    .owner          = THIS_MODULE,
    .open           = bus1_fop_open,
    .release        = bus1_fop_release,
+   .poll           = bus1_fop_poll,
    .llseek         = noop_llseek,
+   .mmap           = bus1_fop_mmap,
+   .unlocked_ioctl = bus1_peer_ioctl,
+   .compat_ioctl   = bus1_peer_ioctl,
    .show_fdinfo    = bus1_fop_show_fdinfo,
 };

and then another set of ioctls:

+enum {
+   BUS1_CMD_PEER_DISCONNECT = _IOWR(BUS1_IOCTL_MAGIC, 0x00,
+                                    __u64),
+   BUS1_CMD_PEER_QUERY      = _IOWR(BUS1_IOCTL_MAGIC, 0x01,
+                                    struct bus1_cmd_peer_reset),
+   BUS1_CMD_PEER_RESET      = _IOWR(BUS1_IOCTL_MAGIC, 0x02,
+                                    struct bus1_cmd_peer_reset),
+   BUS1_CMD_HANDLE_RELEASE  = _IOWR(BUS1_IOCTL_MAGIC, 0x10,
+                                    __u64),
+   BUS1_CMD_HANDLE_TRANSFER = _IOWR(BUS1_IOCTL_MAGIC, 0x11,
+                                    struct bus1_cmd_handle_transfer),
+   BUS1_CMD_NODES_DESTROY   = _IOWR(BUS1_IOCTL_MAGIC, 0x20,
+                                    struct bus1_cmd_nodes_destroy),
+   BUS1_CMD_SLICE_RELEASE   = _IOWR(BUS1_IOCTL_MAGIC, 0x30,
+                                    __u64),
+   BUS1_CMD_SEND            = _IOWR(BUS1_IOCTL_MAGIC, 0x40,
+                                    struct bus1_cmd_send),
+   BUS1_CMD_RECV            = _IOWR(BUS1_IOCTL_MAGIC, 0x50,
+                                    struct bus1_cmd_recv),
+};

I think there is no alternative to having some sort of file descriptor
with the basic operations you have above, but there is a question of
how to get that file descriptor if the ioctls get changed to a syscall,
the basic options being:

- Keep using a chardev. This works, but feels a little odd to me,
  and I can't think of any other interfaces combining syscalls with
  a chardev.

- Have one syscall that returns an open file descriptor, replacing
  the fops->open() function. One advantage is that you can pass
  additional arguments in that you can't have with open.
  An example for this would be mq_open().

- Have a mountable file system, and use open() on that to create
  connections. Advantages are that it's fairly easy to have one
  instance per fs-namespace, and you can have user-defined naming
  of objects in the file system.

For the other operations, the obvious translation would be to
turn each ioctl command into one syscall, but that may not always
be the best representation. One limitation is that you cannot
generally have more than six 'long' arguments on a lot of
architectures, and passing 'u64' arguments to syscalls is awkward.
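
As a rough illustration of why 'u64' arguments are awkward: on a 32-bit
architecture each syscall argument slot holds one 32-bit 'long', so a
64-bit value has to travel as two halves, and some ABIs additionally
require it to start on an even register pair, which can waste a slot. The
helper names below are hypothetical and only show the splitting and
rejoining that a 32-bit syscall entry path would have to do.

```c
#include <stdint.h>

/*
 * Hypothetical helper: split a 64-bit value into the two 32-bit halves that
 * would occupy two separate syscall argument slots on a 32-bit architecture.
 */
static inline void example_u64_split(uint64_t v, uint32_t *hi, uint32_t *lo)
{
	*hi = (uint32_t)(v >> 32);
	*lo = (uint32_t)(v & 0xffffffffu);
}

/* Hypothetical helper: what the kernel side would do to rebuild the value. */
static inline uint64_t example_u64_join(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) | lo;
}
```

Each u64 handled this way uses up two of the six available slots, which is
one reason the proposals in this message assume handles can be 'long'.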

For some of the commands, the transformation would be straightforward
if we assume that the 'u64' arguments can actually be 'long',
I guess like this:

+struct bus1_cmd_handle_transfer {
+   __u64 flags;
+   __u64 src_handle;
+   __u64 dst_fd;
+   __u64 dst_handle;
+} __attribute__((__aligned__(8)));

long bus1_handle_transfer(int fd, unsigned long handle,
                          int dst_fd, unsigned long *dst_handle,
                          unsigned int flags);

+struct bus1_cmd_nodes_destroy {
+   __u64 flags;
+   __u64 ptr_nodes;
+   __u64 n_nodes;
+} __attribute__((__aligned__(8)));

long bus1_nodes_destroy(int fd, u64 *ptr_nodes,
                        long n_nodes, unsigned int flags);
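
To make the mapping concrete, here is a minimal userspace sketch of the
handle-transfer prototype written as a wrapper around the existing ioctl.
The struct layout and command number are copied from the RFC header in this
thread and are not a released kernel ABI. The wrapper itself, and the
treatment of dst_handle as an in/out value, are assumptions made for
illustration only.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Copied from the RFC uapi header quoted in this thread. */
#define BUS1_IOCTL_MAGIC 0x96

struct bus1_cmd_handle_transfer {
	uint64_t flags;
	uint64_t src_handle;
	uint64_t dst_fd;
	uint64_t dst_handle;
} __attribute__((__aligned__(8)));

#define BUS1_CMD_HANDLE_TRANSFER \
	_IOWR(BUS1_IOCTL_MAGIC, 0x11, struct bus1_cmd_handle_transfer)

/*
 * Hypothetical wrapper using the argument list of the proposed syscall.
 * Returns 0 on success, or -1 with errno set by ioctl().
 * Assumption: dst_handle is passed in and written back (in/out).
 */
static long bus1_handle_transfer(int fd, unsigned long handle, int dst_fd,
				 unsigned long *dst_handle, unsigned int flags)
{
	struct bus1_cmd_handle_transfer cmd;

	if (!dst_handle) {
		errno = EFAULT;
		return -1;
	}

	/* Pack the register-sized arguments into the fixed 64-bit layout. */
	cmd.flags = flags;
	cmd.src_handle = handle;
	cmd.dst_fd = (uint64_t)dst_fd;
	cmd.dst_handle = *dst_handle;

	if (ioctl(fd, BUS1_CMD_HANDLE_TRANSFER, &cmd) < 0)
		return -1;

	*dst_handle = (unsigned long)cmd.dst_handle;
	return 0;
}
```

Five register-sized arguments fit within the six-argument limit, but note
that on a 32-bit system 'unsigned long' cannot carry a full 64-bit handle,
which is the assumption flagged above.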

However, the peer_reset would exceed the 6-argument limit when you count
the initial file descriptor even if you assume that 'flags' can be
made 32-bit:

+struct bus1_cmd_peer_reset {
+   __u64 flags;
+   __u64 peer_flags;
+   __u32 max_slices;
+   __u32 max_handles;
+   __u32 max_inflight_bytes;
+   __u32 max_inflight_fds;
+} __attribute__((__aligned__(8)));

maybe something 
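
The archived message is cut off here, so what follows is not Arnd's
proposal. As one possible way around the argument limit, the sketch below
groups the four limits into a struct that is passed by pointer together with
its size, a pattern some existing syscalls use for extensible arguments.
All names are hypothetical, and the function body is only a userspace
stand-in that validates its input.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: the four limits from bus1_cmd_peer_reset, grouped. */
struct bus1_peer_limits {
	uint32_t max_slices;
	uint32_t max_handles;
	uint32_t max_inflight_bytes;
	uint32_t max_inflight_fds;
};

/*
 * Hypothetical prototype: fd, flags, peer_flags, limits and size make five
 * arguments, instead of the seven needed when every limit is passed
 * separately. The size argument leaves room to grow the struct later.
 */
static long bus1_peer_reset_sketch(int fd, unsigned int flags,
				   unsigned int peer_flags,
				   const struct bus1_peer_limits *limits,
				   size_t size)
{
	(void)flags;
	(void)peer_flags;

	if (fd < 0) {
		errno = EBADF;
		return -1;
	}
	/* Only the one known struct size is accepted in this sketch. */
	if (!limits || size != sizeof(*limits)) {
		errno = EINVAL;
		return -1;
	}
	return 0;
}
```

The trade-off is that the arguments are again hidden behind a pointer, which
gives up part of the tracing and filtering advantage that motivated
syscalls in the first place.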

Re: [RFC v1 02/14] bus1: provide stub cdev /dev/bus1

2016-10-26 Thread Tom Gundersen
On Thu, Oct 27, 2016 at 1:19 AM, Andy Lutomirski  wrote:
> This may have been covered elsewhere, but could this use syscalls instead?

Yes, syscalls would work essentially the same. For now, we are using a
cdev as it makes it a lot more convenient to develop and test as an
out-of-tree module, but that could be changed easily before the final
submission, if that's what we want.

Cheers,

Tom


Re: [RFC v1 02/14] bus1: provide stub cdev /dev/bus1

2016-10-26 Thread Andy Lutomirski
On Oct 26, 2016 12:21 PM, "David Herrmann"  wrote:
>
> From: Tom Gundersen 
>
> Add the CONFIG_BUS1 option to enable the bus1 kernel messaging bus. If
> enabled, provide the bus1.ko module with a stub cdev /dev/bus1. So far
> it does not expose any API, but the full intended uapi is provided in
> include/uapi/linux/bus1.h already.
>

This may have been covered elsewhere, but could this use syscalls instead?


[RFC v1 02/14] bus1: provide stub cdev /dev/bus1

2016-10-26 Thread David Herrmann
From: Tom Gundersen 

Add the CONFIG_BUS1 option to enable the bus1 kernel messaging bus. If
enabled, provide the bus1.ko module with a stub cdev /dev/bus1. So far
it does not expose any API, but the full intended uapi is provided in
include/uapi/linux/bus1.h already.

Signed-off-by: Tom Gundersen 
Signed-off-by: David Herrmann 
---
 include/uapi/linux/bus1.h | 138 ++
 init/Kconfig              |  17 ++
 ipc/Makefile              |   1 +
 ipc/bus1/Makefile         |   6 ++
 ipc/bus1/main.c           |  80 +++
 ipc/bus1/main.h           |  74 +
 ipc/bus1/tests.c          |  19 +++
 ipc/bus1/tests.h          |  32 +++
 8 files changed, 367 insertions(+)
 create mode 100644 include/uapi/linux/bus1.h
 create mode 100644 ipc/bus1/Makefile
 create mode 100644 ipc/bus1/main.c
 create mode 100644 ipc/bus1/main.h
 create mode 100644 ipc/bus1/tests.c
 create mode 100644 ipc/bus1/tests.h

diff --git a/include/uapi/linux/bus1.h b/include/uapi/linux/bus1.h
new file mode 100644
index 000..8ec3357
--- /dev/null
+++ b/include/uapi/linux/bus1.h
@@ -0,0 +1,138 @@
+#ifndef _UAPI_LINUX_BUS1_H
+#define _UAPI_LINUX_BUS1_H
+
+/*
+ * Copyright (C) 2013-2016 Red Hat, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#include 
+#include 
+
+#define BUS1_FD_MAX            (256)
+
+#define BUS1_IOCTL_MAGIC       0x96
+#define BUS1_HANDLE_INVALID    ((__u64)-1)
+#define BUS1_OFFSET_INVALID    ((__u64)-1)
+
+enum {
+   BUS1_HANDLE_FLAG_MANAGED        = 1ULL <<  0,
+   BUS1_HANDLE_FLAG_REMOTE         = 1ULL <<  1,
+};
+
+enum {
+   BUS1_PEER_FLAG_WANT_SECCTX      = 1ULL <<  0,
+};
+
+enum {
+   BUS1_PEER_RESET_FLAG_FLUSH      = 1ULL <<  0,
+   BUS1_PEER_RESET_FLAG_FLUSH_SEED = 1ULL <<  1,
+};
+
+struct bus1_cmd_peer_reset {
+   __u64 flags;
+   __u64 peer_flags;
+   __u32 max_slices;
+   __u32 max_handles;
+   __u32 max_inflight_bytes;
+   __u32 max_inflight_fds;
+} __attribute__((__aligned__(8)));
+
+struct bus1_cmd_handle_transfer {
+   __u64 flags;
+   __u64 src_handle;
+   __u64 dst_fd;
+   __u64 dst_handle;
+} __attribute__((__aligned__(8)));
+
+enum {
+   BUS1_NODES_DESTROY_FLAG_RELEASE_HANDLES = 1ULL <<  0,
+};
+
+struct bus1_cmd_nodes_destroy {
+   __u64 flags;
+   __u64 ptr_nodes;
+   __u64 n_nodes;
+} __attribute__((__aligned__(8)));
+
+enum {
+   BUS1_SEND_FLAG_CONTINUE = 1ULL <<  0,
+   BUS1_SEND_FLAG_SEED     = 1ULL <<  1,
+};
+
+struct bus1_cmd_send {
+   __u64 flags;
+   __u64 ptr_destinations;
+   __u64 ptr_errors;
+   __u64 n_destinations;
+   __u64 ptr_vecs;
+   __u64 n_vecs;
+   __u64 ptr_handles;
+   __u64 n_handles;
+   __u64 ptr_fds;
+   __u64 n_fds;
+} __attribute__((__aligned__(8)));
+
+enum {
+   BUS1_RECV_FLAG_PEEK        = 1ULL <<  0,
+   BUS1_RECV_FLAG_SEED        = 1ULL <<  1,
+   BUS1_RECV_FLAG_INSTALL_FDS = 1ULL <<  2,
+};
+
+enum {
+   BUS1_MSG_NONE,
+   BUS1_MSG_DATA,
+   BUS1_MSG_NODE_DESTROY,
+   BUS1_MSG_NODE_RELEASE,
+};
+
+enum {
+   BUS1_MSG_FLAG_HAS_SECCTX = 1ULL <<  0,
+   BUS1_MSG_FLAG_CONTINUE   = 1ULL <<  1,
+};
+
+struct bus1_cmd_recv {
+   __u64 flags;
+   __u64 max_offset;
+   struct {
+       __u64 type;
+       __u64 flags;
+       __u64 destination;
+       __u32 uid;
+       __u32 gid;
+       __u32 pid;
+       __u32 tid;
+       __u64 offset;
+       __u64 n_bytes;
+       __u64 n_handles;
+       __u64 n_fds;
+       __u64 n_secctx;
+   } __attribute__((__aligned__(8))) msg;
+} __attribute__((__aligned__(8)));
+
+enum {
+   BUS1_CMD_PEER_DISCONNECT = _IOWR(BUS1_IOCTL_MAGIC, 0x00,
+                                    __u64),
+   BUS1_CMD_PEER_QUERY      = _IOWR(BUS1_IOCTL_MAGIC, 0x01,
+                                    struct bus1_cmd_peer_reset),
+   BUS1_CMD_PEER_RESET      = _IOWR(BUS1_IOCTL_MAGIC, 0x02,
+                                    struct bus1_cmd_peer_reset),
+   BUS1_CMD_HANDLE_RELEASE  = _IOWR(BUS1_IOCTL_MAGIC, 0x10,
+                                    __u64),
+   BUS1_CMD_HANDLE_TRANSFER = _IOWR(BUS1_IOCTL_MAGIC, 0x11,
+
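
The archived patch is truncated at this point. As a side note on the
command numbers above, each _IOWR() value packs a direction, the magic
byte, a command number and the payload size into one integer. The sketch
below decodes the first command using the standard macros from
<linux/ioctl.h>. It is Linux-specific, and the example_* names are
hypothetical and not part of the patch.

```c
#include <stdint.h>
#include <linux/ioctl.h>

/* Copied from the RFC header; not a released kernel ABI. */
#define BUS1_IOCTL_MAGIC 0x96

/* First command of the enum: the payload is a single 64-bit value. */
#define EXAMPLE_BUS1_CMD_PEER_DISCONNECT \
	_IOWR(BUS1_IOCTL_MAGIC, 0x00, uint64_t)

/* Hypothetical container for the fields encoded in an ioctl number. */
struct example_ioctl_fields {
	unsigned int type;	/* magic byte, 0x96 for bus1 */
	unsigned int nr;	/* command number within the magic */
	unsigned int size;	/* payload size in bytes */
	int read_write;		/* 1 if data flows in both directions */
};

static struct example_ioctl_fields example_decode(unsigned long cmd)
{
	struct example_ioctl_fields f;

	f.type = _IOC_TYPE(cmd);
	f.nr = _IOC_NR(cmd);
	f.size = _IOC_SIZE(cmd);
	f.read_write = _IOC_DIR(cmd) == (_IOC_READ | _IOC_WRITE);
	return f;
}
```

Because the payload size is part of the number, changing a struct's size
silently changes the command value, which is one more reason the uapi
structs are kept at a fixed, aligned layout.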