Re: kdbus: to merge or not to merge?

2015-08-10 Thread Linus Torvalds
On Sun, Aug 9, 2015 at 3:11 PM, Daniel Mack wrote:
>
> The kdbus implementation is actually comparable to two tasks X and Y
> which both have their own buffer file open and mmap()ed, and they both
> pass their FD to the other side. If X now writes to Y's file, and that
> is causing a page fault, X is accounted for it, correct?

No.

With shared memory, there are no particularly obvious accounting rules.
In particular, when somebody maps an already allocated page, it's
basically a no-op from a memory allocation standpoint.

The whole "this is equivalent to the user space daemon" argument is
bogus. Shared memory is very very different from just sending messages
(copying the buffers) and is generally much harder to get a handle on.
And that's what you should be comparing to.

The old "communicate over a unix domain socket" had pretty clear
accounting rules, and while unix domain sockets have some horribly
nasty issues (most are about passing fd's around) that isn't one of
them.

Anyway, the real issue for me here is that Andy is reporting all these
actual real problems that happen in practice, and the developer
replies are dismissing them on totally irrelevant grounds ("this
should be equivalent to something entirely different that nobody ever
does" or "well, people could opt out, even if they didn't" yadda yadda
yadda).

For example, the whole "tasks X and Y communicate over shmem" is
irrelevant. Normally, when people write those kinds of applications,
they are just regular applications. If they have issues, nobody else
cares. Andy's concern is about one of X/Y being a system daemon, where
tricking it into doing bad things ends up effectively killing the
system - whether the *kernel* is alive or not and did the right thing
is almost entirely immaterial.

So please. When Andy sends a bug report with an exploit that kills his
system, just stop responding with irrelevant theoretical arguments. It
is not appropriate.  Instead, acknowledge the problem and work on
fixing it, none of this "but but but it's all the same" crap.

 Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: kdbus: to merge or not to merge?

2015-08-09 Thread David Lang

On Sun, 9 Aug 2015, Greg Kroah-Hartman wrote:


> The issue is with userspace clients opting in to receive all
> NameOwnerChanged messages on the bus, which is not a good idea as they
> constantly get woken up and process them, which is why the CPU was
> pegged.  This issue should now be fixed in Rawhide for some of the
> packages we found that were doing this. Maintainers of other packages
> have been informed.  End result, no one has ever really tested sending
> "bad" messages to the current system as all existing dbus users try to
> be "good actors".  Thanks to Andy's testing, these apps should all now
> become much more robust.


Does it require elevated privileges to opt to receive all NameOwnerChanged
messages on the bus? Is it the default unless the apps opt for something more
restrictive? Or is it somewhere in between?

I was under the impression that the days of writing system-level stuff that
assumes that all userspace apps are going to 'play nice' went out a decade or
more ago. It's fine if the userspace app can kill itself, or possibly even the
user it's running as, but being able to kill apps running as other users, let
alone the whole system, is a problem nowadays.


It may be able to happen in a default system, but this is why cgroups and 
namespaces have been created, to give the system admin the ability to limit the 
resources that any one app can consume. Introducing a new mechanism that allows 
one user to consume resources allocated to another and kill the system without 
providing a kernel level mechanism to limit the damage (as opposed to fixing 
individual apps) seems rather short-sighted at best.


David Lang


Re: kdbus: to merge or not to merge?

2015-08-09 Thread Andy Lutomirski
On Sun, Aug 9, 2015 at 3:11 PM, Daniel Mack wrote:
>
> Internally, the connection pool is simply a shmem backed file. From the
> context of the HELLO ioctl, we are calling into shmem_file_setup(), so
> the file is eventually owned by the task connecting to the bus. One
> reason why we do the shmem file allocation in the kernel and on behalf
> of the userspace task is that we clear the
> VM_MAYWRITE bit to prevent the task from writing to the pool through its
> mapped buffer. We also do not set VM_NORESERVE, so the entire buffer is
> pre-accounted for the task that created the connection.

I don't have access to the system I've been using for testing right
now, but I wonder how the kdbus pools stack up against the rest of the
memory allocations for the average desktop process.

>
> The pool implementation uses an r/b tree to organize the buffer into
> slices. Those slices can be kept by userspace as long as the parsing
> implementation needs to have access to them. When finished, the slices
> are freed. A simple ring buffer cannot cope with the gaps that emerge by
> that.
>
> When a connection buffer is written to, it is done from the context of
> another task which calls into the kdbus code through one of the ioctls.
> The memcg implementation should hence charge the task that acts as
> writer, which is maybe not ideal but can be changed easily with some
> addition to the internal APIs. We omitted it for the current version,
> which is non-intrusive with regards to other kernel subsystems.
>

This has at least the following weakness.  I can very easily get
systemd to write to my shmem-backed pool: simply subscribe to one of
its broadcasts.  If I cause such a write to be very slow
(intentionally or otherwise), then PID 1 blocks.

If you change the memcg code to charge me instead of PID 1 (as it
should IMO), then the problem gets worse.

> The kdbus implementation is actually comparable to two tasks X and Y
> which both have their own buffer file open and mmap()ed, and they both
> pass their FD to the other side. If X now writes to Y's file, and that
> is causing a page fault, X is accounted for it, correct?

If PID 1 accepted a memfd from me (even a properly sealed one) and
wrote to it, I would wonder whether it were actually a good idea.

Does this scheme have any actual measurable advantage over the
traditional model of a small non-paged buffer in the kernel (i.e. the
way sockets work) with explicit userspace memfd use as appropriate?

--Andy


Re: kdbus: to merge or not to merge?

2015-08-09 Thread Daniel Mack
On 08/09/2015 09:00 PM, Greg Kroah-Hartman wrote:
> In chatting with Daniel on IRC, he is writing up a summary of how the
> kdbus memory pools work in more detail, and he said he would send that
> out in a day or so, so that everyone can review.

Yes, let me quickly describe again how the kdbus pool logic works.

Every bus connection (peer) owns a buffer which is used in order to
receive payloads. Such payloads are either messages sent from other
connections, notifications, or answer structures returned by query
commands (name lists, etc).

In order to avoid the kernel having to maintain an internal buffer that
the connections then read from with an extra command, we decided to let
the connections own their buffer directly, so they can mmap() the memory
into their task. Allocating a local buffer to collect asynchronous
messages is what they would need to do anyway, so we implemented a
short-cut that allows the kernel to directly access the memory and write
to it. The size of this buffer pool is configured by each connection
individually, during the HELLO call, so the kernel interface is as
flexible as any other memory allocation scheme the kernel provides and
is subject to the same limits.

Internally, the connection pool is simply a shmem backed file. From the
context of the HELLO ioctl, we are calling into shmem_file_setup(), so
the file is eventually owned by the task connecting to the bus. One
reason why we do the shmem file allocation in the kernel and on behalf
of the userspace task is that we clear the
VM_MAYWRITE bit to prevent the task from writing to the pool through its
mapped buffer. We also do not set VM_NORESERVE, so the entire buffer is
pre-accounted for the task that created the connection.
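
As a rough userspace analogy of this setup (not the kdbus code itself), the
sketch below creates a shmem-backed file, maps it read-only on the receiving
side, and fills it through the file descriptor on the writing side. It assumes
Linux and Python 3.8+ for os.memfd_create; make_pool and POOL_SIZE are made-up
names, and a read-only mapping only approximates clearing VM_MAYWRITE.

```python
import mmap
import os
import tempfile

POOL_SIZE = 64 * 1024  # hypothetical size; kdbus clients pick theirs in HELLO


def make_pool(size):
    """Create a shmem-backed file of `size` bytes and return its fd."""
    try:
        fd = os.memfd_create("pool")  # Linux-only, Python 3.8+
    except (AttributeError, OSError):
        # Fallback (assumption): an unlinked temporary file instead of a memfd.
        fd, path = tempfile.mkstemp()
        os.unlink(path)
    os.ftruncate(fd, size)
    return fd


fd = make_pool(POOL_SIZE)

# Receiver side: map the pool read-only, so it cannot be modified
# through the mapping.
view = mmap.mmap(fd, POOL_SIZE, access=mmap.ACCESS_READ)

# Writer side: fill the pool through the file descriptor instead.
os.pwrite(fd, b"hello", 0)

# The receiver sees the payload directly in its mapping.
received = view[:5]

try:
    view[0:5] = b"HELLO"  # writing through the read-only mapping is refused
    write_refused = False
except TypeError:
    write_refused = True

view.close()
os.close(fd)
```

Nothing in this sketch says who gets charged for the pages that the write
faults in, which is exactly the accounting question under discussion.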

The pool implementation uses an r/b tree to organize the buffer into
slices. Those slices can be kept by userspace as long as the parsing
implementation needs to have access to them. When finished, the slices
are freed. A simple ring buffer cannot cope with the gaps that this
leaves behind.
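
To illustrate the kind of gap meant here, the following toy first-fit slice
pool may help. It is only a sketch: it keeps the slices in a sorted Python list
rather than an r/b tree, and SlicePool, alloc and free are hypothetical names,
not kdbus interfaces.

```python
import bisect


class SlicePool:
    """Toy pool: hands out slices (by offset) from a fixed-size buffer."""

    def __init__(self, size):
        self.size = size
        self.slices = []  # sorted list of (offset, size) currently in use

    def alloc(self, size):
        """First-fit: return the lowest offset with `size` free bytes, or None."""
        cursor = 0
        for off, length in self.slices:
            if off - cursor >= size:
                break  # the gap in front of this slice is large enough
            cursor = off + length
        else:
            # No gap between slices fits; try the space after the last slice.
            if self.size - cursor < size:
                return None
        bisect.insort(self.slices, (cursor, size))
        return cursor

    def free(self, offset):
        """Release the slice that starts at `offset`."""
        i = bisect.bisect_left(self.slices, (offset, 0))
        if i == len(self.slices) or self.slices[i][0] != offset:
            raise KeyError(offset)
        del self.slices[i]


pool = SlicePool(120)
a = pool.alloc(40)  # offset 0
b = pool.alloc(40)  # offset 40
c = pool.alloc(40)  # offset 80
pool.free(b)        # the middle slice is released first, leaving a gap
d = pool.alloc(30)  # reuses the gap at offset 40
e = pool.alloc(20)  # only 10 bytes remain free, so this fails
```

A ring buffer could only reclaim space in arrival order, so the gap left by
the middle slice would stay unusable until the slice at offset 0 is freed too.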

When a connection buffer is written to, it is done from the context of
another task which calls into the kdbus code through one of the ioctls.
The memcg implementation should hence charge the task that acts as
writer, which is maybe not ideal but can be changed easily with some
addition to the internal APIs. We omitted it for the current version,
which is non-intrusive with regards to other kernel subsystems.

The kdbus implementation is actually comparable to two tasks X and Y
which both have their own buffer file open and mmap()ed, and they both
pass their FD to the other side. If X now writes to Y's file, and that
is causing a page fault, X is accounted for it, correct?

The kernel does *not* do any memory allocation to buffer payload, and
all other allocations (for instance, to keep around the internal state
of a connection, names etc) are subject to conservatively chosen
limitations. There is no unbounded memory allocation in kdbus that I am
aware of. If there was, it would clearly be a bug.

Addressing the point Andy made earlier: yes, due to memory
overcommitment, OOM situations may happen with certain patterns, but the
kernel should have the same measures to deal with them that it already
has with other types of shared userspace memory. Right?

Hope that all makes sense, we're open to discussions around the desired
accounting details. I've copied linux-mm to let more people have a look
into this again.


Thanks,
Daniel


Re: kdbus: to merge or not to merge?

2015-08-09 Thread Greg Kroah-Hartman
On Fri, Aug 07, 2015 at 06:26:31PM +0300, Linus Torvalds wrote:
> User space memory allocation is not AT ALL the same thing as kdbus.
> Kernel allocations are very very different from user allocations. We
> have reasonable, fairly tested, and generic models for handling user
> space memory allocation issues - limiting, debugging, failing, and
> handling catastrophes (ie oom). And no, even that doesn't always work
> perfectly, but at least there is a *lot* of support for it, and this
> is not some special case.

The memory in this case is a shmem file that is created by the kernel,
but on behalf of the bus client task, which will eventually own it. As
discussed with the mm developers, the same logic for accounting, OOM
handling, etc. applies to the kdbus shmem buffers, as they are written
to from the context of another task.  If this is mistaken, then yes, you
are right, and the code will have to be changed.

> This discussion has been full of kdbus people ignoring Andy saying "it
> worked with the user space version, it killed the machine with kdbus".
> And now people trying to claim the issues are the same. HELL NO.

Andy found some great bugs with regards to flooding the bus with
requests, which have not been ignored at all.  The same issue is present
in dbus today, but the kdbus code runs faster, so more messages were
being sent than with the current userspace dbus daemon, and the machine
becomes unresponsive more easily.

The issue is with userspace clients opting in to receive all
NameOwnerChanged messages on the bus, which is not a good idea as they
constantly get woken up and process them, which is why the CPU was
pegged.  This issue should now be fixed in Rawhide for some of the
packages we found that were doing this. Maintainers of other packages
have been informed.  End result, no one has ever really tested sending
"bad" messages to the current system as all existing dbus users try to
be "good actors".  Thanks to Andy's testing, these apps should all now
become much more robust.

In chatting with Daniel on IRC, he is writing up a summary of how the
kdbus memory pools work in more detail, and he said he would send that
out in a day or so, so that everyone can review.

thanks,

greg k-h



Re: kdbus: to merge or not to merge?

2015-08-07 Thread cee1
2015-08-07 2:43 GMT+08:00 Andy Lutomirski:
> On Thu, Aug 6, 2015 at 11:14 AM, Daniel Mack wrote:
>> On 08/06/2015 05:21 PM, Andy Lutomirski wrote:
>>> Maybe gdbus really does use kdbus already, but on
>>> very brief inspection it looked like it didn't at least on my test VM.
>>
>> No, it's not in any released version yet. The patches for that are being
>> worked on though and look promising.
>>
>>> If the client buffers on !EPOLLOUT and has a monster buffer, then
>>> that's the client's problem.
>>>
>>> If every single program has a monster buffer, then it's everyone's
>>> problem, and the size of the problem gets multiplied by the number of
>>> programs.
>>
>> The size of the memory pool of a bus client is chosen by the client
>> itself individually during the HELLO call. It's pretty much the same as
>> if the client allocated the buffer itself, except that the kernel does
>> it on their behalf.
>>
>> Also note that kdbus features a peer-to-peer based quota accounting
>> logic, so a single bus connection can not DOS another one by filling its
>> buffer.
>
> I haven't looked at the quota code at all.
>
> Nonetheless, it looks like the slice logic (aside: it looks *way* more
> complicated than necessary -- what's wrong with circular buffers)
> will, under most (but not all!) workloads, concentrate access to a
> smallish fraction of the pool.  This is IMO bad, since it means that
> most of the time most of the pool will remain uncommitted.  If, at
> some point, something causes the access pattern to change and hit all
> the pages (even just once), suddenly all of the pools get committed,
> and your memory usage blows up.
>
> Again, please stop blaming the clients.  In practice, kdbus is a
> system involving the kernel, systemd, sd-bus, and other stuff, mostly
> written by the same people.  If kdbus gets merged and it survives but
> half the clients blow up and peoples' systems fall over, that's not
> okay.
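
The commit blow-up described in the quoted text can be put into numbers with
a toy model. Every figure below is an assumption chosen for illustration (4 KiB
pages, 16 MiB pools, 100 connections, 512-byte messages); this is not a
measurement of kdbus.

```python
PAGE_SIZE = 4096              # assumed page size
POOL_SIZE = 16 * 1024 * 1024  # hypothetical per-connection pool size
CONNECTIONS = 100             # hypothetical number of bus connections
MSG_SIZE = 512                # hypothetical message size


def committed_bytes(write_offsets, msg_size=MSG_SIZE):
    """Bytes of pool memory backed by real pages after the given writes."""
    touched = set()
    for off in write_offsets:
        first = off // PAGE_SIZE
        last = (off + msg_size - 1) // PAGE_SIZE
        touched.update(range(first, last + 1))
    return len(touched) * PAGE_SIZE


# Usual pattern: each slice is freed before the next message arrives,
# so every message lands at offset 0 and only one page is ever touched.
usual = committed_bytes([0] * 1000)

# Changed pattern: the receiver holds on to its slices, so writes march
# through the whole pool and every page gets touched once.
burst = committed_bytes(range(0, POOL_SIZE, MSG_SIZE))

total_usual = usual * CONNECTIONS  # 409600 bytes (400 KiB)
total_burst = burst * CONNECTIONS  # 1677721600 bytes (1600 MiB)
```

Under these assumptions the same set of connections goes from 400 KiB to
1600 MiB of committed memory after a single pass over each pool.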

Any comments on the questions raised by Andy?
In kdbus, the sender writes to a page in the receiver's tmpfs space. Does
that help the receiver escape its memcg limit, or does it count against
the receiver's limit?

Also, I'm curious about similar problems in these cases:
1. A UNIX domain server (SOCK_STREAM or SOCK_DGRAM) replies to its
clients, but some clients consume the messages __too slowly__. Will the
server block? Or can it serve other clients instead of blocking?

2. Processes open NETLINK_KOBJECT_UEVENT netlink sockets, but some of
them consume uevents __too slowly__ while uevents are continually
triggered. Will the system block? Or will those processes eventually
lose some uevents?

3. Processes watch a directory via inotify, but some of them consume
events __too slowly__ while file operations are continually performed on
the directory. Will the system block? Or will those processes eventually
lose some events?
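
Case 1 at least can be tried from userspace. The following is a small
experiment rather than a definitive answer: it assumes Linux-style
SOCK_STREAM Unix domain sockets, uses a socketpair in place of a real server
and client, and the variable names are made up.

```python
import socket

server, client = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
server.setblocking(False)  # a blocking socket would stall in send() instead

chunk = b"x" * 4096
queued = 0
buffer_full = False
try:
    while True:
        queued += server.send(chunk)  # the client never reads
except BlockingIOError:
    buffer_full = True  # the bounded kernel buffer is full (EAGAIN)

# At this point the server is free to go and serve other clients.
# Once the slow client drains its data, sending works again.
drained = 0
while drained < queued:
    drained += len(client.recv(65536))
resumed = server.send(chunk) > 0

server.close()
client.close()
```

In this sketch the amount of data queued on behalf of the slow reader is
capped by the socket buffer, and a non-blocking sender gets an error back
instead of stalling.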



-- 
Regards,

- cee1


Re: kdbus: to merge or not to merge?

2015-08-07 Thread Andy Lutomirski
On Fri, Aug 7, 2015 at 7:40 AM, Daniel Mack wrote:
> On 08/06/2015 08:43 PM, Andy Lutomirski wrote:
>> Nonetheless, it looks like the slice logic (aside: it looks *way* more
>> complicated than necessary -- what's wrong with circular buffers)
>> will, under most (but not all!) workloads, concentrate access to a
>> smallish fraction of the pool.  This is IMO bad, since it means that
>> most of the time most of the pool will remain uncommitted.  If, at
>> some point, something causes the access pattern to change and hit all
>> the pages (even just once), suddenly all of the pools get committed,
>> and your memory usage blows up.
>
> That's a general problem with memory overcommitment, and not specific to
> kdbus. IOW: You'd have the same problem with a similar logic implemented
> in userspace, right?
>

Sure, except that, if it's in userspace and it starts causing
problems, then userspace can fix it without running into kernel ABI
stability issues.

--Andy


Re: kdbus: to merge or not to merge?

2015-08-07 Thread Daniel Mack
On 08/06/2015 08:43 PM, Andy Lutomirski wrote:
> Nonetheless, it looks like the slice logic (aside: it looks *way* more
> complicated than necessary -- what's wrong with circular buffers)
> will, under most (but not all!) workloads, concentrate access to a
> smallish fraction of the pool.  This is IMO bad, since it means that
> most of the time most of the pool will remain uncommitted.  If, at
> some point, something causes the access pattern to change and hit all
> the pages (even just once), suddenly all of the pools get committed,
> and your memory usage blows up.

That's a general problem with memory overcommitment, and not specific to
kdbus. IOW: You'd have the same problem with a similar logic implemented
in userspace, right?


Daniel



Re: kdbus: to merge or not to merge?

2015-08-07 Thread cee1
2015-08-07 2:43 GMT+08:00 Andy Lutomirski <l...@amacapital.net>:
> On Thu, Aug 6, 2015 at 11:14 AM, Daniel Mack <dan...@zonque.org> wrote:
>> On 08/06/2015 05:21 PM, Andy Lutomirski wrote:
>>> Maybe gdbus really does use kdbus already, but on
>>> very brief inspection it looked like it didn't at least on my test VM.
>>
>> No, it's not in any released version yet. The patches for that are being
>> worked on though and look promising.
>>
>>> If the client buffers on !EPOLLOUT and has a monster buffer, then
>>> that's the client's problem.
>>>
>>> If every single program has a monster buffer, then it's everyone's
>>> problem, and the size of the problem gets multiplied by the number of
>>> programs.
>>
>> The size of the memory pool of a bus client is chosen by the client
>> itself individually during the HELLO call. It's pretty much the same as
>> if the client allocated the buffer itself, except that the kernel does
>> it on their behalf.
>>
>> Also note that kdbus features a peer-to-peer based quota accounting
>> logic, so a single bus connection can not DOS another one by filling its
>> buffer.
>
> I haven't looked at the quota code at all.
>
> Nonetheless, it looks like the slice logic (aside: it looks *way* more
> complicated than necessary -- what's wrong with circular buffers)
> will, under most (but not all!) workloads, concentrate access to a
> smallish fraction of the pool.  This is IMO bad, since it means that
> most of the time most of the pool will remain uncommitted.  If, at
> some point, something causes the access pattern to change and hit all
> the pages (even just once), suddenly all of the pools get committed,
> and your memory usage blows up.
>
> Again, please stop blaming the clients.  In practice, kdbus is a
> system involving the kernel, systemd, sd-bus, and other stuff, mostly
> written by the same people.  If kdbus gets merged and it survives but
> half the clients blow up and people's systems fall over, that's not
> okay.

Any comments on the questions Andy raised?
In kdbus, the sender writes to pages in the receiver's tmpfs space. Does
that help the receiver escape its memcg limit, or is it charged against
the receiver's limit?

Also, I'm curious about similar problems in these cases:
1. A UNIX domain server (SOCK_STREAM or SOCK_DGRAM) replies to its
clients, but some clients consume the messages __too slowly__. Will the
server block? Or can it serve other clients instead of blocking?

2. Processes open NETLINK_KOBJECT_UEVENT netlink sockets, but some of
them consume uevents __too slowly__ while uevents are continually
triggered. Will the system block? Or will those processes eventually
lose some uevents?

3. Processes watch a directory via inotify, but some of them consume
events __too slowly__ while file operations are continually performed on
the directory. Will the system block? Or will those processes eventually
lose some events?



-- 
Regards,

- cee1


Re: kdbus: to merge or not to merge?

2015-08-06 Thread Andy Lutomirski
On Thu, Aug 6, 2015 at 11:14 AM, Daniel Mack  wrote:
> On 08/06/2015 05:21 PM, Andy Lutomirski wrote:
>> Maybe gdbus really does use kdbus already, but on
>> very brief inspection it looked like it didn't at least on my test VM.
>
> No, it's not in any released version yet. The patches for that are being
> worked on though and look promising.
>
>> If the client buffers on !EPOLLOUT and has a monster buffer, then
>> that's the client's problem.
>>
>> If every single program has a monster buffer, then it's everyone's
>> problem, and the size of the problem gets multiplied by the number of
>> programs.
>
> The size of the memory pool of a bus client is chosen by the client
> itself individually during the HELLO call. It's pretty much the same as
> if the client allocated the buffer itself, except that the kernel does
> it on their behalf.
>
> Also note that kdbus features a peer-to-peer based quota accounting
> logic, so a single bus connection can not DOS another one by filling its
> buffer.

I haven't looked at the quota code at all.

Nonetheless, it looks like the slice logic (aside: it looks *way* more
complicated than necessary -- what's wrong with circular buffers)
will, under most (but not all!) workloads, concentrate access to a
smallish fraction of the pool.  This is IMO bad, since it means that
most of the time most of the pool will remain uncommitted.  If, at
some point, something causes the access pattern to change and hit all
the pages (even just once), suddenly all of the pools get committed,
and your memory usage blows up.
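
To put rough numbers on the commit concern above, here is a minimal
sketch. It is not kdbus code: the pool size, the client count, and the
helper names are assumptions picked purely for illustration. It counts
which pages a set of writes touches and sums that across all clients.

```python
PAGE_SIZE = 4096


def pages_touched(writes, page_size=PAGE_SIZE):
    """Return the set of page indices touched by (offset, length) writes."""
    touched = set()
    for offset, length in writes:
        if length <= 0:
            continue
        first = offset // page_size
        last = (offset + length - 1) // page_size
        touched.update(range(first, last + 1))
    return touched


def committed_bytes(writes_per_client, page_size=PAGE_SIZE):
    """Total memory committed across all clients' pools, in bytes."""
    return sum(len(pages_touched(writes, page_size)) * page_size
               for writes in writes_per_client)


POOL_SIZE = 16 * 1024 * 1024  # hypothetical 16 MiB pool per client
CLIENTS = 100                 # hypothetical number of bus clients

# Usual pattern: each client's messages keep landing in the first 64 KiB.
typical = [[(0, 64 * 1024)] for _ in range(CLIENTS)]
# Changed pattern: something touches every page of every pool just once.
sweep = [[(0, POOL_SIZE)] for _ in range(CLIENTS)]

print(committed_bytes(typical) / 2**20, "MiB committed (typical)")
print(committed_bytes(sweep) / 2**20, "MiB committed (after one sweep)")
```

Under these made-up numbers the committed total goes from a few MiB to
the full pool size times the number of clients, which is the kind of
jump described above.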

Again, please stop blaming the clients.  In practice, kdbus is a
system involving the kernel, systemd, sd-bus, and other stuff, mostly
written by the same people.  If kdbus gets merged and it survives but
half the clients blow up and people's systems fall over, that's not
okay.

--Andy


Re: kdbus: to merge or not to merge?

2015-08-06 Thread Daniel Mack
On 08/06/2015 05:21 PM, Andy Lutomirski wrote:
> Maybe gdbus really does use kdbus already, but on
> very brief inspection it looked like it didn't at least on my test VM.

No, it's not in any released version yet. The patches for that are being
worked on though and look promising.

> If the client buffers on !EPOLLOUT and has a monster buffer, then
> that's the client's problem.
> 
> If every single program has a monster buffer, then it's everyone's
> problem, and the size of the problem gets multiplied by the number of
> programs.

The size of the memory pool of a bus client is chosen by the client
itself individually during the HELLO call. It's pretty much the same as
if the client allocated the buffer itself, except that the kernel does
it on their behalf.

Also note that kdbus features a peer-to-peer based quota accounting
logic, so a single bus connection can not DOS another one by filling its
buffer.


Thanks,
Daniel


Re: kdbus: to merge or not to merge?

2015-08-06 Thread Daniel Mack
On 08/06/2015 05:27 PM, Andy Lutomirski wrote:
>> In DBus (both kdbus and DBus1), such matches are installed on the
>> NameOwnerChanged signal, and they can be either specific to a single ID,
>> or broad, which will make them match on any ID. There's actually no
>> reason for applications to install unspecific matches, but if they do,
>> they will of course get what they asked for, and are woken up on every
>> ID that is added to or removed from the bus. What you're seeing in your
>> system profile is that some applications misbehave and install
>> unspecific matches when they shouldn't. That's a userspace bug that
>> needs fixing. Two candidates were actually in the systemd code base
>> (logind and PID1), and both are now patched.
>
> Can you point me at the patch?

  https://github.com/systemd/systemd/pull/876
  https://github.com/systemd/systemd/pull/887

firewalld and possibly some other applications in the Fedora default
install use python-slip, a convenience library that currently
unconditionally installs the broad matches. I filed a bug with patches here:

  https://fedorahosted.org/python-slip/ticket/2


And I filed more bugs for some GNOME components.


Thanks,
Daniel



Re: kdbus: to merge or not to merge?

2015-08-06 Thread Andy Lutomirski
On Thu, Aug 6, 2015 at 12:06 AM, Daniel Mack  wrote:
> Hi Andy,
>
> On 08/05/2015 02:18 AM, Andy Lutomirski wrote:
>> I added the missing sd_bus_unref call.
>>
>> With userspace dbus, my program takes 95% CPU and dbus-daemon takes
>> 88% CPU or so.
>>
>> With kdbus, I see abuse-bus (my test), systemd-journald,
>> systemd-bus-proxy, auditd, gnome-shell, mission-control, sedispatch,
>> firewalld, polkitd, NetworkManager, systemd, avahi-daemon, audisp,
>> abrt-dump-jour* (whatever it's called -- it truncated), upowerd, and
>> systemd-logind all taking tons of CPU.  I've listed them in decreasing
>> order of amount of CPU burned -- the top several are taking about as
>> much as is possible.  Load average is over 13.  That's if I run it
>> from a text console while I'm logged in to gnome in a different VT.
>
> That's right, I can reproduce this here. To explain what's going on, let
> me provide some background.
>
> Every time a client connects to kdbus, a new ID is assigned to the
> connection, and other connections which have previously subscribed to
> notifications of type KDBUS_ITEM_ID_ADD or _REMOVE get a notification
> and are woken up so they can dispatch it. By default, no such matches
> exist; applications have to explicitly opt in if they are interested in
> these events.
>
> In DBus (both kdbus and DBus1), such matches are installed on the
> NameOwnerChanged signal, and they can be either specific to a single ID,
> or broad, which will make them match on any ID. There's actually no
> reason for applications to install unspecific matches, but if they do,
> they will of course get what they asked for, and are woken up on every
> ID that is added to or removed from the bus. What you're seeing in your
> system profile is that some applications misbehave and install
> unspecific matches when they shouldn't. That's a userspace bug that
> needs fixing. Two candidates were actually in the systemd code base
> (logind and PID1), and both are now patched.

Can you point me at the patch?

It sounds like that will reduce the scalability issue with this
particular test from whatever userspace overhead exists * number of
clients to just the overhead of looping over all clients and their
matches in the kernel.

--Andy


Re: kdbus: to merge or not to merge?

2015-08-06 Thread Andy Lutomirski
On Aug 6, 2015 1:04 AM, "David Herrmann"  wrote:
> >  Given that all existing prototype userspace that I'm aware of
> > (systemd and its consumers) apparently opts in, I don't really care
> > that the feature is opt-in.
>
> This is just plain wrong. Out of the dozens of dbus applications, you
> found like 9 which are buggy? Two of them are already fixed, the
> maintainers of the other ones notified.
> I'd be interested where you got this notion that "all existing
> prototype userspace [...] opts in".
>

I would say instead that, out of one in-use kdbus library, I found one
that was buggy.  Maybe gdbus really does use kdbus already, but on
very brief inspection it looked like it didn't at least on my test VM.

>
> > Also, you haven't addressed the memory usage issues --
>
> ..because it doesn't change anything. If your IPC is message based and
> async, _someone_ needs to buffer. I don't see the difference between
> buffering locally on !EPOLLOUT or buffering in a shmem pool. In both
> cases, clients have control over the buffer size. If you disagree,
> please _elaborate_.

If the client buffers on !EPOLLOUT and has a monster buffer, then
that's the client's problem.

If every single program has a monster buffer, then it's everyone's
problem, and the size of the problem gets multiplied by the number of
programs.

Also, sensible clients that produce bulk data will throttle on
!EPOLLOUT rather than blindly buffering, but that's not an option when
the huge buffer is on the receiver's end.  Read up on "bufferbloat".
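
As a rough sketch of the difference between throttling and blind
buffering on the sender side: the code below uses a local socket pair
whose receiving end never reads, and a non-blocking sender. The function
names, chunk size, and chunk count are assumptions for illustration. A
real sender would wait for writability (EPOLLOUT) and resume; this
sketch only shows how much data each strategy ends up holding.

```python
import socket

CHUNK = 64 * 1024   # hypothetical message size
COUNT = 256         # hypothetical number of messages (16 MiB in total)


def produce(sock, chunks, throttle):
    """Send chunks on a non-blocking socket.

    Returns (bytes_sent, bytes_held_locally, chunks_deferred).
    """
    backlog = bytearray()  # data held by the sender once the socket is full
    sent = 0
    deferred = 0           # chunks a throttling sender declined to generate
    for chunk in chunks:
        if not backlog:
            try:
                n = sock.send(chunk)
            except BlockingIOError:
                n = 0      # socket buffer is full (the !EPOLLOUT case)
            sent += n
            chunk = chunk[n:]
        if not chunk:
            continue
        if throttle and backlog:
            deferred += 1  # stop producing until the socket drains
        else:
            backlog += chunk
    return sent, len(backlog), deferred


def stalled_receiver_demo(throttle):
    """Run produce() against a receiver that never reads."""
    sender, receiver = socket.socketpair()
    sender.setblocking(False)
    try:
        return produce(sender, [b"x" * CHUNK] * COUNT, throttle)
    finally:
        sender.close()
        receiver.close()


print("throttling:", stalled_receiver_demo(throttle=True))
print("blind buffering:", stalled_receiver_demo(throttle=False))
```

The throttling sender holds at most one partial chunk, while the blind
sender ends up holding nearly everything it generated. Neither choice is
available to the sender when the large buffer lives on the receiver's
side.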

--Andy


Re: kdbus: to merge or not to merge?

2015-08-06 Thread Martin Steigerwald
Am Donnerstag, 6. August 2015, 10:04:57 schrieb David Herrmann:
> >  Given that all existing prototype userspace that I'm aware of
> >
> > (systemd and its consumers) apparently opts in, I don't really care
> > that the feature is opt-in.
> 
> This is just plain wrong. Out of the dozens of dbus applications, you
> found like 9 which are buggy? Two of them are already fixed, the
> maintainers of the other ones notified.
> I'd be interested where you got this notion that "all existing
> prototype userspace [...] opts in".

But these few can create the issues Andy described?

Sure, one can argue that I can set up a stress or stress-ng command line
invocation as root that will basically grind a Linux system to a halt – and
in a way I consider this to be a bug in the kernel as well, but one that has
existed for a long time. But a GUI application running as a user?

How about some robustness regarding what you see as bugs in userspace here?

I think "The bug is not mine" is exactly the same language we have seen here
before. If the kernel relies on bug-free userspace applications in order to
do its job properly, I think it has robustness issues. One certainly wouldn't
want this with any mission-critical realtime OS. I think it is the kernel
that should be in control.

Thanks,
-- 
Martin


Re: kdbus: to merge or not to merge?

2015-08-06 Thread David Herrmann
Hi

On Wed, Aug 5, 2015 at 10:11 PM, Andy Lutomirski  wrote:
> On Wed, Aug 5, 2015 at 12:10 AM, David Herrmann  wrote:
>> Hi
>>
>> On Tue, Aug 4, 2015 at 4:47 PM, Andy Lutomirski  wrote:
>>> On Tue, Aug 4, 2015 at 7:09 AM, David Herrmann wrote:
>>>> This is a bug in the proxy (which is already fixed).
>>>
>>> Should I expect to see it in Rawhide soon?
>>
>> Use this workaround until it does:
>>
>> $ DBUS_SYSTEM_BUS_ADDRESS="kernel:path=/sys/fs/kdbus/0-system/bus" ./your-binary
>>
>
> Which binary is supposed to be run like that?

Your test.

>>> Anyway, the broadcasts that I intended to exercise were
>>> KDBUS_ITEM_ID_REMOVE.  Those appear to be broadcast to everyone,
>>> irrespective of "policy", so long as the "match" thingy allows it.
>>
>> Matches are opt-in, not opt-out. Nobody will get this message unless
>> they opt in.
>>
>
> And what opts in?  Either something's broken, or there's a different
> scalability problem, or a whole pile of kdbus-using programs in Fedora
> Rawhide do, in fact, opt in.

See Daniel's explanation. If applications subscribe to all
notifications, they get what they asked for. I recommend filing bug
reports for the applications in question.

>  Given that all existing prototype userspace that I'm aware of
> (systemd and its consumers) apparently opts in, I don't really care
> that the feature is opt-in.

This is just plain wrong. Out of the dozens of dbus applications, you
found like 9 which are buggy? Two of them are already fixed, the
maintainers of the other ones notified.
I'd be interested where you got this notion that "all existing
prototype userspace [...] opts in".

> Also, given things like this:
>
> commit d27c8057699d164648b7d8c1559fa6529998f89d
> Author: David Herrmann 
> Date:   Tue May 26 09:30:14 2015 +0200
>
> kdbus: forward ID notifications to everyone
>
> it really does seem to me that the point of these ID notifications is
> for everyone to get them.

It's not. This patch just opens the policy so everyone can see those
notifications. By default, it's not delivered to anyone.

> Also, you haven't addressed the memory usage issues --

..because it doesn't change anything. If your IPC is message based and
async, _someone_ needs to buffer. I don't see the difference between
buffering locally on !EPOLLOUT or buffering in a shmem pool. In both
cases, clients have control over the buffer size. If you disagree,
please _elaborate_.

Thanks
David


Re: kdbus: to merge or not to merge?

2015-08-06 Thread Daniel Mack
Hi Andy,

On 08/05/2015 02:18 AM, Andy Lutomirski wrote:
> I added the missing sd_bus_unref call.
> 
> With userspace dbus, my program takes 95% CPU and dbus-daemon takes
> 88% CPU or so.
> 
> With kdbus, I see abuse-bus (my test), systemd-journald,
> systemd-bus-proxy, auditd, gnome-shell, mission-control, sedispatch,
> firewalld, polkitd, NetworkManager, systemd, avahi-daemon, audisp,
> abrt-dump-jour* (whatever it's called -- it truncated), upowerd, and
> systemd-logind all taking tons of CPU.  I've listed them in decreasing
> order of amount of CPU burned -- the top several are taking about as
> much as is possible.  Load average is over 13.  That's if I run it
> from a text console while I'm logged in to gnome in a different VT.

That's right, I can reproduce this here. To explain what's going on, let
me provide some background.

Every time a client connects to kdbus, a new ID is assigned to the
connection, and other connections which have previously subscribed to
notifications of type KDBUS_ITEM_ID_ADD or _REMOVE get a notification
and are woken up so they can dispatch it. By default, no such matches
exist; applications have to explicitly opt in if they are interested in
these events.

In DBus (both kdbus and DBus1), such matches are installed on the
NameOwnerChanged signal, and they can be either specific to a single ID,
or broad, which will make them match on any ID. There's actually no
reason for applications to install unspecific matches, but if they do,
they will of course get what they asked for, and are woken up on every
ID that is added to or removed from the bus. What you're seeing in your
system profile is that some applications misbehave and install
unspecific matches when they shouldn't. That's a userspace bug that
needs fixing. Two candidates were actually in the systemd code base
(logind and PID1), and both are now patched.
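
The effect of a broad match versus a specific one can be sketched with a
toy model. This is not the kdbus or D-Bus API: the `ANY_ID` marker, the
`wakeups` helper, and the client names are all hypothetical. It simply
counts how often each subscriber would be woken by a stream of ID
add/remove events.

```python
from collections import Counter

ANY_ID = None  # hypothetical marker for a broad (unspecific) match


def wakeups(matches, events):
    """Count how often each subscriber is woken.

    matches: dict of subscriber name -> set of watched IDs, or ANY_ID
    events:  iterable of connection IDs that were added or removed
    """
    counts = Counter()
    for event_id in events:
        # Every event is checked against every subscriber's match.
        for subscriber, watched in matches.items():
            if watched is ANY_ID or event_id in watched:
                counts[subscriber] += 1
    return counts


matches = {
    "broad-client": ANY_ID,   # woken on every ID change
    "specific-client": {42},  # only interested in connection 42
}

# 10,000 short-lived connections (IDs 1000..10999), each producing one
# add event and one remove event. None of them is ID 42.
events = [i for i in range(1000, 11000) for _ in ("add", "remove")]

counts = wakeups(matches, events)
print(counts["broad-client"], counts["specific-client"])
```

In this model the broad subscriber is woken for every event and the
specific one never is, although the loop still visits both subscribers
for each event.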

Note that these applications are actually affected on both DBus1 and
kdbus. The reason you didn't see them trip up in your test is that
sd_bus_open() behaves differently in the two worlds. In kdbus, it will
immediately call into the kernel and register a new connection, hence
triggering the behavior described above. On DBus1, however, the HELLO
message will not be transmitted to the daemon until the first message is
sent, so no ID is assigned, and no notifications are sent. When
augmenting the test program a little so it reads its own ID on the bus,
you'll see similar behavior on DBus1 as well, but the bottleneck in this
case is the daemon, which significantly mitigates the load caused by
other tasks.

So, to wrap it up: you've triggered an existing userspace bug. The
userspace components under our control have now been fixed, and we'll
talk to other people to make them aware of the issue too. However, these
issues are not directly related to kdbus, but rather show more impact as
a side-effect now.

You've raised a valid point here. Thanks a lot for providing this test,
much appreciated!


Daniel



Re: kdbus: to merge or not to merge?

2015-08-06 Thread David Herrmann
Hi

On Wed, Aug 5, 2015 at 10:11 PM, Andy Lutomirski l...@amacapital.net wrote:
 On Wed, Aug 5, 2015 at 12:10 AM, David Herrmann dh.herrm...@gmail.com wrote:
 Hi

 On Tue, Aug 4, 2015 at 4:47 PM, Andy Lutomirski l...@amacapital.net wrote:
 On Tue, Aug 4, 2015 at 7:09 AM, David Herrmann dh.herrm...@gmail.com 
 wrote:
 This is a bug in the proxy (which is already fixed).

 Should I expect to see it in Rawhide soon?

 Use this workaround until it does:

 $ DBUS_SYSTEM_BUS_ADDRESS=kernel:path=/sys/fs/kdbus/0-system/bus
 ./your-binary


 Which binary is supposed to be run like that?

Your test.

 Anyway, the broadcasts that I intended to exercise were
 KDBUS_ITEM_ID_REMOVE.  Those appear to be broadcast to everyone,
 irrespective of policy, so long as the match thingy allows it.

 Matches are opt-in, not opt-out. Nobody will get this message unless
 they opt in.


 And what opts in?  Either something's broken, or there's a different
 scalabilty problem, or a whole pile of kdbus-using programs in Fedora
 Rawhide do, in fact, opt in.

See Daniel's explanation. If applications subscribe to all
notifications, they get what they asked for. I recommend filing bug
reports for the applications in question.

> Given that all existing prototype userspace that I'm aware of
> (systemd and its consumers) apparently opts in, I don't really care
> that the feature is opt-in.

This is just plain wrong. Out of the dozens of dbus applications, you
found like 9 which are buggy? Two of them are already fixed, the
maintainers of the other ones notified.
I'd be interested where you got this notion that "all existing
prototype userspace [...] opts in".

> Also, given things like this:
>
> commit d27c8057699d164648b7d8c1559fa6529998f89d
> Author: David Herrmann <dh.herrm...@gmail.com>
> Date:   Tue May 26 09:30:14 2015 +0200
>
>     kdbus: forward ID notifications to everyone
>
> it really does seem to me that the point of these ID notifications is
> for everyone to get them.

It's not. This patch just opens the policy so everyone can see those
notifications. By default, it's not delivered to anyone.

> Also, you haven't addressed the memory usage issues --

..because it doesn't change anything. If your IPC is message based and
async, _someone_ needs to buffer. I don't see the difference between
buffering locally on !EPOLLOUT or buffering in a shmem pool. In both
cases, clients have control over the buffer size. If you disagree,
please _elaborate_.

Thanks
David


Re: kdbus: to merge or not to merge?

2015-08-06 Thread Andy Lutomirski
On Aug 6, 2015 1:04 AM, David Herrmann <dh.herrm...@gmail.com> wrote:
>> Given that all existing prototype userspace that I'm aware of
>> (systemd and its consumers) apparently opts in, I don't really care
>> that the feature is opt-in.
>
> This is just plain wrong. Out of the dozens of dbus applications, you
> found like 9 which are buggy? Two of them are already fixed, the
> maintainers of the other ones notified.
> I'd be interested where you got this notion that "all existing
> prototype userspace [...] opts in".


I would say instead that, out of one in-use kdbus library, I found one
that was buggy.  Maybe gdbus really does use kdbus already, but on
very brief inspection it looked like it didn't at least on my test VM.


>> Also, you haven't addressed the memory usage issues --
>
> ..because it doesn't change anything. If your IPC is message based and
> async, _someone_ needs to buffer. I don't see the difference between
> buffering locally on !EPOLLOUT or buffering in a shmem pool. In both
> cases, clients have control over the buffer size. If you disagree,
> please _elaborate_.

If the client buffers on !EPOLLOUT and has a monster buffer, then
that's the client's problem.

If every single program has a monster buffer, then it's everyone's
problem, and the size of the problem gets multiplied by the number of
programs.

Also, sensible clients that produce bulk data will throttle on
!EPOLLOUT rather than blindly buffering, but that's not an option when
the huge buffer is on the receiver's end.  Read up on bufferbloat.
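
To make the throttling point concrete, here is a minimal C sketch (not
code from this thread). It fills a non-blocking AF_UNIX socket whose peer
never reads and stops at the first EAGAIN, which is the moment a sensible
sender would wait for EPOLLOUT instead of buffering further. The function
name bytes_until_would_block() is made up for illustration, and the amount
queued depends on the kernel's socket buffer settings.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Write to a non-blocking AF_UNIX stream socket whose peer never reads.
 * Returns the number of bytes the kernel queued before reporting EAGAIN,
 * or -1 on an unexpected error. (Hypothetical helper for illustration.) */
long bytes_until_would_block(void)
{
    int sv[2];
    char chunk[4096];
    long queued = 0;
    int flags;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    flags = fcntl(sv[0], F_GETFL, 0);
    if (flags < 0 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    memset(chunk, 'x', sizeof(chunk));
    for (;;) {
        ssize_t n = write(sv[0], chunk, sizeof(chunk));

        if (n > 0) {
            queued += n;            /* kernel accepted more data */
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;               /* interrupted, just retry */
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            break;                  /* buffer full: throttle here */
        queued = -1;                /* anything else is unexpected */
        break;
    }

    close(sv[0]);
    close(sv[1]);
    return queued;
}
```

The returned count is bounded by the socket buffer, so the backlog stays
at a size the sender can observe and react to. That is the contrast drawn
above with a large buffer on the receiver's end.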

--Andy


Re: kdbus: to merge or not to merge?

2015-08-06 Thread Martin Steigerwald
On Thursday, 6 August 2015, 10:04:57, David Herrmann wrote:
>> Given that all existing prototype userspace that I'm aware of
>> (systemd and its consumers) apparently opts in, I don't really care
>> that the feature is opt-in.
>
> This is just plain wrong. Out of the dozens of dbus applications, you
> found like 9 which are buggy? Two of them are already fixed, the
> maintainers of the other ones notified.
> I'd be interested where you got this notion that "all existing
> prototype userspace [...] opts in".

But these few can create the issues Andy described?

Sure, one can argue that I can set up a stress or stress-ng command line
invocation as root that will basically grind a Linux system to a halt, and
in a way I consider this to be a bug in the kernel as well, but one that
has existed for a long time. But a GUI application running as a user?

How about some robustness regarding what you see as bugs in userspace here?

I think "The bug is not mine" is exactly the same language we have seen here
before. If the kernel relies on bug-free userspace applications in order to
do its job properly, I think it has robustness issues. One certainly wouldn't
want this with any mission-critical realtime OS. I think it is the kernel
that should be in control.

Thanks,
-- 
Martin


Re: kdbus: to merge or not to merge?

2015-08-06 Thread Andy Lutomirski
On Thu, Aug 6, 2015 at 12:06 AM, Daniel Mack <dan...@zonque.org> wrote:
> Hi Andy,
>
> On 08/05/2015 02:18 AM, Andy Lutomirski wrote:
>> I added the missing sd_bus_unref call.
>>
>> With userspace dbus, my program takes 95% CPU and dbus-daemon takes
>> 88% CPU or so.
>>
>> With kdbus, I see abuse-bus (my test), systemd-journald,
>> systemd-bus-proxy, auditd, gnome-shell, mission-control, sedispatch,
>> firewalld, polkitd, NetworkManager, systemd, avahi-daemon, audisp,
>> abrt-dump-jour* (whatever it's called -- it truncated), upowerd, and
>> systemd-logind all taking tons of CPU.  I've listed them in decreasing
>> order of amount of CPU burned -- the top several are taking about as
>> much as is possible.  Load average is over 13.  That's if I run it
>> from a text console while I'm logged in to gnome in a different VT.
>
> That's right, I can reproduce this here. To explain what's going on, let
> me provide some background.
>
> Every time a client connects to kdbus, a new ID is assigned to the
> connection, and other connections which have previously subscribed to
> notifications of type KDBUS_ITEM_ID_ADD or _REMOVE get a notification
> and are woken up so they can dispatch it. By default, no such matches
> exist; applications have to explicitly opt in if they are interested in
> these events.
>
> In DBus (both kdbus and DBus1), such matches are installed on the
> NameOwnerChanged signal, and they can be either specific to a single ID,
> or broad, which will make them match on any ID. There's actually no
> reason for applications to install unspecific matches, but if they do,
> they will of course get what they asked for, and are woken up on every
> ID that is added to or removed from the bus. What you're seeing in your
> system profile is that some applications misbehave and install
> unspecific matches when they shouldn't. That's a userspace bug that
> needs fixing. Two candidates were actually in the systemd code base
> (logind and PID1), and both are now patched.

Can you point me at the patch?

It sounds like that will reduce the scalability issue with this
particular test from whatever userspace overhead exists * number of
clients to just the overhead of looping over all clients and their
matches in the kernel.
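
A rough model of the cost described above, as a C sketch under stated
assumptions: each connection holds one match that is either specific to
an ID or broad. The struct, the MATCH_ANY wildcard and count_wakeups()
are hypothetical names and do not correspond to kdbus data structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wildcard meaning "match any ID"; not a kdbus constant. */
#define MATCH_ANY UINT64_MAX

/* One installed match per connection, for simplicity. */
struct match {
    uint64_t id;
};

/* Count how many connections are woken by an ID add/remove event.
 * The loop always visits every match; only the number of wakeups
 * depends on how broad the installed matches are. */
size_t count_wakeups(const struct match *matches, size_t n, uint64_t event_id)
{
    size_t woken = 0;

    for (size_t i = 0; i < n; i++) {
        if (matches[i].id == MATCH_ANY || matches[i].id == event_id)
            woken++;
    }
    return woken;
}
```

If every connection installs a broad match, each connect or disconnect
wakes all of them. With specific matches the loop over the matches
remains, but only the interested connections are woken.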

--Andy


Re: kdbus: to merge or not to merge?

2015-08-05 Thread Andy Lutomirski
On Wed, Aug 5, 2015 at 12:10 AM, David Herrmann  wrote:
> Hi
>
> On Tue, Aug 4, 2015 at 4:47 PM, Andy Lutomirski  wrote:
>> On Tue, Aug 4, 2015 at 7:09 AM, David Herrmann  wrote:
>>> This is a bug in the proxy (which is already fixed).
>>
>> Should I expect to see it in Rawhide soon?
>
> Use this workaround until it does:
>
> $ DBUS_SYSTEM_BUS_ADDRESS="kernel:path=/sys/fs/kdbus/0-system/bus" ./your-binary
>

Which binary is supposed to be run like that?

>> Anyway, the broadcasts that I intended to exercise were
>> KDBUS_ITEM_ID_REMOVE.  Those appear to be broadcast to everyone,
>> irrespective of "policy", so long as the "match" thingy allows it.
>
> Matches are opt-in, not opt-out. Nobody will get this message unless
> they opt in.
>

And what opts in?  Either something's broken, or there's a different
scalability problem, or a whole pile of kdbus-using programs in Fedora
Rawhide do, in fact, opt in.

My interest in instrumenting kdbus and systemd to figure out the exact
mechanism by which my tiny test case causes my system to freeze is
near zero.  I bet I'm actually right about the mechanism, but that's
sort of beside the point.  It freezes, so /something's/ wrong.  The
only real relevance of my suspicion about the failure mode is that I
think it's a design issue that isn't going to be easy to fix.

>
>> So yes, as far as I can tell, kdbus really does track object lifetime
>> by broadcasting every single destruction event to every single
>> receiver (subject to caveats above) and pokes the data into every
>> receiver's tmpfs space.
>
> Broadcast reception is opt-in.

I've pointed out several times that there's a feature in kdbus that
doesn't work well and I get told that the problematic feature is
opt-in.  Given that all existing prototype userspace that I'm aware of
(systemd and its consumers) apparently opts in, I don't really care
that the feature is opt-in.

Also, given things like this:

commit d27c8057699d164648b7d8c1559fa6529998f89d
Author: David Herrmann 
Date:   Tue May 26 09:30:14 2015 +0200

kdbus: forward ID notifications to everyone

it really does seem to me that the point of these ID notifications is
for everyone to get them.

Also, you haven't addressed the memory usage issues -- I don't see how
a full kdbus-using desktop system can be expected to fit into RAM on
anything short of the biggest and beefiest laptops.  I also don't see
how an xdg-app-happy kdbus-using system (with correspondingly many
pools) will fit into RAM on even the biggest laptops.

--Andy


Re: kdbus: to merge or not to merge?

2015-08-05 Thread David Herrmann
Hi

On Tue, Aug 4, 2015 at 4:47 PM, Andy Lutomirski  wrote:
> On Tue, Aug 4, 2015 at 7:09 AM, David Herrmann  wrote:
>> This is a bug in the proxy (which is already fixed).
>
> Should I expect to see it in Rawhide soon?

Use this workaround until it does:

$ DBUS_SYSTEM_BUS_ADDRESS="kernel:path=/sys/fs/kdbus/0-system/bus" ./your-binary

> Anyway, the broadcasts that I intended to exercise were
> KDBUS_ITEM_ID_REMOVE.  Those appear to be broadcast to everyone,
> irrespective of "policy", so long as the "match" thingy allows it.

Matches are opt-in, not opt-out. Nobody will get this message unless
they opt in.

> The bloom filter thing won't help at all according to the docs: bloom
> filters don't apply to kernel-generated notifications.

Bloom filters apply to message payloads. Kernel notifications do not
carry a message payload. Message metadata can be filtered for
explicitly (without false-positives).
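
For readers unfamiliar with the distinction, a minimal Bloom filter
sketch in C follows. It is an illustration only: the 64-bit filter size
and the FNV-1a based hashes are assumptions and say nothing about the
parameters kdbus uses. A Bloom filter can report a false positive but
never a false negative, whereas an exact comparison on a metadata field
such as a connection ID cannot misfire either way.

```c
#include <stdbool.h>
#include <stdint.h>

/* FNV-1a over a C string, perturbed by a seed to get a second hash.
 * Illustrative choice of hash, not what kdbus uses. */
static uint64_t fnv1a(const char *s, uint64_t seed)
{
    uint64_t h = 0xcbf29ce484222325ULL ^ seed;

    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* Bits that an item sets in a 64-bit filter (two hash functions). */
static uint64_t bloom_mask(const char *s)
{
    return (1ULL << (fnv1a(s, 0) % 64)) |
           (1ULL << (fnv1a(s, 0x9e3779b97f4a7c15ULL) % 64));
}

void bloom_add(uint64_t *filter, const char *s)
{
    *filter |= bloom_mask(s);
}

/* True means "possibly present"; false means "definitely absent". */
bool bloom_maybe_contains(uint64_t filter, const char *s)
{
    uint64_t m = bloom_mask(s);

    return (filter & m) == m;
}

/* Exact metadata filter: no false positives and no false negatives. */
bool exact_id_match(uint64_t wanted, uint64_t actual)
{
    return wanted == actual;
}
```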

> So yes, as far as I can tell, kdbus really does track object lifetime
> by broadcasting every single destruction event to every single
> receiver (subject to caveats above) and pokes the data into every
> receiver's tmpfs space.

Broadcast reception is opt-in.

Thanks
David


Re: kdbus: to merge or not to merge?

2015-08-04 Thread Andy Lutomirski
On Tue, Aug 4, 2015 at 7:47 AM, Andy Lutomirski  wrote:
> On Tue, Aug 4, 2015 at 7:09 AM, David Herrmann  wrote:
>> Hi
>>
>> On Tue, Aug 4, 2015 at 3:46 PM, Linus Torvalds
>>  wrote:
>>> On Tue, Aug 4, 2015 at 1:58 AM, David Herrmann wrote:
>>>>
>>>> You lack a call to sd_bus_unref() here.
>>>
>>> I assume it was intentional. Why would Andy talk about "scaling" otherwise?
>
> It was actually an error.  I assumed that, since the user version
> worked fine (at least for as long as I ran it) and the kernel version
> didn't (killed X and left a blinking cursor, no visible log messages
> even when run from a text console, and no obvious OOM recovery after a
> long wait) that it was a kdbus issue or issue with other kdbus
> clients.
>
> I'll play with it more today.
>

I added the missing sd_bus_unref call.

With userspace dbus, my program takes 95% CPU and dbus-daemon takes
88% CPU or so.

With kdbus, I see abuse-bus (my test), systemd-journald,
systemd-bus-proxy, auditd, gnome-shell, mission-control, sedispatch,
firewalld, polkitd, NetworkManager, systemd, avahi-daemon, audisp,
abrt-dump-jour* (whatever it's called -- it truncated), upowerd, and
systemd-logind all taking tons of CPU.  I've listed them in decreasing
order of amount of CPU burned -- the top several are taking about as
much as is possible.  Load average is over 13.  That's if I run it
from a text console while I'm logged in to gnome in a different VT.

If I run the program from a graphical terminal, everything freezes so
hard that the cursor doesn't even make it to the next line when I hit
enter.

So I still claim that kdbus doesn't scale.  I'm not even just saying
that it doesn't scale to large systems -- somewhat to my surprise, it
doesn't even seem to scale well enough for a mostly empty Rawhide
workstation system running just a graphical terminal.  And I didn't
even try to find stress tests more interesting than connecting and
disconnecting in a loop.

FWIW, the old test (without the unref) appeared to be allocating 16M
of mapped kdbus pool every iteration, which seems unlikely to have
helped matters.

--Andy


Re: kdbus: to merge or not to merge?

2015-08-04 Thread Andy Lutomirski
On Tue, Aug 4, 2015 at 7:09 AM, David Herrmann  wrote:
> Hi
>
> On Tue, Aug 4, 2015 at 3:46 PM, Linus Torvalds
>  wrote:
>> On Tue, Aug 4, 2015 at 1:58 AM, David Herrmann  wrote:
>>>
>>> You lack a call to sd_bus_unref() here.
>>
>> I assume it was intentional. Why would Andy talk about "scaling" otherwise?

It was actually an error.  I assumed that, since the user version
worked fine (at least for as long as I ran it) and the kernel version
didn't (killed X and left a blinking cursor, no visible log messages
even when run from a text console, and no obvious OOM recovery after a
long wait) that it was a kdbus issue or issue with other kdbus
clients.

I'll play with it more today.

>>
>> And the worry was why the kdbus version killed the machine, but the
>> userspace version did not. That's a rather big difference, and not a
>> good one.
>
> Neither test 'kills' the machine:
>
>  * The userspace version will be killed by the OOM killer after about
> 20s running (depending how much memory you have).

Not on my system.  Maybe too much memory?

>
>  * The kernel version runs for 1024 iterations (maximum kdbus
> connections per user) and then produces errors.
>
> In fact, the kernel version is even more stable than the user-space
> version, and bails out much earlier. Run it on a VT and everything
> works just fine.


On my system, everything died as described above.

>
> The only issue you get with kdbus is the compat-bus-daemon, which
> assert()s as a side-effect of accept4() failing. In other words, the
> compat bus-daemon gets ENFILE if you open that many connections, then
> assert()s and thus kills all other proxy connections. This has the
> side effect, that Xorg loses access to your graphics device and thus
> your screen 'freezes'. Also networkmanager bails out and stops network
> connections.

Ah, interesting.

>
> This is a bug in the proxy (which is already fixed).

Should I expect to see it in Rawhide soon?

Anyway, the broadcasts that I intended to exercise were
KDBUS_ITEM_ID_REMOVE.  Those appear to be broadcast to everyone,
irrespective of "policy", so long as the "match" thingy allows it.  As
far as I can tell, that's the default behavior (i.e. receivers accept
KDBUS_DST_ID_BROADCAST), but even if it's not default, we'll still
fail to scale as long as the number of receivers accepting
KDBUS_DST_ID_BROADCAST grows as systems become more kdbus-integrated.

The bloom filter thing won't help at all according to the docs: bloom
filters don't apply to kernel-generated notifications.

So yes, as far as I can tell, kdbus really does track object lifetime
by broadcasting every single destruction event to every single
receiver (subject to caveats above) and pokes the data into every
receiver's tmpfs space (via kdbus_bus_broadcast ->
kdbus_conn_entry_insert -> lots of other stuff -> vfs_iter_write).  At
that point, there's well over a gigabyte of tmpfs space that can be
scribbled on (and thus committed and thus needs to be read) by rogue
broadcasters even on Rawhide, and Rawhide seems to have barely started
converting all the kdbus clients from using the proxy to using kdbus
directly.

IIUC, once gdbus switches over to using kdbus directly, with current
buffer sizing, the average laptop will have more kdbus pool tmpfs
space mapped than total RAM.  I still don't see how this will work
well.
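
The sizing concern is simple multiplication. As a sketch using only
figures quoted in this thread (53 mapped pools seen on Rawhide, 16 MiB
per pool, and a limit of 1024 connections per user), with
pool_total_mib() as a hypothetical helper name:

```c
#include <stdint.h>

/* Total pool space in MiB if every connection maps a pool of the
 * given size. (Hypothetical helper, for the arithmetic only.) */
uint64_t pool_total_mib(uint64_t connections, uint64_t pool_mib)
{
    return connections * pool_mib;
}
```

For 53 pools this gives 848 MiB, and for 1024 pools it gives 16384 MiB
(16 GiB). These are upper bounds on mapped pool space, since tmpfs pages
are only committed once they are written. The second figure assumes that
every connection gets a 16 MiB pool, which the thread does not state.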

I guess my test didn't exercise what I meant it to.  I wrote it,
userspace survived (on my system) and kdbus didn't.  Apparently I blew
up the bus proxy, not the pool mechanism.  Next time I'll try to
better characterize exactly what it is I'm doing to my poor VM...

--Andy


Re: kdbus: to merge or not to merge?

2015-08-04 Thread David Herrmann
Hi

On Tue, Aug 4, 2015 at 3:46 PM, Linus Torvalds
 wrote:
> On Tue, Aug 4, 2015 at 1:58 AM, David Herrmann  wrote:
>>
>> You lack a call to sd_bus_unref() here.
>
> I assume it was intentional. Why would Andy talk about "scaling" otherwise?
>
> And the worry was why the kdbus version killed the machine, but the
> userspace version did not. That's a rather big difference, and not a
> good one.

Neither test 'kills' the machine:

 * The userspace version will be killed by the OOM killer after about
20s running (depending how much memory you have).

 * The kernel version runs for 1024 iterations (maximum kdbus
connections per user) and then produces errors.

In fact, the kernel version is even more stable than the user-space
version, and bails out much earlier. Run it on a VT and everything
works just fine.

The only issue you get with kdbus is the compat bus-daemon, which
assert()s as a side effect of accept4() failing. In other words, the
compat bus-daemon gets ENFILE if you open that many connections, then
assert()s and thus kills all other proxy connections. This has the
side effect that Xorg loses access to your graphics device and thus
your screen 'freezes'. NetworkManager also bails out and stops network
connections.

This is a bug in the proxy (which is already fixed).
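
The actual proxy fix is not shown in this thread. As a general
illustration of the failure mode, a server loop can treat
resource-exhaustion errors from accept4() as transient instead of
asserting on them. The sketch below is an assumption about how one might
classify errno values; accept_error_is_transient() is a hypothetical
helper, not a systemd function.

```c
#include <errno.h>
#include <stdbool.h>

/* Decide whether a failed accept4() should be retried later rather
 * than treated as a fatal programming error. (Illustrative only.) */
bool accept_error_is_transient(int err)
{
    switch (err) {
    case EAGAIN:
#if EWOULDBLOCK != EAGAIN
    case EWOULDBLOCK:
#endif
    case EINTR:         /* interrupted by a signal */
    case ECONNABORTED:  /* peer went away before accept */
    case EMFILE:        /* per-process fd limit reached */
    case ENFILE:        /* system-wide fd limit reached */
    case ENOBUFS:
    case ENOMEM:
        return true;
    default:
        return false;
    }
}
```

A caller would log and retry, ideally after backing off, when this
returns true, and only tear down the listener otherwise.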

Thanks
David


Re: kdbus: to merge or not to merge?

2015-08-04 Thread Linus Torvalds
On Tue, Aug 4, 2015 at 1:58 AM, David Herrmann  wrote:
>
> You lack a call to sd_bus_unref() here.

I assume it was intentional. Why would Andy talk about "scaling" otherwise?

And the worry was why the kdbus version killed the machine, but the
userspace version did not. That's a rather big difference, and not a
good one.

Possibly the kdbus version ends up not just allocating user space
memory (which we should handle gracefully), but kernel allocations too
(which absolutely have to be explicitly resource-managed).

   Linus


Re: kdbus: to merge or not to merge?

2015-08-04 Thread David Herrmann
Hi

On Tue, Aug 4, 2015 at 1:02 AM, Andy Lutomirski  wrote:
> I got Fedora
> Rawhide working under kdbus (thanks, everyone!), and I ran this little
> program:
>
> #include <systemd/sd-bus.h>
> #include <err.h>
>
> int main(int argc, char *argv[])
> {
>         while (1) {
>                 sd_bus *bus;
>                 if (sd_bus_open_system(&bus) < 0) {
>                         /* warn("sd_bus_open_system"); */
>                         continue;
>                 }
>                 sd_bus_close(bus);

You lack a call to sd_bus_unref() here. Without it, your loop contains:

while (1)
        malloc(1024);

This simple malloc-loop already hogs your system. If I add the
required call to _unref(), your tool runs smoothly on my machine.

>         }
> }
>
> under both userspace dbus and under kdbus.  Userspace dbus burns some
> CPU -- no big deal.  I expected kdbus to fail to scale and burn a
> disproportionate amount of CPU (because I don't see how it /can/
> scale).  Instead it fell over completely.  I didn't bother debugging
> it, but offhand I'd guess that the system OOMed and didn't come back.

I cannot see the relation to kdbus.

> On very brief inspection, Rawhide seems to have a lot of kdbus
> connections with 16MiB of mapped tmpfs stuff each.  (53 of them
> mapped, and I don't know how many exist with tmpfs backing but aren't
> mapped.)  Presumably the number only goes up as the degree of reliance
> on the userspace proxy goes down.

What does this have to do with the proxy? Why would resource
consumption go *up* as the proxy users decline? Please elaborate.

> I don't know of any deployed
> systems that solve it by broadcasting the lifetime of everything to
> everyone and relying on those broadcasts going through, though.

Luckily, kdbus does not do this.

Thanks
David
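
The leak pointed out above can be modelled without libsystemd. In the
sketch below, struct conn, conn_open(), conn_close() and conn_unref() are
stand-ins invented for illustration; they mimic the rule that closing a
connection does not drop the caller's reference. In the real test the
corresponding fix is to call sd_bus_unref(bus) after sd_bus_close(bus),
as stated above.

```c
#include <stdlib.h>

/* Stand-in for a reference-counted bus connection (illustrative only). */
struct conn {
    int refs;
    int open;
};

static int live_conns; /* objects allocated and not yet freed */

struct conn *conn_open(void)
{
    struct conn *c = malloc(sizeof(*c));

    if (!c)
        return NULL;
    c->refs = 1;
    c->open = 1;
    live_conns++;
    return c;
}

/* Closing ends communication but keeps the object alive. */
void conn_close(struct conn *c)
{
    c->open = 0;
}

/* Drop one reference; free the object when the last one is gone. */
struct conn *conn_unref(struct conn *c)
{
    if (c && --c->refs == 0) {
        free(c);
        live_conns--;
    }
    return NULL;
}

/* Run the open/close loop and report how many objects were leaked.
 * With call_unref == 0 the leak is intentional, to show the effect. */
int leaked_after(int iterations, int call_unref)
{
    int before = live_conns;

    for (int i = 0; i < iterations; i++) {
        struct conn *c = conn_open();

        if (!c)
            continue;
        conn_close(c);
        if (call_unref)
            conn_unref(c);
    }
    return live_conns - before;
}
```

Calling leaked_after(n, 0) leaves n objects allocated, while
leaked_after(n, 1) leaves none, which is the difference between the
original loop and the corrected one.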


Re: kdbus: to merge or not to merge?

2015-08-04 Thread Andy Lutomirski
On Tue, Aug 4, 2015 at 7:09 AM, David Herrmann <dh.herrm...@gmail.com> wrote:
> Hi
>
> On Tue, Aug 4, 2015 at 3:46 PM, Linus Torvalds
> <torva...@linux-foundation.org> wrote:
>> On Tue, Aug 4, 2015 at 1:58 AM, David Herrmann <dh.herrm...@gmail.com> wrote:
>>>
>>> You lack a call to sd_bus_unref() here.
>>
>> I assume it was intentional. Why would Andy talk about "scaling" otherwise?

It was actually an error.  I assumed that, since the user version
worked fine (at least for as long as I ran it) and the kernel version
didn't (killed X and left a blinking cursor, no visible log messages
even when run from a text console, and no obvious OOM recovery after a
long wait) that it was a kdbus issue or issue with other kdbus
clients.

I'll play with it more today.


>> And the worry was why the kdbus version killed the machine, but the
>> userspace version did not. That's a rather big difference, and not a
>> good one.
>
> Neither test 'kills' the machine:
>
>  * The userspace version will be killed by the OOM killer after about
>    20s running (depending how much memory you have).

Not on my system.  Maybe too much memory?


>  * The kernel version runs for 1024 iterations (maximum kdbus
>    connections per user) and then produces errors.
>
> In fact, the kernel version is even more stable than the user-space
> version, and bails out much earlier. Run it on a VT and everything
> works just fine.


On my system, everything died as described above.


> The only issue you get with kdbus is the compat-bus-daemon, which
> assert()s as a side-effect of accept4() failing. In other words, the
> compat bus-daemon gets ENFILE if you open that many connections, then
> assert()s and thus kills all other proxy connections. This has the
> side effect, that Xorg loses access to your graphics device and thus
> your screen 'freezes'. Also networkmanager bails out and stops network
> connections.

Ah, interesting.


> This is a bug in the proxy (which is already fixed).

Should I expect to see it in Rawhide soon?

Anyway, the broadcasts that I intended to exercise were
KDBUS_ITEM_ID_REMOVE.  Those appear to be broadcast to everyone,
irrespective of policy, so long as the match thingy allows it.  As
far as I can tell, that's the default behavior (i.e. receivers accept
KDBUS_DST_ID_BROADCAST), but even if it's not default, we'll still
fail to scale as long as the number of receivers accepting
KDBUS_DST_ID_BROADCAST grows as systems become more kdbus-integrated.

The bloom filter thing won't help at all according to the docs: bloom
filters don't apply to kernel-generated notifications.

So yes, as far as I can tell, kdbus really does track object lifetime
by broadcasting every single destruction event to every single
receiver (subject to caveats above) and pokes the data into every
receiver's tmpfs space (via kdbus_bus_broadcast ->
kdbus_conn_entry_insert -> lots of other stuff -> vfs_iter_write).  At
that point, there's well over a gigabyte of tmpfs space that can be
scribbled on (and thus committed and thus needs to be read) by rogue
broadcasters even on Rawhide, and Rawhide seems to have barely started
converting all the kdbus clients from using the proxy to using kdbus
directly.

IIUC, once gdbus switches over to using kdbus directly, with current
buffer sizing, the average laptop will have more kdbus pool tmpfs
space mapped than total RAM.  I still don't see how this will work
well.

I guess my test didn't exercise what I meant it to.  I wrote it,
userspace survived (on my system) and kdbus didn't.  Apparently I blew
up the bus proxy, not the pool mechanism.  Next time I'll try to
better characterize exactly what it is I'm doing to my poor VM...

--Andy


Re: kdbus: to merge or not to merge?

2015-08-04 Thread Andy Lutomirski
On Tue, Aug 4, 2015 at 7:47 AM, Andy Lutomirski <l...@amacapital.net> wrote:
> On Tue, Aug 4, 2015 at 7:09 AM, David Herrmann <dh.herrm...@gmail.com> wrote:
>> Hi
>>
>> On Tue, Aug 4, 2015 at 3:46 PM, Linus Torvalds
>> <torva...@linux-foundation.org> wrote:
>>> On Tue, Aug 4, 2015 at 1:58 AM, David Herrmann <dh.herrm...@gmail.com> wrote:
>>>>
>>>> You lack a call to sd_bus_unref() here.
>>>
>>> I assume it was intentional. Why would Andy talk about scaling otherwise?
>
> It was actually an error.  I assumed that, since the user version
> worked fine (at least for as long as I ran it) and the kernel version
> didn't (killed X and left a blinking cursor, no visible log messages
> even when run from a text console, and no obvious OOM recovery after a
> long wait) that it was a kdbus issue or issue with other kdbus
> clients.
>
> I'll play with it more today.


I added the missing sd_bus_unref call.

With userspace dbus, my program takes 95% CPU and dbus-daemon takes
88% CPU or so.

With kdbus, I see abuse-bus (my test), systemd-journald,
systemd-bus-proxy, auditd, gnome-shell, mission-control, sedispatch,
firewalld, polkitd, NetworkManager, systemd, avahi-daemon, audisp,
abrt-dump-jour* (whatever it's called -- it truncated), upowerd, and
systemd-logind all taking tons of CPU.  I've listed them in decreasing
order of amount of CPU burned -- the top several are taking about as
much as is possible.  Load average is over 13.  That's if I run it
from a text console while I'm logged in to gnome in a different VT.

If I run the program from a graphical terminal, everything freezes so
hard that the cursor doesn't even make it to the next line when I hit
enter.

So I still claim that kdbus doesn't scale.  I'm not even just saying
that it doesn't scale to large systems -- somewhat to my surprise, it
doesn't even seem to scale well enough for a mostly empty Rawhide
workstation system running just a graphical terminal.  And I didn't
even try to find stress tests more interesting than connecting and
disconnecting in a loop.

FWIW, the old test (without the unref) appeared to be allocating 16M
of mapped kdbus pool every iteration, which seems unlikely to have
helped matters.
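
For a sense of scale, the two figures quoted in this thread can be
multiplied out: roughly 16M of pool mapped per leaked connection, and 1024
as the per-user connection limit cited earlier. The sketch below is only
back-of-the-envelope arithmetic; the helper name mapped_pool_bytes() is
made up for illustration, and mapped address space is not the same thing as
committed memory.

```c
#include <stdint.h>

/*
 * Total bytes of pool mapped when `connections` connections each hold
 * a pool of `pool_mib` MiB (hypothetical helper, simple multiplication).
 */
uint64_t mapped_pool_bytes(uint64_t connections, uint64_t pool_mib)
{
	return connections * pool_mib * 1024u * 1024u;
}
```

With these inputs, mapped_pool_bytes(1024, 16) comes to 16 GiB of mapped
pool by the time the leaking test hits the connection limit.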

--Andy


Re: kdbus: to merge or not to merge?

2015-08-03 Thread Andy Lutomirski
On Mon, Jun 22, 2015 at 11:06 PM, Andy Lutomirski  wrote:
> 2. Kdbus introduces a novel buffering model.  Receivers allocate a big
> chunk of what's essentially tmpfs space.  Assuming that space is
> available (in a virtual memory sense), senders synchronously write to
> the receivers' tmpfs space.  Broadcast senders synchronously write to
> *all* receivers' tmpfs space.  I think that, regardless of
> implementation, this is problematic if the sender and the receiver are
> in different memcgs.  Suppose that the message is to be written to a
> page in the receivers' tmpfs space that is not currently resident.  If
> the write happens in the sender's memcg context, then a receiver can
> effectively allocate an unlimited number of pages in the sender's
> memcg, which will, in practice, be the init memcg if the sender is
> systemd.  This breaks the memcg model.  If, on the other hand, the
> sender writes to the receiver's tmpfs space in the receiver's memcg
> context, then the sender will block (or fail?  presumably
> unpredictable failures are a bad thing) if the receiver's memcg is at
> capacity.

I realize that everyone is sick of this thread.  Nonetheless, I should
emphasize that I'm actually serious about this issue.  I got Fedora
Rawhide working under kdbus (thanks, everyone!), and I ran this little
program:

#include <systemd/sd-bus.h>
#include <err.h>

int main(int argc, char *argv[])
{
        while (1) {
                sd_bus *bus;
                if (sd_bus_open_system(&bus) < 0) {
                        /* warn("sd_bus_open_system"); */
                        continue;
                }
                sd_bus_close(bus);
        }
}

under both userspace dbus and under kdbus.  Userspace dbus burns some
CPU -- no big deal.  I expected kdbus to fail to scale and burn a
disproportionate amount of CPU (because I don't see how it /can/
scale).  Instead it fell over completely.  I didn't bother debugging
it, but offhand I'd guess that the system OOMed and didn't come back.

On very brief inspection, Rawhide seems to have a lot of kdbus
connections with 16MiB of mapped tmpfs stuff each.  (53 of them
mapped, and I don't know how many exist with tmpfs backing but aren't
mapped.  Presumably the number only goes up as the degree of reliance
on the userspace proxy goes down.  As it stands, that's over 3GB of
uncommitted backing store that my test is likely to forcibly commit
very quickly.)

Frankly, I don't understand how it's possible to cleanly implement
kdbus' broadcast or lifetime semantics* in an environment with bounded
CPU or bounded memory.  (And unbounded memory just changes the
problem, since the message backlog can just get worse and worse.)

I work in an industry in which lots of parties broadcast lots of data
to lots of people.  If you try to drink from the firehose and you
can't swallow fast enough, either you need to throw something out (and
test your recovery code!) or you fail.  At least in finance, no one
pretends that a global order of events in different cities is
practical.

* Detecting when your peer goes away is, of course, a widely
encountered and widely solved problem.  I don't know of any deployed
systems that solve it by broadcasting the lifetime of everything to
everyone and relying on those broadcasts going through, though.
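
One of the widely deployed answers alluded to in the footnote is plain
connection-oriented teardown: each side learns that its own peer is gone
because reading from the connection reports end-of-file, with no broadcast
involved. The following is a minimal sketch on a unix domain socketpair;
the helper names peer_has_closed() and peer_close_demo() are made up for
illustration, and this is not how kdbus or dbus-daemon track lifetimes.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Returns 1 if the peer of a connected stream socket has gone away,
 * 0 if the connection is still open, -1 on any other error.
 * MSG_PEEK leaves pending data in place; MSG_DONTWAIT avoids blocking.
 */
int peer_has_closed(int fd)
{
	char byte;
	ssize_t n = recv(fd, &byte, 1, MSG_PEEK | MSG_DONTWAIT);

	if (n == 0)
		return 1;	/* end-of-file: peer closed its end */
	if (n > 0)
		return 0;	/* data pending, connection still usable */
	if (errno == EAGAIN || errno == EWOULDBLOCK)
		return 0;	/* nothing to read, but still connected */
	return -1;		/* e.g. ECONNRESET */
}

/* Returns 1 if closing one end of a socketpair is noticed by the other. */
int peer_close_demo(void)
{
	int sv[2];
	int before, after;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;
	before = peer_has_closed(sv[0]);	/* peer alive, nothing sent */
	close(sv[1]);				/* peer goes away */
	after = peer_has_closed(sv[0]);		/* now reports end-of-file */
	close(sv[0]);
	return before == 0 && after == 1;
}
```

Only the two endpoints of a connection pay for this notification; nothing
is written into any third party's buffer.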

--Andy


Re: kdbus: to merge or not to merge?

2015-07-09 Thread Pavel Machek
On Thu 2015-07-09 10:39:58, Geert Uytterhoeven wrote:
> On Wed, Jul 8, 2015 at 3:54 PM, Pavel Machek  wrote:
> > Apparently, new tools are needed in the community, as normal review
> > comments did not stop drivers/android/binder.c merge.
> >
> > For example binder_transaction does not exactly look like a kernel
> > code, "TODO: fput" does not really invoke confidence, and amount of
> > BUG_ON()s is quite amazing...
> 
> Amazingly, checkpatch (without --strict) only complains about long lines.

Well, checkpatch only tells half of the story.

Anyway, the worst problem is that there's no documentation of the
kernel<->user interface binder provides, making understanding it
hard/impossible.

The closest thing to a documentation pointer is:

 * Based on, but no longer compatible with, the original
 * OpenBinder.org binder driver interface, which is:
  
Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: kdbus: to merge or not to merge?

2015-07-09 Thread Geert Uytterhoeven
On Thu, Jul 9, 2015 at 12:29 PM, Joe Perches  wrote:
> On Thu, 2015-07-09 at 10:39 +0200, Geert Uytterhoeven wrote:
>> On Wed, Jul 8, 2015 at 3:54 PM, Pavel Machek  wrote:
>> > Apparently, new tools are needed in the community, as normal review
>> > comments did not stop drivers/android/binder.c merge.
>> >
>> > For example binder_transaction does not exactly look like a kernel
>> > code, "TODO: fput" does not really invoke confidence, and amount of
>> > BUG_ON()s is quite amazing...
>>
>> Amazingly, checkpatch (without --strict) only complains about long lines.
>>
>> Seems like the test for "BUG" is (and always has been) commented out...
>
> Maybe (requires --strict when scanning files)
> ---
>  scripts/checkpatch.pl | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)

Thanks!

Detected 31 occurrences (+ 1 commented out), shudder...

Tested-by: Geert Uytterhoeven 

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: kdbus: to merge or not to merge?

2015-07-09 Thread Joe Perches
On Thu, 2015-07-09 at 10:39 +0200, Geert Uytterhoeven wrote:
> On Wed, Jul 8, 2015 at 3:54 PM, Pavel Machek  wrote:
> > Apparently, new tools are needed in the community, as normal review
> > comments did not stop drivers/android/binder.c merge.
> >
> > For example binder_transaction does not exactly look like a kernel
> > code, "TODO: fput" does not really invoke confidence, and amount of
> > BUG_ON()s is quite amazing...
> 
> Amazingly, checkpatch (without --strict) only complains about long lines.
> 
> Seems like the test for "BUG" is (and always has been) commented out...

Maybe (requires --strict when scanning files)
---
 scripts/checkpatch.pl | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 90e1edc..11c8186 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -3439,13 +3439,15 @@ sub process {
 			}
 		}
 
-# # no BUG() or BUG_ON()
-# 		if ($line =~ /\b(BUG|BUG_ON)\b/) {
-# 			print "Try to use WARN_ON & Recovery code rather than BUG() or BUG_ON()\n";
-# 			print "$herecurr";
-# 			$clean = 0;
-# 		}
+# avoid BUG() or BUG_ON()
+		if ($line =~ /\b(?:BUG|BUG_ON)\b/) {
+			my $msg_type = \&WARN;
+			$msg_type = \&CHK if ($file);
+			&{$msg_type}("AVOID_BUG",
+				     "Avoid crashing the kernel - Try using WARN_ON & Recovery code rather than BUG() or BUG_ON()\n" . $herecurr);
+		}
 
+# avoid LINUX_VERSION_CODE
 		if ($line =~ /\bLINUX_VERSION_CODE\b/) {
 			WARN("LINUX_VERSION_CODE",
 			     "LINUX_VERSION_CODE should be avoided, code should be for the version to which it is merged\n" . $herecurr);




Re: kdbus: to merge or not to merge?

2015-07-09 Thread Geert Uytterhoeven
On Wed, Jul 8, 2015 at 3:54 PM, Pavel Machek  wrote:
> Apparently, new tools are needed in the community, as normal review
> comments did not stop drivers/android/binder.c merge.
>
> For example binder_transaction does not exactly look like a kernel
> code, "TODO: fput" does not really invoke confidence, and amount of
> BUG_ON()s is quite amazing...

Amazingly, checkpatch (without --strict) only complains about long lines.

Seems like the test for "BUG" is (and always has been) commented out...

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: kdbus: to merge or not to merge?

2015-07-08 Thread Pavel Machek
On Mon 2015-06-22 23:41:40, Greg KH wrote:
> On Mon, Jun 22, 2015 at 11:06:09PM -0700, Andy Lutomirski wrote:
> > Hi Linus,
> > 
> > Can you opine as to whether you think that kdbus should be merged?
> 
> Ah, a preemptive pull request denial, how nice.

> I don't think I've ever seen such a thing before, congratulations for
> creating something so must have previously been lacking in our
> development model in how to work together in a community in a productive
> manner.

Apparently, new tools are needed in the community, as normal review
comments did not stop drivers/android/binder.c merge.

For example binder_transaction does not exactly look like a kernel
code, "TODO: fput" does not really invoke confidence, and amount of
BUG_ON()s is quite amazing...


Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: kdbus: to merge or not to merge?

2015-07-06 Thread Kalle A. Sandstrom
On Wed, Jul 01, 2015 at 06:51:41PM +0200, David Herrmann wrote:
> Hi
> 

Thanks for the answers; in response I've got some further questions. Again,
apologies for length -- I apparently don't know how to discuss IPC tersely.

> On Wed, Jul 1, 2015 at 2:03 AM, Kalle A. Sandstrom  wrote:
> > For the first, compare unix domain sockets (i.e. point-to-point mode, access
> > control through filesystem [or fork() parentage], read/write/select) to the
> > kdbus message-sending ioctl. In the main data-exchanging portion, the former
> > requires only a connection identifier, a pointer to a buffer, and the length
> > of data in that buffer. To contrast, kdbus takes a complex message-sending
> > command structure with 0..n items of m kinds that the ioctl must parse in a
> > m-way switching loop, and then another complex message-describing structure
> > which has its own 1..n items of another m kinds describing its contents,
> > destination-lookup options, negotiation of supported options, and so forth.
> 
> sendmsg(2) uses a very similar payload to kdbus. send(2) is a shortcut
> to simplify the most common use-case. I'd be more than glad to accept
> patches adding such shortcuts to kdbus, if accompanied by benchmark
> numbers and reasoning why this is a common path for dbus/etc. clients.
> 

A shortcut special case for e.g. only iovec-like payload items, only to a
numerically designated peer, and only RPC forms, should be an immediate gain
given that reduced functionality would lower the number of instructions
executed, the number of unpredictable branches met, and the number of
possibly-cold cache lines accessed.
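
The send(2) versus sendmsg(2) relationship mentioned in the quoted reply
gives a feel for what such a shortcut looks like from the caller's side.
The sketch below sends the same bytes through both paths on a unix domain
socket; the wrapper names send_simple() and send_general() are made up for
illustration, and nothing here is kdbus API.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Shortcut path: connection identifier, buffer pointer, length. */
ssize_t send_simple(int fd, const void *buf, size_t len)
{
	return send(fd, buf, len, 0);
}

/* General path: the same payload described via a msghdr and one iovec. */
ssize_t send_general(int fd, const void *buf, size_t len)
{
	struct iovec iov = {
		.iov_base = (void *)buf,	/* sendmsg() does not modify it */
		.iov_len = len,
	};
	struct msghdr msg;

	memset(&msg, 0, sizeof(msg));	/* no address, no control data */
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	return sendmsg(fd, &msg, 0);
}
```

Both calls deliver identical data to the receiver; the general form simply
leaves room for scatter/gather vectors and ancillary data such as passed
file descriptors, which is the part a fast path would skip.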

The difference in raw cycles should be significant in comparison to the
number of kernel exits avoided during a client's RPC to a service and the
associated reply. Assuming that such RPCs are the bulk of what kdbus will
do, and that c/s avoidance is crucial to the performance argument in its
design, it seems silly not to have such a fast-path -- even if it is
initially implemented as a simple wrapper of the full send ioctl.

It would also put the basic send operation on par with sendmsg(2) over a
connected socket in terms of interface complexity, and simplify any future
"exit into peer without scheduler latency" shenanigans.

However, these gains would go unobserved in code written to the current
kdbus ABI. Bridging to such a fast-path from the full interface would
eliminate most of its benefits while hurting its legit callers.

That being said, considering that the eventual de-facto user API to kdbus is
a library with explicit deserialization, endianness conversion, and
suchlike, I could see how the difference would go unobserved.


> The kdbus API is kept generic and extendable, while trying to keep
> runtime overhead minimal. If this overhead turns out to be a
> significant runtime slowdown (which none of my benchmarks showed), we
> should consider adding shortcuts. Until then, I prefer an API that is
> consistent, easy to extend and flexible.
> 

Out of curiosity, what payload item types do you see being added in the near
future, e.g. the next year? UDS knows only of simple buffers, scatter/gather
iovecs, and inter-process dup(2); and recent Linux adds sourcing from a file
descriptor. Perhaps a "pass this previously-received message on" item?


> > Consequently, a carefully optimized implementation of unix domain sockets
> > (and by extension all the data-carrying SysV etc. IPC primitives, optimized
> > similarly) will always be superior to kdbus for both message throughput and
> > latency, [...]
> 
> Yes, that's due to the point-to-point nature of UDS.
> 

Does this change for broadcast, unassociated, or doubly-addressed[0]
operation? For the first, kdbus must already cause allocation of cache lines
in proportion to msg_length * n_recvrs, which mutes the broker's single-copy
advantage as the number of receivers grows. For the second, name lookup from
(say) a hash table only adds to required processing, though the resulting
identifier could be re-used immediately afterward; and the third mode would
prohibit that optimization altogether.
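
The msg_length * n_recvrs growth for broadcasts can be shown with a toy
model: one copy of the payload into each receiver's buffer. This is a
sketch only; broadcast_copy() and its plain in-memory buffers are made up
for illustration and stand in for the per-receiver pools, not for the real
kdbus code path.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy one message into every receiver's buffer and return the total
 * number of bytes written, which is always len * n_recvrs.
 * Each buffer must have room for at least len bytes.
 */
size_t broadcast_copy(const void *msg, size_t len,
		      unsigned char *const *recv_bufs, size_t n_recvrs)
{
	size_t written = 0;

	for (size_t i = 0; i < n_recvrs; i++) {
		memcpy(recv_bufs[i], msg, len);	/* one copy per receiver */
		written += len;
	}
	return written;
}
```

However cheap a single copy is, the sender's work and the number of cache
lines touched still grow linearly with the receiver count.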

Relatedly, is there publicly-available data concerning the distribution of
various dbus IPC modalities? Such as a desktop booting under systemd,
running for a decent bit, and shutting down; or the automotive industry's
(presumably signaling-heavy) use cases which I've heard quoted for a figure
of 600k transactions before steady state.


> > [...] For long messages (> L1 cache size per Stetson-Harrison[0]) the
> > only performance benefit from kdbus is its claimed single-copy mode of
> > operation-- an equivalent to which could be had with ye olde sockets by
> > copying data from the writer directly into the reader while one of them
> > blocks[1] in the appropriate syscall. That the current Linux pipes, SysV
> > queues, unix domain sockets, etc. don't do this doesn't really factor in.
> 
> Parts of the network subsystem have supported single-copy (mmap'ed 

Re: kdbus: to merge or not to merge?

2015-07-01 Thread David Herrmann
Hi

On Wed, Jul 1, 2015 at 2:03 AM, Kalle A. Sandstrom wrote:
> For the first, compare unix domain sockets (i.e. point-to-point mode, access
> control through filesystem [or fork() parentage], read/write/select) to the
> kdbus message-sending ioctl. In the main data-exchanging portion, the former
> requires only a connection identifier, a pointer to a buffer, and the length
> of data in that buffer. To contrast, kdbus takes a complex message-sending
> command structure with 0..n items of m kinds that the ioctl must parse in a
> m-way switching loop, and then another complex message-describing structure
> which has its own 1..n items of another m kinds describing its contents,
> destination-lookup options, negotiation of supported options, and so forth.

sendmsg(2) uses a very similar payload to kdbus. send(2) is a shortcut
to simplify the most common use-case. I'd be more than glad to accept
patches adding such shortcuts to kdbus, if accompanied by benchmark
numbers and reasoning why this is a common path for dbus/etc. clients.
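
For readers comparing the two interfaces, here is a minimal sketch of the sendmsg(2)-style payload on a unix domain socket: a scatter/gather list of buffers plus one ancillary item carrying a file descriptor. It uses Python's socket module as a stand-in for the C API. The helper names send_with_fd, recv_with_fd and demo are illustrative assumptions, and nothing here is kdbus code.

```python
import array
import os
import socket


def send_with_fd(sock, parts, fd):
    """Send several buffers (scatter/gather) plus one descriptor in one call."""
    ancillary = [(socket.SOL_SOCKET, socket.SCM_RIGHTS,
                  array.array("i", [fd]).tobytes())]
    return sock.sendmsg(parts, ancillary)


def recv_with_fd(sock, maxlen=4096):
    """Receive one message and any descriptor passed alongside it."""
    fds = array.array("i")
    data, ancdata, _flags, _addr = sock.recvmsg(
        maxlen, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, cdata in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Ignore any trailing partial integer in the control data.
            usable = len(cdata) - (len(cdata) % fds.itemsize)
            fds.frombytes(cdata[:usable])
    return data, list(fds)


def demo():
    """Pass the write end of a pipe across a socket pair and use it."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    read_end, write_end = os.pipe()
    received_fd = None
    try:
        send_with_fd(left, [b"hdr:", b"body"], write_end)
        data, fds = recv_with_fd(right)
        received_fd = fds[0]
        os.write(received_fd, b"x")      # the duplicate refers to the same pipe
        echoed = os.read(read_end, 1)
        return data, echoed
    finally:
        if received_fd is not None:
            os.close(received_fd)
        os.close(read_end)
        os.close(write_end)
        left.close()
        right.close()
```

The plain send(2) shortcut corresponds to calling sock.send(buffer) with no ancillary list at all, which is the case the quoted paragraph describes as needing only a connection, a buffer, and a length.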

The kdbus API is kept generic and extendable, while trying to keep
runtime overhead minimal. If this overhead turns out to be a
significant runtime slowdown (which none of my benchmarks showed), we
should consider adding shortcuts. Until then, I prefer an API that is
consistent, easy to extend and flexible.

> Consequently, a carefully optimized implementation of unix domain sockets (and
> by extension all the data-carrying SysV etc. IPC primitives, optimized
> similarly) will always be superior to kdbus for both message throughput and
> latency, [...]

Yes, that's due to the point-to-point nature of UDS.

> [...] For long messages (> L1 cache size per Stetson-Harrison[0]) the
> only performance benefit from kdbus is its claimed single-copy mode of
> operation-- an equivalent to which could be had with ye olde sockets by
> copying data from the writer directly into the reader while one of them
> blocks[1] in the appropriate syscall. That the current Linux pipes, SysV
> queues, unix domain sockets, etc. don't do this doesn't really factor in.

Parts of the network subsystem have supported single-copy (mmap'ed IO)
for quite some time. kdbus mandates it, but otherwise is not special
in that regard.

> A consequence of this buffering is that whenever a client sends a message with
> kdbus, it must be prepared to handle an out-of-space non-delivery status.
> [...] There's no option to e.g. overwrite a previous
> message, or to discard queued messages in an oldest-first order, instead of
> rebuffing the sender.

Correct.
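
To make the three buffering policies in the quoted paragraph concrete, the toy model below contrasts rebuffing the sender with overwriting the previous message and with discarding oldest-first. It is a sketch only: the BoundedQueue class, its policy names, and the dropped counter are assumptions made for illustration and do not reflect kdbus data structures. Per the exchange above, only the "reject" behaviour corresponds to what kdbus offers.

```python
from collections import deque


class BoundedQueue:
    """Toy receive queue with a fixed capacity and a selectable overflow policy."""

    POLICIES = ("reject", "drop_oldest", "overwrite_last")

    def __init__(self, capacity, policy="reject"):
        if policy not in self.POLICIES:
            raise ValueError(f"unknown policy: {policy}")
        self.capacity = capacity
        self.policy = policy
        self.items = deque()
        self.dropped = 0  # messages lost to overflow so far

    def send(self, msg):
        """Return True if msg was queued, False if the sender was rebuffed."""
        if len(self.items) < self.capacity:
            self.items.append(msg)
            return True
        if self.policy == "reject":
            return False                 # sender must handle non-delivery
        if self.policy == "drop_oldest":
            self.items.popleft()         # discard in oldest-first order
        else:                            # "overwrite_last"
            self.items.pop()             # replace the most recent message
        self.items.append(msg)
        self.dropped += 1
        return True
```

With a capacity of 2 and three sends, "reject" leaves the first two messages queued and reports failure to the sender, whereas the other two policies always accept the send and count one dropped message.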

> For broadcast messaging, a recipient may observe that messages were dropped by
> looking at a `dropped_msgs' field delivered (and then reset) as part of the
> message reception ioctl. Its value is the number of messages dropped since last
> read, so arguably a client could achieve the equivalent of the condition's
> absence by resynchronizing explicitly with all signal-senders on its current
> bus wrt which it knows the protocol, when the value is >0. This method could in
> principle apply to 1-to-1 unidirectional messaging as well[2].

Correct.
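
The resynchronization scheme described in the quoted paragraph can be sketched as a receive loop that discards its local view whenever the drop counter is non-zero. This is a hedged illustration, not kdbus client code: the drain function, the (dropped_msgs, signal) pair format, and the resync callback are hypothetical, and the ordering between the snapshot and signals still in flight is glossed over.

```python
def drain(receptions, state, apply_signal, resync):
    """Fold received signals into local state, resynchronizing after drops.

    receptions   -- iterable of (dropped_msgs, signal) pairs, as a
                    hypothetical receive call might report them
    state        -- the client's current mirror of the remote state
    apply_signal -- function (state, signal) -> new state
    resync       -- function () -> authoritative state fetched from senders
    """
    for dropped_msgs, signal in receptions:
        if dropped_msgs > 0:
            # Signals were lost since the last read, so the mirror is stale.
            state = resync()
        state = apply_signal(state, signal)
    return state


def apply_membership(state, signal):
    """Example signal handler: signals add or remove a name from a set."""
    op, name = signal
    return state | {name} if op == "add" else state - {name}
```

For example, if the client saw "add a", then missed two signals, then saw "add d" with a drop count of 2, the loop replaces its mirror with the snapshot returned by resync before applying "add d".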

> Looking at the kdbus "send message, wait for tagged reply" feature in
> conjunction with these details appears to reveal two holes in its state graph.
> The first is that if replies are delivered through the requestor's buffer,
> concurrent sends into that same buffer may cause it to become full (or the
> queue to grow too long, w/e) before the service gets a chance to reply. If this
> condition causes a reply to fall out of the IPC flow, the requestor will hang
> until either its specified timeout happens or it gets interrupted by a signal.

If sending a reply fails, the kdbus_reply state is destructed and the
caller must be woken up. We do that for sync-calls just fine, but the
async case does indeed lack a wake-up in the error path. I noted this
down and will fix it.

> If replies are delivered outside the shm pool, the requestor must be prepared
> to pick them up using a different means from the "in your pool w/ offset X,
> length Y" format the main-line kdbus interface provides. [...]

Replies are never delivered outside the shm pool.

> The second problem is that given how there can be a timeout or interrupt on the
> receive side of a "method call" transaction, it's possible for the requestor to
> bow out of the IPC flow _while the service is processing its request_. This
> results either in the reply message being lost, or its ending up in the
> requestor's buffer to appear in a loop where it may not be expected. Either

(for completeness: we properly support resuming interrupted sync-calls)

> way, the client must at that point resynchronize wrt all objects related to the
> request's side effects, or abandon the IPC flow entirely and start over.
> (services need only confirm their replies before effecting e.g. a chardev-like
> "destructively read N bytes from buffer" operation's outcome, which is slightly
> less ugly.)

Correct. If you

Re: kdbus: to merge or not to merge?

2015-06-30 Thread Kalle A. Sandstrom

[delurk; apparently kdbus is not receiving the architectural review it should.
i've got quite a bit of knowledge on message-passing mechanisms in general, and
kernel IPC in particular, so i'll weigh in uninvited. apologies for length.

as my "proper" review on this topic is still under construction, i'll try (and
fail) to be brief here. i started down that road only to realize that kdbus is
quite the ball of mud even if the only thing under the scope is its interface,
and that if i held off until properly ready i'd risk kdbus having already been
merged, making review moot.]


Ingo Molnar wrote:

> - I've been closely monitoring Linux kernel changes for over 20 years, and for
>   the last 10 years the linux/ipc/* code has been dormant: it works and was kept
>   good for existing usecases, but no-one was maintaining and enhancing it with
>   the future in mind.

It's my understanding that linux/ipc/* contains only SysV IPC, i.e. shm, sem,
SysV message queues, and POSIX message queues. There are other IPC-implementing
things in the kernel also, such as unix domain sockets, pipes, shared memory
via mmap(), signals, mappings that appear shared across fork(), and whatever
else provides either kernel-mediated multi-client buffer access or some
combination of shared memory and synchronization that lets userspace exchange
hot data across the address space boundary.

It's also my understanding that no-one in their right mind would call SysV IPC
state-of-the-art even at the level of interface; indeed its presence in the
hoariest of vendor unixes suggests it's not supposed to be even close.

However, the suggested replacement in kdbus replicates the worst[-1] of all
known user-to-user IPC mechanisms, i.e. Mach. I'm not suggesting that Linux
adopt e.g. a different microkernel IPC mechanism-- those are by and large
inapplicable to a monolithic kernel for reasons of ABI (and, well, why would
you do IPC when function calls are zomgfast already?)-- but rather, that the
existing ones either are good enough at this time or can be reworked to become
near-equivalent to the state of the art in terms of performance.


>   So there exists a technical vacuum: the kernel does not have any good, modern
>   IPC ABI at the moment that distros can rely on as a 'golden standard'. This is
>   partly technical, partly political. The technical reason is that SysV IPC is
>   ancient and cumbersome. The political reason is that SystemD could be using
>   and extending Android's existing kernel accelerated IPC subsystem (Binder)
>   that is already upstream - but does not.

I'll contend that the reason for this vacuum is that the existing kernel IPC
interfaces are fine to the point that other mechanisms may be derived from
them solely in user-space without significant performance demerit, and without
pushing ca. 10k SLOC of IPC broker and policy engine into kernel space.

Furthermore, it's my well-ruminated opinion that implementations of the
userspace ABI specified in the kdbus 4.1-rc1 version (of April this year) will
always be necessarily slower than existing IPC primitives in terms of both
throughput and latency; and that the latter are directly applicable to
constructing a more convenient user-space IPC broker that implements what
kdbus seeks to provide: naming, broadcast, unidirectional signaling,
bidirectional "method calls", and a policy mechanism.

In addition I'll argue that as currently specified, the kdbus interface-- even
if tuned to its utmost-- is not only necessarily inferior to e.g. a well-tuned
version of unix domain sockets, but also fundamentally flawed in ways that
prohibit construction of robust in-system distributed programs by kdbus'
mechanisms alone (i.e. byzantine call-site workarounds notwithstanding).


For the first, compare unix domain sockets (i.e. point-to-point mode, access
control through filesystem [or fork() parentage], read/write/select) to the
kdbus message-sending ioctl. In the main data-exchanging portion, the former
requires only a connection identifier, a pointer to a buffer, and the length
of data in that buffer. To contrast, kdbus takes a complex message-sending
command structure with 0..n items of m kinds that the ioctl must parse in a
m-way switching loop, and then another complex message-describing structure
which has its own 1..n items of another m kinds describing its contents,
destination-lookup options, negotiation of supported options, and so forth.

Consequently, a carefully optimized implementation of unix domain sockets (and
by extension all the data-carrying SysV etc. IPC primitives, optimized
similarly) will always be superior to kdbus for both message throughput and
latency, for the reason of kdbus' comparatively great interface complexity
alone.
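
As an illustration of how small the data-exchanging portion of the unix domain socket interface is, the sketch below performs one request/reply round trip over a connected socket pair. Each call names only the connection and a buffer (the length is implicit in the Python bytes object). The function name uds_round_trip and the upper-casing "service" are made up for the example, and Python stands in for the C calls.

```python
import socket


def uds_round_trip(payload: bytes) -> bytes:
    """One request and one reply over a connected unix domain socket pair."""
    # Datagram mode preserves message boundaries; both ends are plain descriptors.
    client, service = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        client.send(payload)              # connection + buffer (+ length)
        request = service.recv(4096)      # service side picks up the request
        service.send(request.upper())     # stand-in for real request handling
        return client.recv(4096)          # client picks up the reply
    finally:
        client.close()
        service.close()
```

No item lists, destination-lookup options, or option negotiation appear in any of the four calls, which is the contrast the paragraph above draws against the kdbus send ioctl.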

There's an obvious caveat here, i.e. "well where is it, then?". Given the
overhead dictated by its interface, kdbus' performance is already inferior for
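short messages. For long messages (> L1 cache size per Stetson-Harrison[0]) the
only performance benefit from kdbus is its claimed single-copy mode of
operation-- an equivalent to which could be had with ye olde sockets by copying
data from the writer directly into the reader while one of them blocks[1] in
the appropriate syscall. That the current Linux pipes, SysV queues, unix domain
sockets, etc. don't do this doesn't really factor in.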

Re: kdbus: to merge or not to merge?

2015-06-25 Thread Steven Rostedt
On Thu, Jun 25, 2015 at 09:57:45AM +0200, Geert Uytterhoeven wrote:
> >
> > in-kernel webserver
> 
> Which was cool, and small, and _faster_ than anything else...
> Until it was integrated, and people working on (userspace) webservers
> started considering its performance as a target, and soon it was
> out-performed by userspace webservers...
> 
> So it did teach us a lesson...
> 
> (Perhaps the above paragraph is actually good advocacy for integrating
>  kdbus, and for seeding a better userspace implementation? ;-)
> 

Except back then, the userspace web servers were created by the competition
and there was a strong incentive to beat tux.

But today, kdbus is written by the same folks that write dbus, and there's no
other competition. There's no incentive to fix dbus once kdbus is merged, and
in fact, it gives incentive to just drop it completely.

-- Steve

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: kdbus: to merge or not to merge?

2015-06-25 Thread Martin Steigerwald
Am Donnerstag, 25. Juni 2015, 09:34:56 schrieb Theodore Ts'o:
> On Thu, Jun 25, 2015 at 08:05:58AM +0200, Martin Steigerwald wrote:
> > Or, do you think that there is a different option to handle this than
> > the two I outlined above?
> 
> Hmm... distros could have their engineers **fix** the busted userspace
> code, instead of fixing the problem by jamming a different
> implementation into the kernel?

Hmm, I read on the Devuan mailing list that Qt engineers are working on doing
dbus directly inside Qt instead of using the existing libdbus. I have not
verified this claim yet. But considering what I read here about performance
issues with libdbus, I think it would make quite some sense.

Also, I wonder who among the desktop environment people, besides xdg-app, will
use the sdbus stuff from systemd / libsystemd. I sure hope sdbus will work
without systemd running as PID 1, but I am not clear on this either. I doubt
that Qt will depend on it, since Qt is available for more than the Linux
platform.

And if GNOME wants to be portable to the BSD variants at least, they can't
depend on it either.

So who will use the non-portable sdbus anyway, except specialized apps?

In case I missed this in the discussion so far, sorry, but from what I read 
from the various threads I am really not clear on this.

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


Re: kdbus: to merge or not to merge?

2015-06-25 Thread Theodore Ts'o
On Thu, Jun 25, 2015 at 08:05:58AM +0200, Martin Steigerwald wrote:
> 
> Or, do you think that there is a different option to handle this than the
> two I outlined above?

Hmm... distros could have their engineers **fix** the busted userspace
code, instead of fixing the problem by jamming a different
implementation into the kernel?

- Ted


Re: kdbus: to merge or not to merge?

2015-06-25 Thread Geert Uytterhoeven
On Wed, Jun 24, 2015 at 9:12 PM, David Lang wrote:
> On Wed, 24 Jun 2015, Martin Steigerwald wrote:
>> Am Mittwoch, 24. Juni 2015, 10:39:52 schrieb David Lang:
>>> On Wed, 24 Jun 2015, Ingo Molnar wrote:
 And the thing is, in hindsight, after such huge flamewars, years down
 the line, almost never do I see the following question asked: 'what
 were we thinking merging that crap??'. If any question arises it's
 usually along the lines of: 'what was the big fuss about?'. So I think
 by and large the process works.
>>>
>>> counterexamples, devfs, tux
>>
>> What was tux?
>
> in-kernel webserver

Which was cool, and small, and _faster_ than anything else...
Until it was integrated, and people working on (userspace) webservers
started considering its performance as a target, and soon it was
out-performed by userspace webservers...

So it did teach us a lesson...

(Perhaps the above paragraph is actually good advocacy for integrating
 kdbus, and for seeding a better userspace implementation? ;-)

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: kdbus: to merge or not to merge?

2015-06-25 Thread Ingo Molnar

* Ingo Molnar wrote:

> 
> * David Lang wrote:
> 
> > On Wed, 24 Jun 2015, Ingo Molnar wrote:
> > 
> > > And the thing is, in hindsight, after such huge flamewars, years down the
> > > line, almost never do I see the following question asked: 'what were we
> > > thinking merging that crap??'. If any question arises it's usually along
> > > the lines of: 'what was the big fuss about?'. So I think by and large the
> > > process works.
> > 
> > counterexamples, devfs, tux
> 
> Actually, we never merged the Tux web server upstream, and the devfs concept
> has kind of made a comeback via devtmpfs.

Bits of devfs also live on in sysfs. So devfs wasn't a bad initial idea IMHO,
but we had to do one more (incompatible ...) iteration to figure out why we
didn't like it.

Furthermore, I'm pretty sure there's a snowball's chance in hell that we'd have
ended up with the current pretty cleaned up hardware/system ABI _without_
devfs. So it was a necessary pain.

Thanks,

Ingo


Re: kdbus: to merge or not to merge?

2015-06-25 Thread Ingo Molnar

* David Lang  wrote:

> On Wed, 24 Jun 2015, Ingo Molnar wrote:
> 
> > And the thing is, in hindsight, after such huge flamewars, years down the
> > line, almost never do I see the following question asked: 'what were we
> > thinking merging that crap??'. If any question arises it's usually along
> > the lines of: 'what was the big fuss about?'. So I think by and large the
> > process works.
> 
> counterexamples, devfs, tux

Actually, we never merged the Tux web server upstream, and the devfs concept
has kind of made a comeback via devtmpfs.

And there are examples of bits we _should_ have merged:

 - GGI (General Graphics Interface)

 - [ and we should probably also have merged kgdb a decade earlier to avoid
   wasting all that energy on flaming about it unnecessarily ;-) ]

And the thing is, I specifically talked about 'near zero cost' kernel patches
that don't appreciably impact the 'core kernel'.

There are plenty of examples of features with non-trivial 'core kernel' costs
that weren't merged, and rightfully so IMHO:

 - the STREAMS ABI
 - various forms of a generic kABI that were proposed
 - moving the kernel to C++ :-)

... and devfs arguably belongs in that category as well.

Thanks,

Ingo


Re: kdbus: to merge or not to merge?

2015-06-25 Thread David Lang

On Wed, 24 Jun 2015, Greg KH wrote:

> On Wed, Jun 24, 2015 at 10:39:52AM -0700, David Lang wrote:
> > On Wed, 24 Jun 2015, Ingo Molnar wrote:
> >
> > > And the thing is, in hindsight, after such huge flamewars, years down the
> > > line, almost never do I see the following question asked: 'what were we
> > > thinking merging that crap??'. If any question arises it's usually along
> > > the lines of: 'what was the big fuss about?'. So I think by and large the
> > > process works.
> >
> > counterexamples, devfs, tux
>
> Don't knock devfs.  It created a lot of things that we take for granted
> now with our development model.  Off the top of my head, here's a short
> list:
> - it showed that we can't arbitrarily make user/kernel api
>   changes without working with people outside of the kernel
>   developer community, and expect people to follow them
> - the idea was sound, but the implementation was not, it had
>   unfixable problems, so to fix those problems, we came up with
>   better, kernel-wide solutions, forcing us to unify all
>   device/driver subsystems.
> - we were forced to try to document our user/kernel apis better,
>   hence Documentation/ABI/ was created
> - to remove devfs, we had to create a structure of _how_ to
>   remove features.  It took me 2-3 years to be able to finally
>   delete the devfs code, as the infrastructure and feedback
>   loops were just not in place before then to allow that to
>   happen.
>
> So I would strongly argue that merging devfs was a good thing, it
> spurred a lot of us to get the job done correctly.  Without it, we would
> have never seen the need, or had the knowledge of what needed to be
> done.


I don't disagree with you, but it was definitely a case of adding something
that was later regretted and removed. A lot was learned in the process, but
that wasn't the issue I was referring to.


I don't want kdbus to end up the same way. The more I think back to those 
discussions, the more parallels I see between the two.


David Lang


Re: kdbus: to merge or not to merge?

2015-06-25 Thread Greg KH
On Wed, Jun 24, 2015 at 10:39:52AM -0700, David Lang wrote:
> On Wed, 24 Jun 2015, Ingo Molnar wrote:
> 
> >And the thing is, in hindsight, after such huge flamewars, years down the
> >line, almost never do I see the following question asked: 'what were we
> >thinking merging that crap??'. If any question arises it's usually along
> >the lines of: 'what was the big fuss about?'. So I think by and large the
> >process works.
> 
> counterexamples, devfs, tux

Don't knock devfs.  It created a lot of things that we take for granted
now with our development model.  Off the top of my head, here's a short
list:
- it showed that we can't arbitrarily make user/kernel api
  changes without working with people outside of the kernel
  developer community, and expect people to follow them
- the idea was sound, but the implementation was not, it had
  unfixable problems, so to fix those problems, we came up with
  better, kernel-wide solutions, forcing us to unify all
  device/driver subsystems.
- we were forced to try to document our user/kernel apis better,
  hence Documentation/ABI/ was created
- to remove devfs, we had to create a structure of _how_ to
  remove features.  It took me 2-3 years to be able to finally
  delete the devfs code, as the infrastructure and feedback
  loops were just not in place before then to allow that to
  happen.

So I would strongly argue that merging devfs was a good thing, it
spurred a lot of us to get the job done correctly.  Without it, we would
have never seen the need, or had the knowledge of what needed to be
done.

thanks,

greg k-h


Re: kdbus: to merge or not to merge?

2015-06-25 Thread Martin Steigerwald
On Thursday, 25 June 2015, 08:01:35, Martin Steigerwald wrote:
> On Wednesday, 24 June 2015, 19:20:27, Linus Torvalds wrote:
> > On Wed, Jun 24, 2015 at 7:14 PM, Steven Rostedt wrote:
> > > I don't think it will complicate things even if the API changes. The
> > > distros will have to deal with that fall out. Mainline only cares
> > > about its own regressions. But any API changes would only be done for
> > > good reasons, and give the distros an excuse to fix whatever was done
> > > wrong in the first place.
> > 
> > I don't think that's true.
> > 
> > Realistically, every single kernel developer tends to work on a
> > machine with some random distro. If that developer cannot compile his
> > own kernel because his distro stops working, or has to use some
> > "kdbus=0" switch to turn off the kernel kdbus and (hopefully) the
> > distro just switches to the legacy user mode bus, then for that
> > developer, merging and enabling incompatible kdbus implementation is
> > basically a regression.
> > 
> > We've seen this before. We end up stuck with the ABI of whatever user
> > land applications. It doesn't matter where that ABI came from.
> > 
> > I do agree that distros that want to enable kdbus before any agreed
> > version has been merged would get to also act as guinea pigs and do
> > their own QA, and handle fallout from whatever problems they encounter
> > etc. That part might be good. But I don't think we really end up
> > having the option to make up some incompatible kdbus ABI
> > after-the-fact.
> 
> Linus, so is that a recommendation to the distros to be careful about
> putting kdbus into the distro kernel right now, and probably better to
> defer it? Or do you think that the ABI of kdbus is already suitable for
> merging, and you see no issue with merging a kdbus with the ABI it
> currently has, though probably otherwise improved?

Or do you think there is a different option to handle this than the two I
outlined above?

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


Re: kdbus: to merge or not to merge?

2015-06-25 Thread Martin Steigerwald
On Wednesday, 24 June 2015, 19:20:27, Linus Torvalds wrote:
> On Wed, Jun 24, 2015 at 7:14 PM, Steven Rostedt wrote:
> > I don't think it will complicate things even if the API changes. The
> > distros will have to deal with that fall out. Mainline only cares about
> > its own regressions. But any API changes would only be done for good
> > reasons, and give the distros an excuse to fix whatever was done wrong
> > in the first place.
> I don't think that's true.
> 
> Realistically, every single kernel developer tends to work on a
> machine with some random distro. If that developer cannot compile his
> own kernel because his distro stops working, or has to use some
> "kdbus=0" switch to turn off the kernel kdbus and (hopefully) the
> distro just switches to the legacy user mode bus, then for that
> developer, merging and enabling incompatible kdbus implementation is
> basically a regression.
> 
> We've seen this before. We end up stuck with the ABI of whatever user
> land applications. It doesn't matter where that ABI came from.
> 
> I do agree that distros that want to enable kdbus before any agreed
> version has been merged would get to also act as guinea pigs and do
> their own QA, and handle fallout from whatever problems they encounter
> etc. That part might be good. But I don't think we really end up
> having the option to make up some incompatible kdbus ABI
> after-the-fact.

Linus, so is that a recommendation to the distros to be careful about
putting kdbus into the distro kernel right now, and probably better to
defer it? Or do you think that the ABI of kdbus is already suitable for
merging, and you see no issue with merging a kdbus with the ABI it
currently has, though probably otherwise improved?

Thanks,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


Re: kdbus: to merge or not to merge?

2015-06-25 Thread Theodore Ts'o
On Thu, Jun 25, 2015 at 08:05:58AM +0200, Martin Steigerwald wrote:
>
> Or do you think there is a different option to handle this than the two I
> outlined above?

Hmm... distros could have their engineers **fix** the busted userspace
code, instead of fixing the problem by jamming a different
implementation into the kernel?

- Ted


Re: kdbus: to merge or not to merge?

2015-06-25 Thread Steven Rostedt
On Thu, Jun 25, 2015 at 09:57:45AM +0200, Geert Uytterhoeven wrote:
>
> > in-kernel webserver
>
> Which was cool, and small, and _faster_ than anything else...
> Until it was integrated, and people working on (userspace) webservers
> started considering its performance as a target, and soon it was
> out-performed by userspace webservers...
>
> So it did teach us a lesson...
>
> (Perhaps the above paragraph is actually good advocacy for integrating
>  kdbus, and for seeding a better userspace implementation? ;-)
>

Except back then, the userspace web servers were created by the competition
and there was a strong incentive to beat tux.

But today, kdbus is written by the same folks that write dbus, and there's no
other competition. There's no incentive to fix dbus once kdbus is merged, and
in fact, it gives incentive to just drop it completely.

-- Steve



Re: kdbus: to merge or not to merge?

2015-06-25 Thread Martin Steigerwald
On Thursday, 25 June 2015, 09:34:56, Theodore Ts'o wrote:
> On Thu, Jun 25, 2015 at 08:05:58AM +0200, Martin Steigerwald wrote:
> > Or do you think there is a different option to handle this than the
> > two I outlined above?
>
> Hmm... distros could have their engineers **fix** the busted userspace
> code, instead of fixing the problem by jamming a different
> implementation into the kernel?

Hmm, I read on the Devuan mailing list that Qt engineers are working on
implementing dbus directly inside Qt instead of using the existing libdbus.
I have not verified this claim yet. But considering what I read here about
performance issues with libdbus, I think it would make quite some sense.

Also, I wonder who among the desktop environment people, besides xdg-app,
will use the sdbus stuff from systemd / libsystemd. I sure hope sdbus will
work without systemd running as PID 1, but I am not clear on this either. I
doubt that Qt will depend on it, since Qt is available for more than the
Linux platform.

And if GNOME wants to be portable to the BSD variants at least, they can't
depend on it either.

So who will use non-portable sdbus anyway, except specialized apps?

In case I missed this in the discussion so far, sorry, but from what I read
in the various threads I am really not clear on this.

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


Re: kdbus: to merge or not to merge?

2015-06-24 Thread Linus Torvalds
On Wed, Jun 24, 2015 at 7:14 PM, Steven Rostedt  wrote:
>
> I don't think it will complicate things even if the API changes. The distros
> will have to deal with that fall out. Mainline only cares about its own
> regressions. But any API changes would only be done for good reasons, and give
> the distros an excuse to fix whatever was done wrong in the first place.

I don't think that's true.

Realistically, every single kernel developer tends to work on a
machine with some random distro. If that developer cannot compile his
own kernel because his distro stops working, or has to use some
"kdbus=0" switch to turn off the kernel kdbus and (hopefully) the
distro just switches to the legacy user mode bus, then for that
developer, merging and enabling incompatible kdbus implementation is
basically a regression.

We've seen this before. We end up stuck with the ABI of whatever user
land applications. It doesn't matter where that ABI came from.

I do agree that distros that want to enable kdbus before any agreed
version has been merged would get to also act as guinea pigs and do
their own QA, and handle fallout from whatever problems they encounter
etc. That part might be good. But I don't think we really end up
having the option to make up some incompatible kdbus ABI
after-the-fact.

  Linus


Re: kdbus: to merge or not to merge?

2015-06-24 Thread Steven Rostedt
On Tue, Jun 23, 2015 at 08:07:41AM -0700, Andy Lutomirski wrote:
> 
> FWIW, once there are real distros with kdbus userspace enabled,
> reviewing kdbus gets more complicated -- we'll be in the position
> where merging kdbus in a different form from that which was proposed
> will break existing users.

I think distros having it in their kernel before it's in mainline is
actually a good thing. Let them straighten out the issues that may come up
(not to mention possible CVEs). If the distros have it in their kernels and
out in the public for 6 months or more, that may give enough information as to
whether or not it should be merged.

I don't think it will complicate things even if the API changes. The distros
will have to deal with that fallout. Mainline only cares about its own
regressions. But any API changes would only be done for good reasons, and give
the distros an excuse to fix whatever was done wrong in the first place.

-- Steve



Re: kdbus: to merge or not to merge?

2015-06-24 Thread Alexander Larsson
On Wed, Jun 24, 2015 at 9:43 PM, Andy Lutomirski wrote:
> On Wed, Jun 24, 2015 at 10:11 AM, Alexander Larsson wrote:

>> My name is on the dbus specification, and I am (and was
>> then) well aware of systems with object references. In fact, both
>> previous ipc systems (Corba ORBs) that Gnome used before we designed
>> dbus uses object references, and they had a lot of problems which dbus
>> solved for us. I'm not saying dbus is perfect, but it has its reasons
>> for the way it works.
>>
>> So, dbus-the-system has some kind of notion of an object reference
>> (peer + object path), but the *bus* is fundamentally a way to
>> communicate between peers, and the object path is merely some
>> uninterpreted metadata.
>
> I'm talking about the reference part, not the object part.  Peer +
> object path is a name, not a reference.

True, it's not a reference in the "refcount" style.

>> You wish that the kernel controlled access to a particular object in a
>> peer, but once the message is dispatched into the target process all
>> bets are off anyway. It will be running some code parsing your message
>> in the process with no real separation from the other objects. Any bug
>> there could give you wider access. I don't see how this fundamentally
>> makes the whole system much more secure. On the other hand, I *do*
>> remember having to track down cross-process leaks from circular
>> references between processes using Corba...
>
> If you have peer ids keeping things alive on dbus, surely you can also
> have circular references, no?

Technically you could set up a situation where this happens, but in
practice it doesn't really. Because object paths don't keep other
processes alive you don't accidentally get circular references,
whereas this happened a lot on Corba because references were the only
thing you had.

>> You can run three instances of an app, but only one of them can own
>> the bus name. This works out fine if your app does not use dbus, but
>> it may be a problem if it uses dbus activation.
>
> I'd really like to be able to xdg-app --stateless oowriter
> some_untrusted_file.docx and have it fully functional, including
> printing, even if I have another instance running.

If that were to work, you'd need a way to make all the session
services it depends on also listen on the new custom bus for only
that app.

>> Well, the service providers are not the same as the portals. Say you
>> have a twitter client that you want to register to be able to share
>> some selected text. The twitter client can be fully sandboxed. The
>> portal is just the link between the requesting app and the list of
>> registered share providers.
>>
>
> Ah.  I clearly am misunderstanding something.  What's a portal?

Well, portal is a general name for "service needed for making
sandboxed apps work". So, they can be a bit different, but in essence
they are small dbus services that facilitate communication between
different apps and between the app and the host session, in a safe
way. Think of them sort of like filtering proxies, but with a gui.

>> Well, that is essentially what a portal like the share one does.
>> Although it shows a user controlled UI in between to avoid the app
>> being able to start any other app it wants.
>
> Hmm.  So shouldn't xdg-app be explicitly choosing which portals are
> allowed for which sandboxed apps rather than allowing
> org.freedesktop.portal.*?

Right now there is no default policy for this, as we don't really have
the portal system fully formed yet. But, yeah, using portal.* was an
example of a policy, another would be to list the allowed portals
explicitly.

>> You're free to design such a system and write a desktop to use it.
>> However, in Gnome (and in the other desktops as well), dbus is already
>> used for all ipc like this and all the freedesktop specs,
>> infrastructure, type systems, interfaces, code and frameworks are
>> built around that. There has to be a *massive* advantage for us to use
>> something else, and I'm not at all convinced by the issues you bring
>> up.
>
> The custom endpoint policy thing is brand new, whereas using a
> userspace proxy for xdg-app actually sounds easier than using a
> separate kdbus bus.  Sticking dbus in the kernel would also be new.

Yeah, some code in the middle is new, but the entire infrastructure
and semantics are the same. I got the feeling you were proposing
something completely different to dbus.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: kdbus: to merge or not to merge?

2015-06-24 Thread Andy Lutomirski
On Wed, Jun 24, 2015 at 10:11 AM, Alexander Larsson
 wrote:
> On Wed, Jun 24, 2015 at 5:38 PM, Andy Lutomirski  wrote:
>> Was this intentionally off-list?
>
> Nah, that was a mistake, adding back the list.
>
>> On Wed, Jun 24, 2015 at 8:10 AM, Alexander Larsson
>
>>> The way i did it in the userspace proxy is to allow peer exited
>>> messages from services that talked to you at some point, as this is
>>> the core requirement (you must be able to limit things to the lifetime
>>> of clients). However, i can see how tracking that in the kernel is a
>>> bit painful, so just allowing all is probably a reasonable choice.
>>>
>>
>> Hmm.  I guess this is an ugliness of dbus in general.  Since dbus
>> doesn't really have a concept of objects (AIUI) you can't really get a
>> notification that a particular object that you have a reference to is
>> gone, so you have to ask for notification that the peer providing the
>> object is gone, but there was never any concept of having a reference
>> to a peer, so here we are :(
>
> You keep using words like ugly and stupid, which isn't super
> impressive.

Fair enough.  On the other hand, I've called my own code ugly plenty of times.

> My name is on the dbus specification, and I am (and was
> then) well aware of systems with object references. In fact, both
> previous ipc systems (Corba ORBs) that Gnome used before we designed
> dbus used object references, and they had a lot of problems which dbus
> solved for us. I'm not saying dbus is perfect, but it has its reasons
> for the way it works.
>
> So, dbus-the-system has some kind of notion of an object reference
> (peer + object path), but the *bus* is fundamentally a way to
> communicate between peers, and the object path is merely some
> uninterpreted metadata.

I'm talking about the reference part, not the object part.  Peer +
object path is a name, not a reference.

> Once the message reaches the destination
> process it is essentially free to interpret the object path however
> they want. If something needs a long lasting "reference" to an object
> you can implement that by e.g. using a Subscribe method, and you can
> guarantee cleanup because the bus will tell you if the peer died.

Except you can't pass them around.  So it's still reference-by-name
instead of reference-by-actual-reference.

>
> This also means that the bus itself is vastly simplified. It only has
> to track peers, not every object in every peer. And clients are more
> flexible with how objects are handled. They can be instantiated
> lazily, or even created algorithmically from the object path if
> needed.

True.  Nonetheless, things like Cap'n Proto and seL4 are quite simple
and have real references.

>
> You wish that the kernel controlled access to a particular object in a
> peer, but once the message is dispatched into the target process all
> bets are off anyway. It will be running some code parsing your message
> in the process with no real separation from the other objects. Any bug
> there could give you wider access. I don't see how this fundamentally
> makes the whole system much more secure. On the other hand, I *do*
> remember having to track down cross-process leaks from circular
> references between processes using Corba...

If you have peer ids keeping things alive on dbus, surely you can also
have circular references, no?

>
>>> The desktop file lists the icon, name and whatnot which is displayed
>>> by the desktop environment. If DBusActivatable is true, then the app
>>> is started by sending dbus messages to the same name as the desktop
>>> file, to the org.freedesktop.Application interface, this way we can
>>> ensure a singleton app and you can do more things than just spawning
>>> it.
>>
>> How do I install apps as an unprivileged user?  What about running
>> sandboxed apps that aren't installed at all?  What about downloading
>> one app and running three instances of it that are all isolated from
>> each other?
>
> Users install desktop files in a directory in their home directory
> (~/.local/share/applications/ typically).
>
> xdg-app apps require some form of installation before running.

IMO that's unfortunate.  If nothing else, it prevents programs from
easily starting one-off sandboxed apps that weren't separately
installed.

>
> You can run three instances of an app, but only one of them can own
> the bus name. This works out fine if your app does not use dbus, but
> it may be a problem if it uses dbus activation.

I'd really like to be able to xdg-app --stateless oowriter
some_untrusted_file.docx and have it fully functional, including
printing, even if I have another instance running.

>
>>> Well, your "other than" part kinda breaks things like launching the
>>> application. So, we need to be on the real bus.
>>> Could you then *also* have a bus per app for talking to the portal? I
>>> guess so, but I don't quite see the point. Just having the portals
>>> trying to find all new buses that come and go will be all kinds of
>>> 

Re: kdbus: to merge or not to merge?

2015-06-24 Thread David Lang

On Wed, 24 Jun 2015, Martin Steigerwald wrote:

> On Wednesday, 24 June 2015, 10:39:52, David Lang wrote:
>> On Wed, 24 Jun 2015, Ingo Molnar wrote:
>>> And the thing is, in hindsight, after such huge flamewars, years down
>>> the line, almost never do I see the following question asked: 'what
>>> were we thinking merging that crap??'. If any question arises it's
>>> usually along the lines of: 'what was the big fuss about?'. So I think
>>> by and large the process works.
>> counterexamples, devfs, tux
>
> What was tux?

in-kernel webserver

David Lang


Re: kdbus: to merge or not to merge?

2015-06-24 Thread Martin Steigerwald
On Wednesday, 24 June 2015, 10:39:52, David Lang wrote:
> On Wed, 24 Jun 2015, Ingo Molnar wrote:
> > And the thing is, in hindsight, after such huge flamewars, years down
> > the line, almost never do I see the following question asked: 'what
> > were we thinking merging that crap??'. If any question arises it's
> > usually along the lines of: 'what was the big fuss about?'. So I think
> > by and large the process works.
> counterexamples, devfs, tux

What was tux?

The filesystem tux3 is not merged as far as I am aware.

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


Re: kdbus: to merge or not to merge?

2015-06-24 Thread Eric W. Biederman
David Lang  writes:

> On Wed, 24 Jun 2015, Ingo Molnar wrote:
>
>> And the thing is, in hindsight, after such huge flamewars, years down the
>> line, almost never do I see the following question asked: 'what were we
>> thinking merging that crap??'. If any question arises it's usually along
>> the lines of: 'what was the big fuss about?'. So I think by and large the
>> process works.
>
> counterexamples, devfs, tux

The biggest I can think of is cgroups.

The way cgroups connect to processes instead of resources (semantically)
and the fact that controllers are different from fundamental entities
like schedulers.

Of course I don't think "What were we thinking" I remember it all too
well in that case.

I think "What do we do now that we have made this mess".

Eric



Re: kdbus: to merge or not to merge?

2015-06-24 Thread David Lang

On Wed, 24 Jun 2015, Ingo Molnar wrote:

> And the thing is, in hindsight, after such huge flamewars, years down the
> line, almost never do I see the following question asked: 'what were we
> thinking merging that crap??'. If any question arises it's usually along
> the lines of: 'what was the big fuss about?'. So I think by and large the
> process works.

counterexamples, devfs, tux

David Lang


Re: kdbus: to merge or not to merge?

2015-06-24 Thread Alexander Larsson
On Wed, Jun 24, 2015 at 5:38 PM, Andy Lutomirski  wrote:
> Was this intentionally off-list?

Nah, that was a mistake, adding back the list.

> On Wed, Jun 24, 2015 at 8:10 AM, Alexander Larsson

>> The way i did it in the userspace proxy is to allow peer exited
>> messages from services that talked to you at some point, as this is
>> the core requirement (you must be able to limit things to the lifetime
>> of clients). However, i can see how tracking that in the kernel is a
>> bit painful, so just allowing all is probably a reasonable choice.
>>
>
> Hmm.  I guess this is an ugliness of dbus in general.  Since dbus
> doesn't really have a concept of objects (AIUI) you can't really get a
> notification that a particular object that you have a reference to is
> gone, so you have to ask for notification that the peer providing the
> object is gone, but there was never any concept of having a reference
> to a peer, so here we are :(

You keep using words like ugly and stupid, which isn't super
impressive. My name is on the dbus specification, and I am (and was
then) well aware of systems with object references. In fact, both
previous ipc systems (Corba ORBs) that Gnome used before we designed
dbus used object references, and they had a lot of problems which dbus
solved for us. I'm not saying dbus is perfect, but it has its reasons
for the way it works.

So, dbus-the-system has some kind of notion of an object reference
(peer + object path), but the *bus* is fundamentally a way to
communicate between peers, and the object path is merely some
uninterpreted metadata. Once the message reaches the destination
process it is essentially free to interpret the object path however
they want. If something needs a long lasting "reference" to an object
you can implement that by e.g. using a Subscribe method, and you can
guarantee cleanup because the bus will tell you if the peer died.
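
The Subscribe-plus-cleanup pattern described above can be sketched roughly as
follows. This is only an illustration: `SubscriptionService`, `subscribe` and
`on_peer_exited` are hypothetical names, not part of any dbus API, and the
"peer died" notification (the NameOwnerChanged signal on a classic
dbus-daemon) is modelled as a plain method call.

```python
from collections import defaultdict


class SubscriptionService:
    """Toy service that ties long-lived "references" to the peer's lifetime."""

    def __init__(self):
        # peer id -> set of object paths that peer has subscribed to
        self._subs = defaultdict(set)

    def subscribe(self, peer_id, object_path):
        # A peer asks for a long-lasting interest in one object path.
        self._subs[peer_id].add(object_path)

    def subscribers(self, object_path):
        # Sorted list of peers currently subscribed to this path.
        return sorted(p for p, paths in self._subs.items() if object_path in paths)

    def on_peer_exited(self, peer_id):
        # Assumed to be called when the bus reports that the peer is gone;
        # everything that peer held is dropped in one step.
        self._subs.pop(peer_id, None)
```

The service only needs per-peer bookkeeping, which is the point being made:
the bus tracks peers, and the service decides what a "reference" means.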

This also means that the bus itself is vastly simplified. It only has
to track peers, not every object in every peer. And clients are more
flexible with how objects are handled. They can be instantiated
lazily, or even created algorithmically from the object path if
needed.

You wish that the kernel controlled access to a particular object in a
peer, but once the message is dispatched into the target process all
bets are off anyway. It will be running some code parsing your message
in the process with no real separation from the other objects. Any bug
there could give you wider access. I don't see how this fundamentally
makes the whole system much more secure. On the other hand, I *do*
remember having to track down cross-process leaks from circular
references between processes using Corba...

>> The desktop file lists the icon, name and whatnot which is displayed
>> by the desktop environment. If DBusActivatable is true, then the app
>> is started by sending dbus messages to the same name as the desktop
>> file, to the org.freedesktop.Application interface, this way we can
>> ensure a singleton app and you can do more things than just spawning
>> it.
>
> How do I install apps as an unprivileged user?  What about running
> sandboxed apps that aren't installed at all?  What about downloading
> one app and running three instances of it that are all isolated from
> each other?

Users install desktop files in a directory in their home directory
(~/.local/share/applications/ typically).

xdg-app apps require some form of installation before running.

You can run three instances of an app, but only one of them can own
the bus name. This works out fine if your app does not use dbus, but
it may be a problem if it uses dbus activation.
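
The single-owner rule can be sketched as a toy model. `ToyBus`,
`request_name` and `release_peer` are hypothetical names chosen for
illustration, and the queueing and replacement flags that the real dbus
RequestName call supports are deliberately left out.

```python
class ToyBus:
    """Toy model of well-known name ownership: one owner per name."""

    def __init__(self):
        self._owners = {}  # bus name -> peer id of the current owner

    def request_name(self, peer_id, name):
        """Return True if peer_id owns the name after this call."""
        # setdefault only records peer_id when nobody owns the name yet.
        owner = self._owners.setdefault(name, peer_id)
        return owner == peer_id

    def release_peer(self, peer_id):
        # Drop every name held by a peer that has left the bus.
        for name in [n for n, p in self._owners.items() if p == peer_id]:
            del self._owners[name]
```

With three instances of the same app, only the first `request_name` call
succeeds; the other two keep running but cannot be reached under that name,
which is why activation by name only ever finds one of them.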

>> Well, your "other than" part kinda breaks things like launching the
>> application. So, we need to be on the real bus.
>> Could you then *also* have a bus per app for talking to the portal? I
>> guess so, but I don't quite see the point. Just having the portals
>> trying to find all new buses that come and go will be all kinds of
>> painful.
>
> How many portals will there be?  It seems like, if you want multiple
> portal programs (in the org.freedesktop.portal.* sense), then you'd
> have some awkwardness if each app were on its own bus and you didn't
> want a proxy, but I think you'll also have prevented yourself from
> meaningfully sandboxing the portals themselves.

You can sandbox the portals to some extent, but fundamentally they are
meant to run in some kind of "higher privileges" mode, so they have to
have access to things normal apps would not. For instance, they have
to be able to activate other dbus names.

> Android, on the other hand, sandboxes most of its service providers,
> and Binder provides a nice way to selectively grant capabilities
> between sandboxes.  (The privacy and security disaster that's built on
> top of Binder is another story, but that's not Binder's fault.)

Well, the service providers are not the same as the portals. Say you
have a twitter client that you want to register to be able to 

Re: kdbus: to merge or not to merge?

2015-06-24 Thread Andy Lutomirski
On Wed, Jun 24, 2015 at 2:55 AM, Alexander Larsson
 wrote:
>
> I don't really understand this objection. I'm working on an
> application sandboxing model for desktop applications (xdg-app), and
> the kdbus model matches my needs well. In fact, I'm currently using a
> userspace filtering proxy that implements exactly the kdbus policy
> model. Of course, this adds *yet* another context switch per message.
> The only problem I found is that kdbus filtering broke the ability to
> track the lifetime of clients[1]. However, this has now been fixed
> with exactly the change you complain about above.

I find myself wondering whether the change I complain about will be a
problem down the road.  It's certainly an information leak of some
sort.  Whether the information that it leaks is valuable to anyone is
an interesting question.

>
> I definitely don't want to do low level request interception with UI.
> We learned long ago that it is a very poor fit for desktop use. At the
> interception point you have no context at all about the larger scope,
> such as what window caused the operation and how you would make it
> modal or even just get the window parenting right. Also, if you do
> this you will keep popping up windows all the time as apps do calls in
> the background to be able to e.g. gray out unavailable menu items,
> update folder counts, etc. Any operation that may cause user
> interaction must be carefully designed to handle this.
>
> The way I expect to use kdbus policy, for an app called say
> "org.gnome.gedit" is to have the following policy:
> TALK org.freedesktop.DBus
> OWN org.gnome.gedit
> OWN org.gnome.gedit.*
> TALK org.freedesktop.portal.*

Aha!  You're not doing what I assumed you were doing at all.

>
> This allows the app to connect to and talk to the bus, own its own
> name and broadcast signals. It also lets anyone else (that is not
> sandboxed) talk to the app and it will be able to reply. This is
> enough to have regular dbus activation of desktop files[2], as well
> as allowing app-related custom services.
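
The four-line policy quoted above can be read as a small table of
(access, name pattern) pairs. The sketch below is an assumption-laden
illustration, not kdbus code: it treats `*` as a shell-style wildcard via
Python's `fnmatch`, requires the access level to match exactly, and uses a
hypothetical helper called `allowed`.

```python
from fnmatch import fnmatchcase

# The policy from the message above, as (access, name pattern) pairs.
POLICY = [
    ("TALK", "org.freedesktop.DBus"),
    ("OWN", "org.gnome.gedit"),
    ("OWN", "org.gnome.gedit.*"),
    ("TALK", "org.freedesktop.portal.*"),
]


def allowed(policy, access, name):
    """Return True if some policy entry grants `access` on `name`."""
    return any(
        entry_access == access and fnmatchcase(name, pattern)
        for entry_access, pattern in policy
    )
```

Under these assumptions `org.gnome.gedit.*` does not match the bare name
`org.gnome.gedit`, which would explain why the policy lists both entries.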

Do I understand correctly that you're committing to an iOS-like model
in which activations go to a particular named app as opposed to a more
Android-like model in which multiple providers can offer the same
service?

>
> It also allows the app to talk to a set of "portals" which are
> sandbox-specific APIs that supply the necessary services for sandboxed
> apps to interact with each other and the host.

[snip description of what the portal does]

This seems generally sensible.  Here are my concerns.  Feel free to
tell me I'm nuts or ask me more.

1. Other than allowing non-sandboxed code to contact sandboxed apps
directly (as opposed to via the portal), I still don't see how this is
better than having a completely separate kdbusfs instance (or
userspace socket or whatever) per app.  The only things on the outside
the app talks to are org.freedesktop.portal.*, and whatever service
provides them could be taught to provide them to more than one running
sandboxed app.

By doing it with a policy rule like this, you're at risk of random
non-sandboxed programs having a bright idea to offer some completely
insecure service with a name like "org.freedesktop.portal.badidea"
that destroys security.  See, for example, the tons of reports of
exploitable Android system services that shouldn't have been there in
the first place.

By using this type of policy rule, you're also preventing meaningful
use of two different portal implementations -- their names will
collide.  That's fine when there's exactly one implementation that
you're developing, but it might be nice to be able to run some apps
under a super-locked-down portal, some under a standard portal, and
some under some other fork of the portal, all at once.

2. Without seeing more details, I don't see how you will defend
against name collisions.  By allowing a sandboxed application to claim
a well-known name with global significance (e.g.
org.freedesktop.gedit), you're vulnerable to apps that maliciously
claim some other app's name (e.g. by sticking it in their manifest or
whatever).  Search for the iOS "XARA" attacks, which mostly work like
this and almost completely break iOS security (currently unfixed
AFAIK).

3. Due to the IMO absurd way that kdbus policy works, you think you're
limiting sandboxed apps to talking to names that match entries in your
policy table.  Instead, you're limiting sandboxed apps to talking to
peer ids that advertise names that match entries in your policy table.
As I understand it, you are completely and utterly hosed if your
portal implements org.freedesktop.portal.secure_printing and
org.freedesktop.admin.something_else.  This issue is a large part of
the reason that I consider kdbus' policy framework to be an
unacceptable design.
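
Concern 3 can be made concrete with a toy model. Everything here is
hypothetical (the peer id 42, the `NAME_OWNERS` table, both helper
functions) and it models the behaviour as described in this message, not
the actual kdbus implementation.

```python
from fnmatch import fnmatchcase

# What the sandbox policy is meant to allow.
TALK_RULES = ["org.freedesktop.portal.*"]

# Hypothetical bus state: a single peer (id 42) owns both names.
NAME_OWNERS = {
    "org.freedesktop.portal.secure_printing": 42,
    "org.freedesktop.admin.something_else": 42,
}


def matches(name):
    return any(fnmatchcase(name, rule) for rule in TALK_RULES)


def allowed_by_name(destination):
    """What the policy author expects: only the addressed name is checked."""
    return matches(destination)


def allowed_by_peer(destination):
    """The behaviour described above: any matching name owned by the
    destination's peer is enough to let the message through."""
    peer = NAME_OWNERS[destination]
    return any(matches(n) for n, owner in NAME_OWNERS.items() if owner == peer)
```

In this model a message addressed to `org.freedesktop.admin.something_else`
is rejected by the per-name check but accepted by the per-peer check,
because the same peer also advertises a portal name.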

> Now, there will likely be a few cases where we need a more
> fine-grained access limit. For instance you may have a service that
> dynamically grants access to particular objects in 

Re: kdbus: to merge or not to merge?

2015-06-24 Thread Ingo Molnar

* Martin Steigerwald  wrote:

> On Wednesday, 24 June 2015, 10:05:02, Ingo Molnar wrote:
> > Not because I like it so much, but because I think the merge process
> > should be  stripped of politics and emotion as much as possible: if an
> > initial submission is good and addresses all technical review properly,
> > and if the cost to the core kernel is low, then barring alternative,
> > fully equivalent and superior patch submissions, rejecting it does more
> > harm than good.
> 
> Now that is an interesting challenge.
> 
> As I realize more and more we are all feeling beings.
> 
> Linus himself according to his own words as I received them wants to make 
> perfectly sure that the developer who receives a message from him exactly 
> knows how he feels, especially when he disagrees with a pull request and 
> does not want to take it.

So that twists what I said: how 'I feel about a pull request' is a technical
term for: 'what is my subjective but rational technological opinion' about it.

That's not an invitation to be irrationally emotional. (I'm reasonably sure
that's what Linus meant there too, but I don't speak for him.)

Thanks,

Ingo


Re: kdbus: to merge or not to merge?

2015-06-24 Thread Ingo Molnar

* Martin Steigerwald  wrote:

> On Wednesday, 24 June 2015, 10:05:02, Ingo Molnar wrote:
>
> >  - Once one (or two) major distros go with kdbus, it becomes a de-facto
> > ABI. If the ABI is bad then that distro will hurt from it regardless of
> > whether we merge it upstream or not - so technical pressure is there to
> > improve it. But if the kernel refuses to merge it, Linux users will get
> > hurt disproportionately badly. The kernel not being the first mover with
> > a new ABI is absolutely sensible. But once Linux distros have taken the
> > initial (non-trivial) plunge, not merging a zero-cost ABI upstream
> > becomes more like revenge and obstruction, which is not productive. The
> > kernel has very little value without full user-space, after all, so
> > within reason the kernel project has to own up to distro ABI mistakes as
> > well.
>
> So, in order to merge something that is not accepted upstream yet, is it an
> accepted way to encourage distros to use it nonetheless, to get it upstream
> then anyway as in "as, look, now this and this distro uses it"?
>
> When I read
>
> > Not because I like it so much, but because I think the merge process
> > should be stripped of politics and emotion as much as possible: if an
> > initial submission is good and addresses all technical review properly,
> > and if the cost to the core kernel is low, then barring alternative,
> > fully equivalent and superior patch submissions, rejecting it does more
> > harm than good.
>
> I think you didn't mean it that way, as you state proper technical review as
> a requirement.
> Can you clarify?

There's no conflict: when merging something upstream, technical feedback has to
be addressed. AFAICS that is what happened when we merged controversial bits in
the past where Linux distros jumped the gun: such as AppArmor or Binder.

The main question that gets eliminated by a major distro using something is the
(important) question of: 'does the Linux kernel need an ABI like that?'.

Distros still run a considerable risk when forking new ABIs, obviously - as
'pre release' ABIs rarely survive upstreaming, and there's no guarantee that it
will be accepted upstream.

> Still as far as I got it, Andy raised technical concerns which Greg
> outright rejected as invalid without any further explanation. That does not
> seem like technical concerns have been properly addressed to me.

I haven't seen such responses but maybe I haven't managed to dig deep enough
into the rather sizable discussion. Not addressing valid technical feedback
would be a first for Greg in my book, so he definitely deserves the benefit of
doubt from me.

And the thing is, in hindsight, after such huge flamewars, years down the line,
almost never do I see the following question asked: 'what were we thinking
merging that crap??'. If any question arises it's usually along the lines of:
'what was the big fuss about?'. So I think by and large the process works.

Thanks,

Ingo


Re: kdbus: to merge or not to merge?

2015-06-24 Thread Martin Steigerwald
On Wednesday, 24 June 2015, 10:05:02, Ingo Molnar wrote:
> Not because I like it so much, but because I think the merge process
> should be  stripped of politics and emotion as much as possible: if an
> initial submission is good and addresses all technical review properly,
> and if the cost to the core kernel is low, then barring alternative,
> fully equivalent and superior patch submissions, rejecting it does more
> harm than good.

Now that is an interesting challenge.

As I realize more and more we are all feeling beings.

Linus himself according to his own words as I received them wants to make 
perfectly sure that the developer who receives a message from him exactly 
knows how he feels, especially when he disagrees with a pull request and 
does not want to take it.

To my perception the whole kernel development process is quite full of 
emotion, including your message I reply to.

And now you want to get rid of it.

I bet you can.


If you remove Linus… and every other kernel developer from the development 
process, including yourself.

But then, who will develop the kernel?


I think a different way to handle emotions can help, and I intend to handle
them this way to see what results it brings. I am aiming to feel my
feelings as they are, instead of immediately judging them or attaching a
thought to them, which basically turns them into emotions, distorts them,
and blocks my energy in them [1].

So I will attempt to feel my feelings before I answer again. I didn't do so
in the last answer to you, and I think it shows.



[1] Arnold M. Patent, "You can have it all"

Thanks,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


Re: kdbus: to merge or not to merge?

2015-06-24 Thread Martin Steigerwald
On Wednesday, 24 June 2015, 10:05:02, Ingo Molnar wrote:
>  - Once one (or two) major distros go with kdbus, it becomes a de-facto
> ABI. If the ABI is bad then that distro will hurt from it regardless of
> whether we merge it upstream or not - so technical pressure is there to
> improve it. But if the kernel refuses to merge it, Linux users will get
> hurt disproportionately badly. The kernel not being the first mover with
> a new ABI is absolutely sensible. But once Linux distros have taken the
> initial (non-trivial) plunge, not merging a zero-cost ABI upstream
> becomes more like revenge and obstruction, which is not productive. The
> kernel has very little value without full user-space, after all, so
> within reason the kernel project has to own up to distro ABI mistakes as
> well.

So, in order to merge something that is not accepted upstream yet, is it an 
accepted way to encourage distros to use it nonetheless, to get it upstream 
then anyway as in "as, look, now this and this distro uses it"?

When I read

> Not because I like it so much, but because I think the merge process
> should be  stripped of politics and emotion as much as possible: if an
> initial submission is good and addresses all technical review properly,
> and if the cost to the core kernel is low, then barring alternative,
> fully equivalent and superior patch submissions, rejecting it does more
> harm than good.

I think you didn't mean it that way, as you state proper technical review as 
a requirement.

Can you clarify?


Still as far as I got it, Andy raised technical concerns which Greg
outright rejected as invalid without any further explanation. That does
not seem like technical concerns have been properly addressed to me.

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


Re: kdbus: to merge or not to merge?

2015-06-24 Thread Eric W. Biederman
Ingo Molnar  writes:

> Not because I like it so much, but because I think the merge process should
> be stripped of politics and emotion as much as possible: if an initial
> submission is good and addresses all technical review properly, and if the
> cost to the core kernel is low, then barring alternative, fully equivalent
> and superior patch submissions, rejecting it does more harm than good.

This is largely not what happened with kdbus.

The initial submission was problematic.  Many pieces of technical review
were not addressed at the time a pull request was sent to Linus.  Even
now there are remaining outstanding technical items such as performance
that have not been addressed.

The cost to the rest of the core is potentially quite high as parts of
kdbus double down on the worst mistakes in user interface of the kernel.

Politics and emotion are involved because the discussions around kdbus
have not been honest:
- Lennart Poettering who has been hugely involved in the creation and
  the design of kdbus has not shown his face on lkml during the review,
  and he seems the only one who can actually answer many of the
  technical questions about kdbus.

- Many times it was said some feature of kdbus is not important because
  using it was not required, and yet in practice using that feature is
  required in the common case.

- Performance has been said to be a large benefit of kdbus and yet in
  the common case there will be a number of shared cache lines modified
  for every message sent, for reference counts.

  At a quick glance it appears that communication with every system
  daemon will be serialized because they all have init as their parent
  process, so every reply will modify the reference count of init's
  struct pid.

At this point I honestly do not know how to have a technical dialogue
about the code in kdbus.  

Pointing out that bumping several reference counts per message is a bad
idea has gotten nowhere so far.

Crazy things, like using the process's command line (copied from
userspace when a message is sent) for message authentication, are still
present in the code.

I don't think any of these things are particularly subtle, hard to
understand, or hard to fix, yet months after they have been pointed out
the code persists.

For subtle issues, who knows.  Every review I have seen seems to get to
a couple of simple things, point them out, and then stop.  I am
actually very strongly surprised at how many of these little issues
remain in the code.  Enough changes were added to the kdbus tree to fix
small issues since the last merge window that I would have thought I
would have had to look a little harder for problems.

So whatever else the case may be, I think the current kdbus code base is
a long way from being ready to be merged.

Eric


Re: kdbus: to merge or not to merge?

2015-06-24 Thread Alexander Larsson
On Tue, Jun 23, 2015 at 8:06 AM, Andy Lutomirski wrote:

> 3. The sandbox model is, in my opinion, an experiment that isn't going
> to succeed.  It's a poor model: a "restricted endpoint" (i.e. a
> sandboxed kdbus client) sees a view of the world defined by a limited
> policy language implemented by the kernel.  This completely fails to
> express what I think should be common use cases.  If a sandboxed app
> is given permission to access, say,
> /org/gnome/evolution/dataserver/CalendarView/3125/12, then it knows
> that it's looking at CalendarView/3125/12 (whatever that means) and
> there's no way to hide the name.  If someone subsequently deletes that
> CalendarView and creates a new one with that name, racelessly blocking
> access to the new one for the app may be complicated.  If a sandbox
> wants to prompt the user before allowing access to some resource, it
> has a problem: the policy language doesn't seem to be able to express
> request interception.
>
> The sandbox model is also already starting to accumulate kludges.
> Apparently it was recently discovered that the kdbus connection
> lifetime model was incompatible with sandbox policy, so as of a recent
> change [2] connection lifetime messages completely bypass sandbox
> policy.  Maybe this isn't obviously insecure, but it seems like a bad
> sign that "it's probably okay to poke this hole" is already happening
> before the thing is even merged.
>
> I'll point out that a pure userspace implementation of sandboxed dbus
> connections would be straightforward to implement today, would have
> none of these problems, and would allow arbitrarily complex policy and
> the flexibility to redesign it in the future if the initial design
> turned out to be inappropriate for the sandbox being written.  (You
> could even have two different implementations to go with two different
> sandboxes.  Let a thousand sandboxes bloom, which is easy in userspace
> but not so great in the kernel.)

I don't really understand this objection. I'm working on an
application sandboxing model for desktop applications (xdg-app), and
the kdbus model matches my needs well. In fact, I'm currently using a
userspace filtering proxy that implements exactly the kdbus policy
model. Of course, this adds *yet* another context switch per message.
The only problem I found is that kdbus filtering broke the ability to
track the lifetime of clients[1]. However, this has now been fixed
with exactly the change you complain about above.

I definitely don't want to do low level request interception with UI.
We learned long ago that it is a very poor fit for desktop use. At the
interception point you have no context at all about the larger scope,
such as what window caused the operation and how you would make it
modal or even just get the window parenting right. Also, if you do
this you will keep popping up windows all the time as apps do calls in
the background to be able to e.g. gray out unavailable menu items,
update folder counts, etc. Any operation that may cause user
interaction must be carefully designed to handle this.

The way I expect to use kdbus policy, for an app called say
"org.gnome.gedit" is to have the following policy:
TALK org.freedesktop.DBus
OWN org.gnome.gedit
OWN org.gnome.gedit.*
TALK org.freedesktop.portal.*

This allows the app to connect to and talk to the bus, own its own
name and broadcast signals. It also lets anyone else (that is not
sandboxed) talk to the app and it will be able to reply. This is
enough to have regular dbus activation of desktop files[2], as well
as allowing app-related custom services.
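
To make the four policy lines above concrete, here is a minimal sketch
of how such a policy could be evaluated. It is an illustration only and
is neither the kdbus implementation nor the xdg-app proxy: the names
POLICY, name_matches and is_allowed are hypothetical, and it assumes
that a trailing ".*" matches any name below the prefix but not the
prefix itself (which would explain why the policy lists both
org.gnome.gedit and org.gnome.gedit.*). It also treats each verb
independently, which may not match how access levels relate in kdbus.

```python
# Hypothetical model of the policy listed above; not the kdbus or xdg-app API.
POLICY = [
    ("TALK", "org.freedesktop.DBus"),
    ("OWN", "org.gnome.gedit"),
    ("OWN", "org.gnome.gedit.*"),
    ("TALK", "org.freedesktop.portal.*"),
]


def name_matches(pattern, name):
    """Return True if a bus name matches a policy pattern.

    Assumption: a trailing ".*" matches any name below that prefix,
    but not the bare prefix itself.
    """
    if pattern.endswith(".*"):
        # Keep the trailing dot so "org.gnome.geditor" does not match.
        return name.startswith(pattern[:-1])
    return pattern == name


def is_allowed(policy, verb, name):
    """Return True if some policy entry grants `verb` on `name`."""
    return any(v == verb and name_matches(p, name) for v, p in policy)
```

With this model, is_allowed(POLICY, "OWN", "org.gnome.gedit.Helper")
is true, while a TALK request to a name outside the listed prefixes is
refused.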

It also allows the app to talk to a set of "portals" which are
sandbox-specific APIs that supply the necessary services for sandboxed
apps to interact with each other and the host. For instance, it would
have APIs for file choosing, where all user interaction will happen on
the host side and the app just gets back the file data. Another
example is sharing with intents-like semantics, where you'd say "I
want to share text " and we open a dialog on the host side
allowing you to choose how to share the text (tweet it, open in other
app, etc.) without the app knowing anything about it other than
supplying the data.

Operations like these are safe because they are interactive. An app
can't use them to silently read the user's files, and the user can
always interactively abort the operation if it was unexpected.

Now, there will likely be a few cases where we need a more
fine-grained access limit. For instance you may have a service that
dynamically grants access to particular objects in a portal service to
an app. These things can be implemented fine in userspace in the
actual service itself. The way I do that currently is by looking at
the peer cgroup name, which encodes the xdg-app id. I don't see how
making up policy dynamically and uploading it to the bus is better
than just doing the filtering in the portal.
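
As a rough sketch of what filtering in the portal could look like, the
following keeps a per-app table of granted objects and derives the app
id from the peer's cgroup name. Everything here is assumed for
illustration: the "xdg-app-<id>-<pid>.scope" naming scheme, the Portal
class and its methods are hypothetical, and treating peers without such
a cgroup as unrestricted is a choice made for the example, not
something stated in this thread.

```python
import re

# Hypothetical cgroup naming scheme, assumed purely for illustration.
_SCOPE_RE = re.compile(r"xdg-app-(?P<app_id>[A-Za-z0-9_.]+)-\d+\.scope$")


def app_id_from_cgroup(cgroup_path):
    """Extract the app id from a peer cgroup path, or None if absent."""
    match = _SCOPE_RE.search(cgroup_path)
    return match.group("app_id") if match else None


class Portal:
    """Toy portal service that filters object access per sandboxed app."""

    def __init__(self):
        # Maps app id -> set of object paths that app may access.
        self._grants = {}

    def grant(self, app_id, object_path):
        """Dynamically grant one app access to one object."""
        self._grants.setdefault(app_id, set()).add(object_path)

    def may_access(self, peer_cgroup, object_path):
        """Decide whether the calling peer may use `object_path`."""
        app_id = app_id_from_cgroup(peer_cgroup)
        if app_id is None:
            # Assumption: peers outside a sandbox scope are unrestricted.
            return True
        return object_path in self._grants.get(app_id, set())
```

The point of the sketch is that the grant table lives in the service
that owns the objects, so no policy has to be generated and uploaded to
the bus when access changes.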

[1] 
