RE: [PATCH V4] binder: ipc namespace support for android binder

2018-11-19 Thread 周威
> On Fri, Nov 16, 2018 at 11:19:41AM -0800, Todd Kjos wrote:
> > On Thu, Nov 15, 2018 at 2:54 PM gre...@linuxfoundation.org
> >  wrote:
> > ...
> > >
> > > A number of us have talked about this in the plumbers Android track, and
> > > a different proposal for how to solve this has been made that should be
> > > much more resiliant.  So I will drop this patch from my queue and wait
> > > for the patches based on the discussions we had there.
> > >
> > > I think there's some notes/slides on the discussion online somewhere,
> > > but it hasn't been published as the conference is still happening,
> > > otherwise I would link to it here...
> >
> > Here is a link to the session where you can look at the slides [1].
> > There was consensus that "binderfs" (slide 5) was the most promising
> > -- but would be behind a config option to provide backwards
> > compatibility for non-container use-cases.
> >
> > The etherpad notes are at [2] (look at "Dynamically Allocated Binder
> > Devices" section)
> >
> > Christian Brauner will be sending out more details.
> 
> Ok, sorry for the delay I got caught up in other work.
> 
> The idea is to implement binderfs which I'm starting to work on.
> binderfs will be a separate pseudo-filesystem that will be mountable
> per-ipc namespace.
> This has the advantage that the ipc namespace is attached to the
> superblock of the mount and can be easily retrieved by the binder
> driver. It also implies that - in contrast to the proposed patch here -
> an open() on a given binder device will not be able to change the ipc
> namespace of said devices. The obvious corollary is that you can
> bind-mount binder devices or the whole binderfs mount between different
> ipc namespaces and given the right permissions (CAP_IPC_OWNER in the
> owning userns of the ipcns) can see services on the host which is
> something that people wanted to be able to do.
> Additionally, each binderfs mount will come with a binder-control
> device. This device is functionally similar to loop-control and will
> allow for dynamic allocation (and possibly deallocation) of binder
> devices. With this we can remove the restriction to hard-code the number
> of binder devices at compile time.
> Backwards compatibility can either be guaranteed by hiding binderfs
> behind a compile flag or by special-casing the inital binderfs mount to
> pre-create the standard binder devices. The jury is still out on this.
> 
> Christian
> 

Since you are working on this, I will withdraw this patch. We will evaluate 
whether to your solution in our environment after you implement it. 


RE: [PATCH V4] binder: ipc namespace support for android binder

2018-11-19 Thread 周威
> On Fri, Nov 16, 2018 at 11:19:41AM -0800, Todd Kjos wrote:
> > On Thu, Nov 15, 2018 at 2:54 PM gre...@linuxfoundation.org
> >  wrote:
> > ...
> > >
> > > A number of us have talked about this in the plumbers Android track, and
> > > a different proposal for how to solve this has been made that should be
> > > much more resiliant.  So I will drop this patch from my queue and wait
> > > for the patches based on the discussions we had there.
> > >
> > > I think there's some notes/slides on the discussion online somewhere,
> > > but it hasn't been published as the conference is still happening,
> > > otherwise I would link to it here...
> >
> > Here is a link to the session where you can look at the slides [1].
> > There was consensus that "binderfs" (slide 5) was the most promising
> > -- but would be behind a config option to provide backwards
> > compatibility for non-container use-cases.
> >
> > The etherpad notes are at [2] (look at "Dynamically Allocated Binder
> > Devices" section)
> >
> > Christian Brauner will be sending out more details.
> 
> Ok, sorry for the delay I got caught up in other work.
> 
> The idea is to implement binderfs which I'm starting to work on.
> binderfs will be a separate pseudo-filesystem that will be mountable
> per-ipc namespace.
> This has the advantage that the ipc namespace is attached to the
> superblock of the mount and can be easily retrieved by the binder
> driver. It also implies that - in contrast to the proposed patch here -
> an open() on a given binder device will not be able to change the ipc
> namespace of said devices. The obvious corollary is that you can
> bind-mount binder devices or the whole binderfs mount between different
> ipc namespaces and given the right permissions (CAP_IPC_OWNER in the
> owning userns of the ipcns) can see services on the host which is
> something that people wanted to be able to do.
> Additionally, each binderfs mount will come with a binder-control
> device. This device is functionally similar to loop-control and will
> allow for dynamic allocation (and possibly deallocation) of binder
> devices. With this we can remove the restriction to hard-code the number
> of binder devices at compile time.
> Backwards compatibility can either be guaranteed by hiding binderfs
> behind a compile flag or by special-casing the inital binderfs mount to
> pre-create the standard binder devices. The jury is still out on this.
> 
> Christian
> 

Since you are working on this, I will withdraw this patch. We will evaluate 
whether to your solution in our environment after you implement it. 


Re: [PATCH V4] binder: ipc namespace support for android binder

2018-11-19 Thread Christian Brauner
On Fri, Nov 16, 2018 at 11:19:41AM -0800, Todd Kjos wrote:
> On Thu, Nov 15, 2018 at 2:54 PM gre...@linuxfoundation.org
>  wrote:
> ...
> >
> > A number of us have talked about this in the plumbers Android track, and
> > a different proposal for how to solve this has been made that should be
> > much more resiliant.  So I will drop this patch from my queue and wait
> > for the patches based on the discussions we had there.
> >
> > I think there's some notes/slides on the discussion online somewhere,
> > but it hasn't been published as the conference is still happening,
> > otherwise I would link to it here...
> 
> Here is a link to the session where you can look at the slides [1].
> There was consensus that "binderfs" (slide 5) was the most promising
> -- but would be behind a config option to provide backwards
> compatibility for non-container use-cases.
> 
> The etherpad notes are at [2] (look at "Dynamically Allocated Binder
> Devices" section)
> 
> Christian Brauner will be sending out more details.

Ok, sorry for the delay I got caught up in other work.

The idea is to implement binderfs which I'm starting to work on.
binderfs will be a separate pseudo-filesystem that will be mountable
per-ipc namespace.
This has the advantage that the ipc namespace is attached to the
superblock of the mount and can be easily retrieved by the binder
driver. It also implies that - in contrast to the proposed patch here -
an open() on a given binder device will not be able to change the ipc
namespace of said devices. The obvious corollary is that you can
bind-mount binder devices or the whole binderfs mount between different
ipc namespaces and given the right permissions (CAP_IPC_OWNER in the
owning userns of the ipcns) can see services on the host which is
something that people wanted to be able to do.
Additionally, each binderfs mount will come with a binder-control
device. This device is functionally similar to loop-control and will
allow for dynamic allocation (and possibly deallocation) of binder
devices. With this we can remove the restriction to hard-code the number
of binder devices at compile time.
Backwards compatibility can either be guaranteed by hiding binderfs
behind a compile flag or by special-casing the inital binderfs mount to
pre-create the standard binder devices. The jury is still out on this.

Christian

> 
> -Todd
> 
> [1] https://linuxplumbersconf.org/event/2/contributions/222/
> [2] https://etherpad.openstack.org/p/Android
> 
> >
> > thanks,
> >
> > greg k-h


Re: [PATCH V4] binder: ipc namespace support for android binder

2018-11-19 Thread Christian Brauner
On Fri, Nov 16, 2018 at 11:19:41AM -0800, Todd Kjos wrote:
> On Thu, Nov 15, 2018 at 2:54 PM gre...@linuxfoundation.org
>  wrote:
> ...
> >
> > A number of us have talked about this in the plumbers Android track, and
> > a different proposal for how to solve this has been made that should be
> > much more resiliant.  So I will drop this patch from my queue and wait
> > for the patches based on the discussions we had there.
> >
> > I think there's some notes/slides on the discussion online somewhere,
> > but it hasn't been published as the conference is still happening,
> > otherwise I would link to it here...
> 
> Here is a link to the session where you can look at the slides [1].
> There was consensus that "binderfs" (slide 5) was the most promising
> -- but would be behind a config option to provide backwards
> compatibility for non-container use-cases.
> 
> The etherpad notes are at [2] (look at "Dynamically Allocated Binder
> Devices" section)
> 
> Christian Brauner will be sending out more details.

Ok, sorry for the delay I got caught up in other work.

The idea is to implement binderfs which I'm starting to work on.
binderfs will be a separate pseudo-filesystem that will be mountable
per-ipc namespace.
This has the advantage that the ipc namespace is attached to the
superblock of the mount and can be easily retrieved by the binder
driver. It also implies that - in contrast to the proposed patch here -
an open() on a given binder device will not be able to change the ipc
namespace of said devices. The obvious corollary is that you can
bind-mount binder devices or the whole binderfs mount between different
ipc namespaces and given the right permissions (CAP_IPC_OWNER in the
owning userns of the ipcns) can see services on the host which is
something that people wanted to be able to do.
Additionally, each binderfs mount will come with a binder-control
device. This device is functionally similar to loop-control and will
allow for dynamic allocation (and possibly deallocation) of binder
devices. With this we can remove the restriction to hard-code the number
of binder devices at compile time.
Backwards compatibility can either be guaranteed by hiding binderfs
behind a compile flag or by special-casing the inital binderfs mount to
pre-create the standard binder devices. The jury is still out on this.

Christian

> 
> -Todd
> 
> [1] https://linuxplumbersconf.org/event/2/contributions/222/
> [2] https://etherpad.openstack.org/p/Android
> 
> >
> > thanks,
> >
> > greg k-h


Re: [PATCH V4] binder: ipc namespace support for android binder

2018-11-16 Thread Todd Kjos
On Thu, Nov 15, 2018 at 2:54 PM gre...@linuxfoundation.org
 wrote:
...
>
> A number of us have talked about this in the plumbers Android track, and
> a different proposal for how to solve this has been made that should be
> much more resiliant.  So I will drop this patch from my queue and wait
> for the patches based on the discussions we had there.
>
> I think there's some notes/slides on the discussion online somewhere,
> but it hasn't been published as the conference is still happening,
> otherwise I would link to it here...

Here is a link to the session where you can look at the slides [1].
There was consensus that "binderfs" (slide 5) was the most promising
-- but would be behind a config option to provide backwards
compatibility for non-container use-cases.

The etherpad notes are at [2] (look at "Dynamically Allocated Binder
Devices" section)

Christian Brauner will be sending out more details.

-Todd

[1] https://linuxplumbersconf.org/event/2/contributions/222/
[2] https://etherpad.openstack.org/p/Android

>
> thanks,
>
> greg k-h


Re: [PATCH V4] binder: ipc namespace support for android binder

2018-11-16 Thread Todd Kjos
On Thu, Nov 15, 2018 at 2:54 PM gre...@linuxfoundation.org
 wrote:
...
>
> A number of us have talked about this in the plumbers Android track, and
> a different proposal for how to solve this has been made that should be
> much more resiliant.  So I will drop this patch from my queue and wait
> for the patches based on the discussions we had there.
>
> I think there's some notes/slides on the discussion online somewhere,
> but it hasn't been published as the conference is still happening,
> otherwise I would link to it here...

Here is a link to the session where you can look at the slides [1].
There was consensus that "binderfs" (slide 5) was the most promising
-- but would be behind a config option to provide backwards
compatibility for non-container use-cases.

The etherpad notes are at [2] (look at "Dynamically Allocated Binder
Devices" section)

Christian Brauner will be sending out more details.

-Todd

[1] https://linuxplumbersconf.org/event/2/contributions/222/
[2] https://etherpad.openstack.org/p/Android

>
> thanks,
>
> greg k-h


Re: [PATCH V4] binder: ipc namespace support for android binder

2018-11-15 Thread gre...@linuxfoundation.org
On Thu, Nov 15, 2018 at 02:33:49PM -0800, Andrew Morton wrote:
> On Mon, 12 Nov 2018 09:37:51 + chouryzhou(周威)  
> wrote:
> 
> > Currently android's binder is not isolated by ipc namespace. Since binder 
> > is a form of IPC and therefore should be tied to ipc namespace. With this 
> > patch, we can run multiple instances of  android container on one host.
> > 
> > This patch move "binder_procs" and "binder_context" into ipc_namespace,
> > driver will find the context from it when opening. For debugfs, binder_proc
> > is namespace-aware, but not for binder dead nodes, binder_stats and 
> > binder_transaction_log_entry (we added ipc inum to trace it).
> > 
> > ...
> >
> >  drivers/android/binder.c  | 133 
> > --
> >  include/linux/ipc_namespace.h |  15 +
> >  ipc/namespace.c   |  10 +++-
> >  3 files changed, 125 insertions(+), 33 deletions(-)
> 
> Well, it's mainly an android patch so I suggest this be taken via the
> android tree.
> 
> Acked-by: Andrew Morton 
> 

A number of us have talked about this in the plumbers Android track, and
a different proposal for how to solve this has been made that should be
much more resiliant.  So I will drop this patch from my queue and wait
for the patches based on the discussions we had there.

I think there's some notes/slides on the discussion online somewhere,
but it hasn't been published as the conference is still happening,
otherwise I would link to it here...

thanks,

greg k-h


Re: [PATCH V4] binder: ipc namespace support for android binder

2018-11-15 Thread gre...@linuxfoundation.org
On Thu, Nov 15, 2018 at 02:33:49PM -0800, Andrew Morton wrote:
> On Mon, 12 Nov 2018 09:37:51 + chouryzhou(周威)  
> wrote:
> 
> > Currently android's binder is not isolated by ipc namespace. Since binder 
> > is a form of IPC and therefore should be tied to ipc namespace. With this 
> > patch, we can run multiple instances of  android container on one host.
> > 
> > This patch move "binder_procs" and "binder_context" into ipc_namespace,
> > driver will find the context from it when opening. For debugfs, binder_proc
> > is namespace-aware, but not for binder dead nodes, binder_stats and 
> > binder_transaction_log_entry (we added ipc inum to trace it).
> > 
> > ...
> >
> >  drivers/android/binder.c  | 133 
> > --
> >  include/linux/ipc_namespace.h |  15 +
> >  ipc/namespace.c   |  10 +++-
> >  3 files changed, 125 insertions(+), 33 deletions(-)
> 
> Well, it's mainly an android patch so I suggest this be taken via the
> android tree.
> 
> Acked-by: Andrew Morton 
> 

A number of us have talked about this in the plumbers Android track, and
a different proposal for how to solve this has been made that should be
much more resiliant.  So I will drop this patch from my queue and wait
for the patches based on the discussions we had there.

I think there's some notes/slides on the discussion online somewhere,
but it hasn't been published as the conference is still happening,
otherwise I would link to it here...

thanks,

greg k-h


Re: [PATCH V4] binder: ipc namespace support for android binder

2018-11-15 Thread Andrew Morton
On Mon, 12 Nov 2018 09:37:51 + chouryzhou(周威)  
wrote:

> Currently android's binder is not isolated by ipc namespace. Since binder 
> is a form of IPC and therefore should be tied to ipc namespace. With this 
> patch, we can run multiple instances of  android container on one host.
> 
> This patch move "binder_procs" and "binder_context" into ipc_namespace,
> driver will find the context from it when opening. For debugfs, binder_proc
> is namespace-aware, but not for binder dead nodes, binder_stats and 
> binder_transaction_log_entry (we added ipc inum to trace it).
> 
> ...
>
>  drivers/android/binder.c  | 133 
> --
>  include/linux/ipc_namespace.h |  15 +
>  ipc/namespace.c   |  10 +++-
>  3 files changed, 125 insertions(+), 33 deletions(-)

Well, it's mainly an android patch so I suggest this be taken via the
android tree.

Acked-by: Andrew Morton 



Re: [PATCH V4] binder: ipc namespace support for android binder

2018-11-15 Thread Andrew Morton
On Mon, 12 Nov 2018 09:37:51 + chouryzhou(周威)  
wrote:

> Currently android's binder is not isolated by ipc namespace. Since binder 
> is a form of IPC and therefore should be tied to ipc namespace. With this 
> patch, we can run multiple instances of  android container on one host.
> 
> This patch move "binder_procs" and "binder_context" into ipc_namespace,
> driver will find the context from it when opening. For debugfs, binder_proc
> is namespace-aware, but not for binder dead nodes, binder_stats and 
> binder_transaction_log_entry (we added ipc inum to trace it).
> 
> ...
>
>  drivers/android/binder.c  | 133 
> --
>  include/linux/ipc_namespace.h |  15 +
>  ipc/namespace.c   |  10 +++-
>  3 files changed, 125 insertions(+), 33 deletions(-)

Well, it's mainly an android patch so I suggest this be taken via the
android tree.

Acked-by: Andrew Morton 



Re: [PATCH V4] binder: ipc namespace support for android binder

2018-11-13 Thread Todd Kjos
On Tue, Nov 13, 2018 at 12:12 AM chouryzhou(周威)  wrote:
>
> > I have not received an answer to my questions in the last version of this 
> > patch
> > set. Also it would be good if I could be Cc'ed by default. I can't hunt 
> > down all
> > patches.
> > I do not know of any kernel entity, specifically devices, that change 
> > namespaces
> > on open().
> > This seems like an invitation for all kinds of security bugs.
> > A device node belongs to one namespace only which is attached to the
> > underlying kobject. Opening the device should never change that.
> > Please look at how mqueue or shm are doing this. They don't change
> > namespaces on open either.
> > I have to say that is one of the main reasons why I disagree with that 
> > design.
> >
> >
>
> If we must return the same context when every open in proc, we can only 
> isolate
> binder with mnt namespace instead of ipc namespace, what do you think, Todd?

I don't have strong feelings on this -- it seems like a bizarre
use-case to send the fd through a backchannel as christian describes,
but I do agree it is strange behavior (though it seems safe to me
since it prevents communication between unrelated entities). I don't
know how mqueue and shm work, its worth a look since this patch is
modelling their behavior...

We'll talk about it here at LPC and then on this thread.


Re: [PATCH V4] binder: ipc namespace support for android binder

2018-11-13 Thread Todd Kjos
On Tue, Nov 13, 2018 at 12:12 AM chouryzhou(周威)  wrote:
>
> > I have not received an answer to my questions in the last version of this 
> > patch
> > set. Also it would be good if I could be Cc'ed by default. I can't hunt 
> > down all
> > patches.
> > I do not know of any kernel entity, specifically devices, that change 
> > namespaces
> > on open().
> > This seems like an invitation for all kinds of security bugs.
> > A device node belongs to one namespace only which is attached to the
> > underlying kobject. Opening the device should never change that.
> > Please look at how mqueue or shm are doing this. They don't change
> > namespaces on open either.
> > I have to say that is one of the main reasons why I disagree with that 
> > design.
> >
> >
>
> If we must return the same context when every open in proc, we can only 
> isolate
> binder with mnt namespace instead of ipc namespace, what do you think, Todd?

I don't have strong feelings on this -- it seems like a bizarre
use-case to send the fd through a backchannel as christian describes,
but I do agree it is strange behavior (though it seems safe to me
since it prevents communication between unrelated entities). I don't
know how mqueue and shm work, its worth a look since this patch is
modelling their behavior...

We'll talk about it here at LPC and then on this thread.


RE: [PATCH V4] binder: ipc namespace support for android binder

2018-11-13 Thread 周威
> I have not received an answer to my questions in the last version of this 
> patch
> set. Also it would be good if I could be Cc'ed by default. I can't hunt down 
> all
> patches.
> I do not know of any kernel entity, specifically devices, that change 
> namespaces
> on open().
> This seems like an invitation for all kinds of security bugs.
> A device node belongs to one namespace only which is attached to the
> underlying kobject. Opening the device should never change that.
> Please look at how mqueue or shm are doing this. They don't change
> namespaces on open either.
> I have to say that is one of the main reasons why I disagree with that design.
> 
> 

If we must return the same context when every open in proc, we can only isolate
binder with mnt namespace instead of ipc namespace, what do you think, Todd?


RE: [PATCH V4] binder: ipc namespace support for android binder

2018-11-13 Thread 周威
> I have not received an answer to my questions in the last version of this 
> patch
> set. Also it would be good if I could be Cc'ed by default. I can't hunt down 
> all
> patches.
> I do not know of any kernel entity, specifically devices, that change 
> namespaces
> on open().
> This seems like an invitation for all kinds of security bugs.
> A device node belongs to one namespace only which is attached to the
> underlying kobject. Opening the device should never change that.
> Please look at how mqueue or shm are doing this. They don't change
> namespaces on open either.
> I have to say that is one of the main reasons why I disagree with that design.
> 
> 

If we must return the same context when every open in proc, we can only isolate
binder with mnt namespace instead of ipc namespace, what do you think, Todd?


Re: [PATCH V4] binder: ipc namespace support for android binder

2018-11-12 Thread Christian Brauner
On November 12, 2018 8:45:07 AM PST, Todd Kjos  wrote:
>+christ...@brauner.io +Martijn Coenen
>
>Christian,
>
>Does this patch work for your container use-cases? If not, please
>comment on this thread. Let's discuss at LPC this week.

I have not received an answer to my questions in the last version of this patch 
set. Also it would be good if I could be Cc'ed by default. I can't hunt down 
all patches.
I do not know of any kernel entity, specifically devices, that change 
namespaces on open().
This seems like an invitation for all kinds of security bugs.
A device node belongs to one namespace only which is attached to the underlying 
kobject. Opening the device should never change that.
Please look at how mqueue or shm are doing this. They don't change namespaces 
on open either.
I have to say that is one of the main reasons why I disagree with that design.


>
>-Todd
>
>On Mon, Nov 12, 2018 at 1:38 AM chouryzhou(周威) 
>wrote:
>>
>> Currently android's binder is not isolated by ipc namespace. Since
>binder
>> is a form of IPC and therefore should be tied to ipc namespace. With
>this
>> patch, we can run multiple instances of  android container on one
>host.
>>
>> This patch move "binder_procs" and "binder_context" into
>ipc_namespace,
>> driver will find the context from it when opening. For debugfs,
>binder_proc
>> is namespace-aware, but not for binder dead nodes, binder_stats and
>> binder_transaction_log_entry (we added ipc inum to trace it).
>>
>> Signed-off-by: chouryzhou 
>> Reviewed-by: Davidlohr Bueso 
>> ---
>>  drivers/android/binder.c  | 133
>--
>>  include/linux/ipc_namespace.h |  15 +
>>  ipc/namespace.c   |  10 +++-
>>  3 files changed, 125 insertions(+), 33 deletions(-)
>>
>> diff --git a/drivers/android/binder.c b/drivers/android/binder.c
>> index cb30a524d16d..453265505b04 100644
>> --- a/drivers/android/binder.c
>> +++ b/drivers/android/binder.c
>> @@ -67,6 +67,8 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>> +#include 
>>  #include 
>>  #include 
>>  #include 
>> @@ -80,13 +82,21 @@
>>  #include "binder_alloc.h"
>>  #include "binder_trace.h"
>>
>> +
>> +#ifndef CONFIG_IPC_NS
>> +static struct ipc_namespace binder_ipc_ns = {
>> +   .ns.inum = PROC_IPC_INIT_INO,
>> +};
>> +
>> +#define ipcns  (_ipc_ns)
>> +#else
>> +#define ipcns  (current->nsproxy->ipc_ns)
>> +#endif
>> +
>>  static HLIST_HEAD(binder_deferred_list);
>>  static DEFINE_MUTEX(binder_deferred_lock);
>>
>>  static HLIST_HEAD(binder_devices);
>> -static HLIST_HEAD(binder_procs);
>> -static DEFINE_MUTEX(binder_procs_lock);
>> -
>>  static HLIST_HEAD(binder_dead_nodes);
>>  static DEFINE_SPINLOCK(binder_dead_nodes_lock);
>>
>> @@ -233,6 +243,7 @@ struct binder_transaction_log_entry {
>> uint32_t return_error;
>> uint32_t return_error_param;
>> const char *context_name;
>> +   unsigned int ipc_inum;
>>  };
>>  struct binder_transaction_log {
>> atomic_t cur;
>> @@ -263,19 +274,66 @@ static struct binder_transaction_log_entry
>*binder_transaction_log_add(
>>  }
>>
>>  struct binder_context {
>> +   struct hlist_node hlist;
>> struct binder_node *binder_context_mgr_node;
>> struct mutex context_mgr_node_lock;
>>
>> kuid_t binder_context_mgr_uid;
>> +   int   device;
>> const char *name;
>>  };
>>
>>  struct binder_device {
>> struct hlist_node hlist;
>> struct miscdevice miscdev;
>> -   struct binder_context context;
>>  };
>>
>> +void binder_exit_ns(struct ipc_namespace *ns)
>> +{
>> +   struct binder_context *context;
>> +   struct hlist_node *tmp;
>> +
>> +   mutex_destroy(>binder_procs_lock);
>> +   hlist_for_each_entry_safe(context, tmp, >binder_contexts,
>hlist) {
>> +   mutex_destroy(>context_mgr_node_lock);
>> +   hlist_del(>hlist);
>> +   kfree(context);
>> +   }
>> +}
>> +
>> +int binder_init_ns(struct ipc_namespace *ns)
>> +{
>> +   int ret;
>> +   struct binder_device *device;
>> +
>> +   mutex_init(>binder_procs_lock);
>> +   INIT_HLIST_HEAD(>binder_procs);
>> +   INIT_HLIST_HEAD(>binder_contexts);
>> +
>> +   hlist_for_each_entry(device, _devices, hlist) {
>> +   struct binder_context *context;
>> +
>> +   context = kzalloc(sizeof(*context), GFP_KERNEL);
>> +   if (!context) {
>> +   ret = -ENOMEM;
>> +   goto err;
>> +   }
>> +
>> +   context->device = device->miscdev.minor;
>> +   context->name = device->miscdev.name;
>> +   context->binder_context_mgr_uid = INVALID_UID;
>> +   mutex_init(>context_mgr_node_lock);
>> +
>> +   hlist_add_head(>hlist,
>>binder_contexts);
>> +   }
>> +
>> +   return 0;
>> +err:
>> +   binder_exit_ns(ns);
>> +   return ret;
>> +}
>> +
>> +
>>  /**
>>   * struct binder_work - 

Re: [PATCH V4] binder: ipc namespace support for android binder

2018-11-12 Thread Christian Brauner
On November 12, 2018 8:45:07 AM PST, Todd Kjos  wrote:
>+christ...@brauner.io +Martijn Coenen
>
>Christian,
>
>Does this patch work for your container use-cases? If not, please
>comment on this thread. Let's discuss at LPC this week.

I have not received an answer to my questions in the last version of this patch 
set. Also it would be good if I could be Cc'ed by default. I can't hunt down 
all patches.
I do not know of any kernel entity, specifically devices, that change 
namespaces on open().
This seems like an invitation for all kinds of security bugs.
A device node belongs to one namespace only which is attached to the underlying 
kobject. Opening the device should never change that.
Please look at how mqueue or shm are doing this. They don't change namespaces 
on open either.
I have to say that is one of the main reasons why I disagree with that design.


>
>-Todd
>
>On Mon, Nov 12, 2018 at 1:38 AM chouryzhou(周威) 
>wrote:
>>
>> Currently android's binder is not isolated by ipc namespace. Since
>binder
>> is a form of IPC and therefore should be tied to ipc namespace. With
>this
>> patch, we can run multiple instances of  android container on one
>host.
>>
>> This patch move "binder_procs" and "binder_context" into
>ipc_namespace,
>> driver will find the context from it when opening. For debugfs,
>binder_proc
>> is namespace-aware, but not for binder dead nodes, binder_stats and
>> binder_transaction_log_entry (we added ipc inum to trace it).
>>
>> Signed-off-by: chouryzhou 
>> Reviewed-by: Davidlohr Bueso 
>> ---
>>  drivers/android/binder.c  | 133
>--
>>  include/linux/ipc_namespace.h |  15 +
>>  ipc/namespace.c   |  10 +++-
>>  3 files changed, 125 insertions(+), 33 deletions(-)
>>
>> diff --git a/drivers/android/binder.c b/drivers/android/binder.c
>> index cb30a524d16d..453265505b04 100644
>> --- a/drivers/android/binder.c
>> +++ b/drivers/android/binder.c
>> @@ -67,6 +67,8 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>> +#include 
>>  #include 
>>  #include 
>>  #include 
>> @@ -80,13 +82,21 @@
>>  #include "binder_alloc.h"
>>  #include "binder_trace.h"
>>
>> +
>> +#ifndef CONFIG_IPC_NS
>> +static struct ipc_namespace binder_ipc_ns = {
>> +   .ns.inum = PROC_IPC_INIT_INO,
>> +};
>> +
>> +#define ipcns  (_ipc_ns)
>> +#else
>> +#define ipcns  (current->nsproxy->ipc_ns)
>> +#endif
>> +
>>  static HLIST_HEAD(binder_deferred_list);
>>  static DEFINE_MUTEX(binder_deferred_lock);
>>
>>  static HLIST_HEAD(binder_devices);
>> -static HLIST_HEAD(binder_procs);
>> -static DEFINE_MUTEX(binder_procs_lock);
>> -
>>  static HLIST_HEAD(binder_dead_nodes);
>>  static DEFINE_SPINLOCK(binder_dead_nodes_lock);
>>
>> @@ -233,6 +243,7 @@ struct binder_transaction_log_entry {
>> uint32_t return_error;
>> uint32_t return_error_param;
>> const char *context_name;
>> +   unsigned int ipc_inum;
>>  };
>>  struct binder_transaction_log {
>> atomic_t cur;
>> @@ -263,19 +274,66 @@ static struct binder_transaction_log_entry
>*binder_transaction_log_add(
>>  }
>>
>>  struct binder_context {
>> +   struct hlist_node hlist;
>> struct binder_node *binder_context_mgr_node;
>> struct mutex context_mgr_node_lock;
>>
>> kuid_t binder_context_mgr_uid;
>> +   int   device;
>> const char *name;
>>  };
>>
>>  struct binder_device {
>> struct hlist_node hlist;
>> struct miscdevice miscdev;
>> -   struct binder_context context;
>>  };
>>
>> +void binder_exit_ns(struct ipc_namespace *ns)
>> +{
>> +   struct binder_context *context;
>> +   struct hlist_node *tmp;
>> +
>> +   mutex_destroy(>binder_procs_lock);
>> +   hlist_for_each_entry_safe(context, tmp, >binder_contexts,
>hlist) {
>> +   mutex_destroy(>context_mgr_node_lock);
>> +   hlist_del(>hlist);
>> +   kfree(context);
>> +   }
>> +}
>> +
>> +int binder_init_ns(struct ipc_namespace *ns)
>> +{
>> +   int ret;
>> +   struct binder_device *device;
>> +
>> +   mutex_init(>binder_procs_lock);
>> +   INIT_HLIST_HEAD(>binder_procs);
>> +   INIT_HLIST_HEAD(>binder_contexts);
>> +
>> +   hlist_for_each_entry(device, _devices, hlist) {
>> +   struct binder_context *context;
>> +
>> +   context = kzalloc(sizeof(*context), GFP_KERNEL);
>> +   if (!context) {
>> +   ret = -ENOMEM;
>> +   goto err;
>> +   }
>> +
>> +   context->device = device->miscdev.minor;
>> +   context->name = device->miscdev.name;
>> +   context->binder_context_mgr_uid = INVALID_UID;
>> +   mutex_init(>context_mgr_node_lock);
>> +
>> +   hlist_add_head(>hlist,
>>binder_contexts);
>> +   }
>> +
>> +   return 0;
>> +err:
>> +   binder_exit_ns(ns);
>> +   return ret;
>> +}
>> +
>> +
>>  /**
>>   * struct binder_work - 

Re: [PATCH V4] binder: ipc namespace support for android binder

2018-11-12 Thread Todd Kjos
+christ...@brauner.io +Martijn Coenen

Christian,

Does this patch work for your container use-cases? If not, please
comment on this thread. Let's discuss at LPC this week.

-Todd

On Mon, Nov 12, 2018 at 1:38 AM chouryzhou(周威)  wrote:
>
> Currently android's binder is not isolated by ipc namespace. Since binder
> is a form of IPC and therefore should be tied to ipc namespace. With this
> patch, we can run multiple instances of  android container on one host.
>
> This patch move "binder_procs" and "binder_context" into ipc_namespace,
> driver will find the context from it when opening. For debugfs, binder_proc
> is namespace-aware, but not for binder dead nodes, binder_stats and
> binder_transaction_log_entry (we added ipc inum to trace it).
>
> Signed-off-by: chouryzhou 
> Reviewed-by: Davidlohr Bueso 
> ---
>  drivers/android/binder.c  | 133 
> --
>  include/linux/ipc_namespace.h |  15 +
>  ipc/namespace.c   |  10 +++-
>  3 files changed, 125 insertions(+), 33 deletions(-)
>
> diff --git a/drivers/android/binder.c b/drivers/android/binder.c
> index cb30a524d16d..453265505b04 100644
> --- a/drivers/android/binder.c
> +++ b/drivers/android/binder.c
> @@ -67,6 +67,8 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -80,13 +82,21 @@
>  #include "binder_alloc.h"
>  #include "binder_trace.h"
>
> +
> +#ifndef CONFIG_IPC_NS
> +static struct ipc_namespace binder_ipc_ns = {
> +   .ns.inum = PROC_IPC_INIT_INO,
> +};
> +
> +#define ipcns  (_ipc_ns)
> +#else
> +#define ipcns  (current->nsproxy->ipc_ns)
> +#endif
> +
>  static HLIST_HEAD(binder_deferred_list);
>  static DEFINE_MUTEX(binder_deferred_lock);
>
>  static HLIST_HEAD(binder_devices);
> -static HLIST_HEAD(binder_procs);
> -static DEFINE_MUTEX(binder_procs_lock);
> -
>  static HLIST_HEAD(binder_dead_nodes);
>  static DEFINE_SPINLOCK(binder_dead_nodes_lock);
>
> @@ -233,6 +243,7 @@ struct binder_transaction_log_entry {
> uint32_t return_error;
> uint32_t return_error_param;
> const char *context_name;
> +   unsigned int ipc_inum;
>  };
>  struct binder_transaction_log {
> atomic_t cur;
> @@ -263,19 +274,66 @@ static struct binder_transaction_log_entry 
> *binder_transaction_log_add(
>  }
>
>  struct binder_context {
> +   struct hlist_node hlist;
> struct binder_node *binder_context_mgr_node;
> struct mutex context_mgr_node_lock;
>
> kuid_t binder_context_mgr_uid;
> +   int   device;
> const char *name;
>  };
>
>  struct binder_device {
> struct hlist_node hlist;
> struct miscdevice miscdev;
> -   struct binder_context context;
>  };
>
> +void binder_exit_ns(struct ipc_namespace *ns)
> +{
> +   struct binder_context *context;
> +   struct hlist_node *tmp;
> +
> +   mutex_destroy(>binder_procs_lock);
> +   hlist_for_each_entry_safe(context, tmp, >binder_contexts, hlist) {
> +   mutex_destroy(>context_mgr_node_lock);
> +   hlist_del(>hlist);
> +   kfree(context);
> +   }
> +}
> +
> +int binder_init_ns(struct ipc_namespace *ns)
> +{
> +   int ret;
> +   struct binder_device *device;
> +
> +   mutex_init(>binder_procs_lock);
> +   INIT_HLIST_HEAD(>binder_procs);
> +   INIT_HLIST_HEAD(>binder_contexts);
> +
> +   hlist_for_each_entry(device, _devices, hlist) {
> +   struct binder_context *context;
> +
> +   context = kzalloc(sizeof(*context), GFP_KERNEL);
> +   if (!context) {
> +   ret = -ENOMEM;
> +   goto err;
> +   }
> +
> +   context->device = device->miscdev.minor;
> +   context->name = device->miscdev.name;
> +   context->binder_context_mgr_uid = INVALID_UID;
> +   mutex_init(>context_mgr_node_lock);
> +
> +   hlist_add_head(>hlist, >binder_contexts);
> +   }
> +
> +   return 0;
> +err:
> +   binder_exit_ns(ns);
> +   return ret;
> +}
> +
> +
>  /**
>   * struct binder_work - work enqueued on a worklist
>   * @entry: node enqueued on list
> @@ -2728,6 +2786,7 @@ static void binder_transaction(struct binder_proc *proc,
> e->data_size = tr->data_size;
> e->offsets_size = tr->offsets_size;
> e->context_name = proc->context->name;
> +   e->ipc_inum = ipcns->ns.inum;
>
> if (reply) {
> binder_inner_proc_lock(proc);
> @@ -4922,6 +4981,7 @@ static int binder_open(struct inode *nodp, struct file 
> *filp)
>  {
> struct binder_proc *proc;
> struct binder_device *binder_dev;
> +   struct binder_context *context;
>
> binder_debug(BINDER_DEBUG_OPEN_CLOSE, "%s: %d:%d\n", __func__,
>  current->group_leader->pid, current->pid);
> @@ -4937,7 +4997,15 @@ static int binder_open(struct inode *nodp, struct 

Re: [PATCH V4] binder: ipc namespace support for android binder

2018-11-12 Thread Todd Kjos
+christ...@brauner.io +Martijn Coenen

Christian,

Does this patch work for your container use-cases? If not, please
comment on this thread. Let's discuss at LPC this week.

-Todd

On Mon, Nov 12, 2018 at 1:38 AM chouryzhou(周威)  wrote:
>
> Currently android's binder is not isolated by ipc namespace. Since binder
> is a form of IPC and therefore should be tied to ipc namespace. With this
> patch, we can run multiple instances of  android container on one host.
>
> This patch move "binder_procs" and "binder_context" into ipc_namespace,
> driver will find the context from it when opening. For debugfs, binder_proc
> is namespace-aware, but not for binder dead nodes, binder_stats and
> binder_transaction_log_entry (we added ipc inum to trace it).
>
> Signed-off-by: chouryzhou 
> Reviewed-by: Davidlohr Bueso 
> ---
>  drivers/android/binder.c  | 133 
> --
>  include/linux/ipc_namespace.h |  15 +
>  ipc/namespace.c   |  10 +++-
>  3 files changed, 125 insertions(+), 33 deletions(-)
>
> diff --git a/drivers/android/binder.c b/drivers/android/binder.c
> index cb30a524d16d..453265505b04 100644
> --- a/drivers/android/binder.c
> +++ b/drivers/android/binder.c
> @@ -67,6 +67,8 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -80,13 +82,21 @@
>  #include "binder_alloc.h"
>  #include "binder_trace.h"
>
> +
> +#ifndef CONFIG_IPC_NS
> +static struct ipc_namespace binder_ipc_ns = {
> +   .ns.inum = PROC_IPC_INIT_INO,
> +};
> +
> +#define ipcns  (_ipc_ns)
> +#else
> +#define ipcns  (current->nsproxy->ipc_ns)
> +#endif
> +
>  static HLIST_HEAD(binder_deferred_list);
>  static DEFINE_MUTEX(binder_deferred_lock);
>
>  static HLIST_HEAD(binder_devices);
> -static HLIST_HEAD(binder_procs);
> -static DEFINE_MUTEX(binder_procs_lock);
> -
>  static HLIST_HEAD(binder_dead_nodes);
>  static DEFINE_SPINLOCK(binder_dead_nodes_lock);
>
> @@ -233,6 +243,7 @@ struct binder_transaction_log_entry {
> uint32_t return_error;
> uint32_t return_error_param;
> const char *context_name;
> +   unsigned int ipc_inum;
>  };
>  struct binder_transaction_log {
> atomic_t cur;
> @@ -263,19 +274,66 @@ static struct binder_transaction_log_entry 
> *binder_transaction_log_add(
>  }
>
>  struct binder_context {
> +   struct hlist_node hlist;
> struct binder_node *binder_context_mgr_node;
> struct mutex context_mgr_node_lock;
>
> kuid_t binder_context_mgr_uid;
> +   int   device;
> const char *name;
>  };
>
>  struct binder_device {
> struct hlist_node hlist;
> struct miscdevice miscdev;
> -   struct binder_context context;
>  };
>
> +void binder_exit_ns(struct ipc_namespace *ns)
> +{
> +   struct binder_context *context;
> +   struct hlist_node *tmp;
> +
> +   mutex_destroy(>binder_procs_lock);
> +   hlist_for_each_entry_safe(context, tmp, >binder_contexts, hlist) {
> +   mutex_destroy(>context_mgr_node_lock);
> +   hlist_del(>hlist);
> +   kfree(context);
> +   }
> +}
> +
> +int binder_init_ns(struct ipc_namespace *ns)
> +{
> +   int ret;
> +   struct binder_device *device;
> +
> +   mutex_init(>binder_procs_lock);
> +   INIT_HLIST_HEAD(>binder_procs);
> +   INIT_HLIST_HEAD(>binder_contexts);
> +
> +   hlist_for_each_entry(device, _devices, hlist) {
> +   struct binder_context *context;
> +
> +   context = kzalloc(sizeof(*context), GFP_KERNEL);
> +   if (!context) {
> +   ret = -ENOMEM;
> +   goto err;
> +   }
> +
> +   context->device = device->miscdev.minor;
> +   context->name = device->miscdev.name;
> +   context->binder_context_mgr_uid = INVALID_UID;
> +   mutex_init(>context_mgr_node_lock);
> +
> +   hlist_add_head(>hlist, >binder_contexts);
> +   }
> +
> +   return 0;
> +err:
> +   binder_exit_ns(ns);
> +   return ret;
> +}
> +
> +
>  /**
>   * struct binder_work - work enqueued on a worklist
>   * @entry: node enqueued on list
> @@ -2728,6 +2786,7 @@ static void binder_transaction(struct binder_proc *proc,
> e->data_size = tr->data_size;
> e->offsets_size = tr->offsets_size;
> e->context_name = proc->context->name;
> +   e->ipc_inum = ipcns->ns.inum;
>
> if (reply) {
> binder_inner_proc_lock(proc);
> @@ -4922,6 +4981,7 @@ static int binder_open(struct inode *nodp, struct file 
> *filp)
>  {
> struct binder_proc *proc;
> struct binder_device *binder_dev;
> +   struct binder_context *context;
>
> binder_debug(BINDER_DEBUG_OPEN_CLOSE, "%s: %d:%d\n", __func__,
>  current->group_leader->pid, current->pid);
> @@ -4937,7 +4997,15 @@ static int binder_open(struct inode *nodp, struct