Re: [PATCH 0/5] Make /sys/class/net per net namespace objects belong to container
Dmitry Torokhovwrites: > On Mon, Aug 29, 2016 at 5:38 AM, Eric W. Biederman > wrote: >> David Miller writes: >> >>> From: Dmitry Torokhov >>> Date: Tue, 16 Aug 2016 15:33:10 -0700 >>> There are objects in /sys hierarchy (/sys/class/net/) that logically belong to a namespace/container. Unfortunately all sysfs objects start their life belonging to global root, and while we could change ownership manually, keeping tracks of all objects that come and go is cumbersome. It would be better if kernel created them using correct uid/gid from the beginning. This series changes kernfs to allow creating object's with arbitrary uid/gid, adds get_ownership() callback to ktype structure so subsystems could supply their own logic (likely tied to namespace support) for determining ownership of kobjects, and adjusts sysfs code to make use of this information. Lastly net-sysfs is adjusted to make sure that objects in net namespace are owned by the root user from the owning user namespace. Note that we do not adjust ownership of objects moved into a new namespace (as when moving a network device into a container) as userspace can easily do it. >>> >>> I need some domain experts to review this series please. >> >> I just came back from vacation and I will aim to take a look shortly. >> >> The big picture idea seems sensible. Having a better ownship of sysfs >> files that are part of a network namespace. I will have to look at the >> details to see if the implementation is similarly sensible. > > Eric, > > Did you find anything objectionable in the series or should I fix up > the !CONFIG_SYSFS error in networking patch and resubmit? Thank you for the ping, I put this patchset down and forgot to look back. The notion of a get_ownership call seems sensible. At some level I am not a fan of setting the uids and gids on the sysfs nodes as that requires allocation of an additional data structure and it will increase the code of sysfs nodes. Certainly I don't think we should incur that cost if we are not using user namespaces. sysfs nodes can be expensive data wise because we sometimes have so many of them. So skipping the setattr when uid == GLOBAL_ROOT_UID and gid == GLOBAL_ROOT_GID seems very desirable. Perhaps that is just an optimization in setattr, but it should be somewhere. I would very much prefer it if we can find a way not to touch all of the layers, in the stack. As I recall it is the code in drivers/base/core.c that creates the attributes. So my gut feel says we want to export a sysfs_setattr modeled after sysfs_chmod from sysfs.h and then just have the driver core level perform the setattr calls for non-default uids and gids. Symlinks we don't need to worry about changing their ownership they are globally read, write, execute. As long as the chattr happens before the uevent is triggered the code should be essentially race free in dealing with userspace. I think that will lead to a simpler more comprehensible and more maintainable implementation. Hooking in where or near where the namespace bits hook in seems excessively complicated (although there may be a good reason for it) that I am forgetting. Eric
Re: [PATCH 0/5] Make /sys/class/net per net namespace objects belong to container
On Mon, Aug 29, 2016 at 5:38 AM, Eric W. Biedermanwrote: > David Miller writes: > >> From: Dmitry Torokhov >> Date: Tue, 16 Aug 2016 15:33:10 -0700 >> >>> There are objects in /sys hierarchy (/sys/class/net/) that logically belong >>> to a namespace/container. Unfortunately all sysfs objects start their life >>> belonging to global root, and while we could change ownership manually, >>> keeping tracks of all objects that come and go is cumbersome. It would >>> be better if kernel created them using correct uid/gid from the beginning. >>> >>> This series changes kernfs to allow creating object's with arbitrary >>> uid/gid, adds get_ownership() callback to ktype structure so subsystems >>> could supply their own logic (likely tied to namespace support) for >>> determining ownership of kobjects, and adjusts sysfs code to make use of >>> this information. Lastly net-sysfs is adjusted to make sure that objects in >>> net namespace are owned by the root user from the owning user namespace. >>> >>> Note that we do not adjust ownership of objects moved into a new namespace >>> (as when moving a network device into a container) as userspace can easily >>> do it. >> >> I need some domain experts to review this series please. > > I just came back from vacation and I will aim to take a look shortly. > > The big picture idea seems sensible. Having a better ownship of sysfs > files that are part of a network namespace. I will have to look at the > details to see if the implementation is similarly sensible. Eric, Did you find anything objectionable in the series or should I fix up the !CONFIG_SYSFS error in networking patch and resubmit? Thanks. -- Dmitry
Re: [PATCH 0/5] Make /sys/class/net per net namespace objects belong to container
David Millerwrites: > From: Dmitry Torokhov > Date: Tue, 16 Aug 2016 15:33:10 -0700 > >> There are objects in /sys hierarchy (/sys/class/net/) that logically belong >> to a namespace/container. Unfortunately all sysfs objects start their life >> belonging to global root, and while we could change ownership manually, >> keeping tracks of all objects that come and go is cumbersome. It would >> be better if kernel created them using correct uid/gid from the beginning. >> >> This series changes kernfs to allow creating object's with arbitrary >> uid/gid, adds get_ownership() callback to ktype structure so subsystems >> could supply their own logic (likely tied to namespace support) for >> determining ownership of kobjects, and adjusts sysfs code to make use of >> this information. Lastly net-sysfs is adjusted to make sure that objects in >> net namespace are owned by the root user from the owning user namespace. >> >> Note that we do not adjust ownership of objects moved into a new namespace >> (as when moving a network device into a container) as userspace can easily >> do it. > > I need some domain experts to review this series please. I just came back from vacation and I will aim to take a look shortly. The big picture idea seems sensible. Having a better ownship of sysfs files that are part of a network namespace. I will have to look at the details to see if the implementation is similarly sensible. Eric
Re: [PATCH 0/5] Make /sys/class/net per net namespace objects belong to container
On Sun, Aug 21, 2016 at 11:41:39PM -0700, David Miller wrote: > From: Dmitry Torokhov> Date: Tue, 16 Aug 2016 15:33:10 -0700 > > > There are objects in /sys hierarchy (/sys/class/net/) that logically belong > > to a namespace/container. Unfortunately all sysfs objects start their life > > belonging to global root, and while we could change ownership manually, > > keeping tracks of all objects that come and go is cumbersome. It would > > be better if kernel created them using correct uid/gid from the beginning. > > > > This series changes kernfs to allow creating object's with arbitrary > > uid/gid, adds get_ownership() callback to ktype structure so subsystems > > could supply their own logic (likely tied to namespace support) for > > determining ownership of kobjects, and adjusts sysfs code to make use of > > this information. Lastly net-sysfs is adjusted to make sure that objects in > > net namespace are owned by the root user from the owning user namespace. > > > > Note that we do not adjust ownership of objects moved into a new namespace > > (as when moving a network device into a container) as userspace can easily > > do it. > > As shown by the kbuild robot, this fails to build when CONFIG_SYSFS > it disabled. Sorry about that, I haven't noticed that all netdev queue stuff is protected by #ifdef CONFIG_SYSFS. I'll move the definition out and resubmit in a few days (maybe Eric and Tejun will have some comments for be by then as well...). Thanks. -- Dmitry
Re: [PATCH 0/5] Make /sys/class/net per net namespace objects belong to container
From: Dmitry TorokhovDate: Tue, 16 Aug 2016 15:33:10 -0700 > There are objects in /sys hierarchy (/sys/class/net/) that logically belong > to a namespace/container. Unfortunately all sysfs objects start their life > belonging to global root, and while we could change ownership manually, > keeping tracks of all objects that come and go is cumbersome. It would > be better if kernel created them using correct uid/gid from the beginning. > > This series changes kernfs to allow creating object's with arbitrary > uid/gid, adds get_ownership() callback to ktype structure so subsystems > could supply their own logic (likely tied to namespace support) for > determining ownership of kobjects, and adjusts sysfs code to make use of > this information. Lastly net-sysfs is adjusted to make sure that objects in > net namespace are owned by the root user from the owning user namespace. > > Note that we do not adjust ownership of objects moved into a new namespace > (as when moving a network device into a container) as userspace can easily > do it. As shown by the kbuild robot, this fails to build when CONFIG_SYSFS it disabled.
Re: [PATCH 0/5] Make /sys/class/net per net namespace objects belong to container
From: Dmitry TorokhovDate: Tue, 16 Aug 2016 15:33:10 -0700 > There are objects in /sys hierarchy (/sys/class/net/) that logically belong > to a namespace/container. Unfortunately all sysfs objects start their life > belonging to global root, and while we could change ownership manually, > keeping tracks of all objects that come and go is cumbersome. It would > be better if kernel created them using correct uid/gid from the beginning. > > This series changes kernfs to allow creating object's with arbitrary > uid/gid, adds get_ownership() callback to ktype structure so subsystems > could supply their own logic (likely tied to namespace support) for > determining ownership of kobjects, and adjusts sysfs code to make use of > this information. Lastly net-sysfs is adjusted to make sure that objects in > net namespace are owned by the root user from the owning user namespace. > > Note that we do not adjust ownership of objects moved into a new namespace > (as when moving a network device into a container) as userspace can easily > do it. I need some domain experts to review this series please. Thank you.
[PATCH 0/5] Make /sys/class/net per net namespace objects belong to container
There are objects in /sys hierarchy (/sys/class/net/) that logically belong to a namespace/container. Unfortunately all sysfs objects start their life belonging to global root, and while we could change ownership manually, keeping tracks of all objects that come and go is cumbersome. It would be better if kernel created them using correct uid/gid from the beginning. This series changes kernfs to allow creating object's with arbitrary uid/gid, adds get_ownership() callback to ktype structure so subsystems could supply their own logic (likely tied to namespace support) for determining ownership of kobjects, and adjusts sysfs code to make use of this information. Lastly net-sysfs is adjusted to make sure that objects in net namespace are owned by the root user from the owning user namespace. Note that we do not adjust ownership of objects moved into a new namespace (as when moving a network device into a container) as userspace can easily do it. Thanks! Dmitry Torokhov (5): kernfs: allow creating kernfs objects with arbitrary uid/gid sysfs, kobject: allow creating kobject belonging to arbitrary users kobject: kset_create_and_add() - fetch ownership info from parent driver core: set up ownership of class devices in sysfs net-sysfs: make sure objects belong to contrainer's owner drivers/base/core.c | 9 + fs/kernfs/dir.c | 29 ++--- fs/kernfs/file.c| 8 ++-- fs/kernfs/inode.c | 2 +- fs/kernfs/kernfs-internal.h | 2 ++ fs/kernfs/symlink.c | 11 ++- fs/sysfs/dir.c | 7 ++- fs/sysfs/file.c | 33 + fs/sysfs/group.c| 23 +-- fs/sysfs/sysfs.h| 3 ++- include/linux/device.h | 5 + include/linux/kernfs.h | 28 +++- include/linux/kobject.h | 4 kernel/cgroup.c | 4 +++- lib/kobject.c | 28 +++- net/core/net-sysfs.c| 44 +++- 16 files changed, 201 insertions(+), 39 deletions(-) -- 2.8.0.rc3.226.g39d4020