Re: [RFC] Improve subvolume usability for a normal user

2017-12-07 Thread Hugo Mills
On Thu, Dec 07, 2017 at 07:21:46AM -0500, Austin S. Hemmelgarn wrote:
> On 2017-12-07 06:55, Duncan wrote:
> >Misono, Tomohiro posted on Thu, 07 Dec 2017 16:15:47 +0900 as excerpted:
> >
> >>On 2017/12/07 11:56, Duncan wrote:
> >>>Austin S. Hemmelgarn posted on Wed, 06 Dec 2017 07:39:56 -0500 as
> >>>excerpted:
> >>>
> Somewhat OT, but the only operation that's remotely 'instant' is
> creating an empty subvolume.  Snapshot creation has to walk the tree
> in the subvolume being snapshotted, which can take a long time (and as
> a result of it's implementation, also means BTRFS snapshots are _not_
> atomic). Subvolume deletion has to do a bunch of cleanup work in the
> background (though it may be fairly quick if it was a snapshot and the
> source subvolume hasn't changed much).
> >>>
> >>>Indeed, while btrfs in general has taken a strategy of making
> >>>/creating/ snapshots and subvolumes fast, snapshot deletion in
> >>>particular can take some time[1].
> >>>
> >>>And in that regard a question just occurred to me regarding this whole
> >>>very tough problem of a user being able to create but not delete
> >>>subvolumes and snapshots:
> >>>
> >>>Given that at least snapshot deletion (not so sure about non-snapshot
> >>>subvolume deletion, tho I strongly suspect it would depend on the
> >>>number of cross-subvolume reflinks) is already a task that can take
> >>>some time, why /not/ just bite the bullet and make the behavior much
> >>>more like the directory deletion, given that subvolumes already behave
> >>>much like directories.  Yes, for non-root, that /does/ mean tracing the
> >>>entire subtree and checking permissions, and yes, that's going to take
> >>>time and lower performance somewhat, but subvolume and in particular
> >>>snapshot deletion is already an operation that takes time, so this
> >>>wouldn't be unduly changing the situation, and it would eliminate the
> >>>entire class of security issues that come with either asymmetrically
> >>>restricting deletion (but not creation) to root on the one hand,
> >>
> >>>or possible data loss due to allowing a user to delete a subvolume they
> >>>couldn't delete were it an ordinary directory due to not owning stuff
> >>>further down the tree.
> >>
> >>But, this is also the very reason I'm for "sub del" instead of unlink().
> >>Since snapshot creation won't check the permissions of the containing
> >>files/dirs, it can copy a directory which cannot be deleted by the user.
> >>Therefore if we won't allow "sub del" for the user, he couldn't remove
> >>the snapshot.
> >
> >Maybe snapshot creation /should/ check all that, in ordered to allow
> >permissions to allow deletion.
> >
> >Tho that would unfortunately increase the creation time, and btrfs is
> >currently optimized for fast creation time.
> >
> >Hmm... What about creating a "temporary" snapshot if not root, then
> >walking the tree to check perms and deleting it without ever showing it
> >to userspace if the perms wouldn't let the user delete it.  That would
> >retain fast creation logic, tho it wouldn't show up until the perms walk
> >was completed.
> >
> I would argue that it makes more sense to keep snapshot creation as
> is, keep the subvolume deletion command as is (with some proper
> permissions checks of course), and just make unlink() work for
> subvolumes like it does for directories.

   Definitely this.

   Principle of least surprise.

   Hugo.

-- 
Hugo Mills | ... one ping(1) to rule them all, and in the
hugo@... carfax.org.uk | darkness bind(2) them.
http://carfax.org.uk/  |
PGP: E2AB1DE4  |Illiad


signature.asc
Description: Digital signature


Re: [RFC] Improve subvolume usability for a normal user

2017-12-07 Thread Austin S. Hemmelgarn

On 2017-12-07 06:55, Duncan wrote:

Misono, Tomohiro posted on Thu, 07 Dec 2017 16:15:47 +0900 as excerpted:


On 2017/12/07 11:56, Duncan wrote:

Austin S. Hemmelgarn posted on Wed, 06 Dec 2017 07:39:56 -0500 as
excerpted:


Somewhat OT, but the only operation that's remotely 'instant' is
creating an empty subvolume.  Snapshot creation has to walk the tree
in the subvolume being snapshotted, which can take a long time (and as
a result of it's implementation, also means BTRFS snapshots are _not_
atomic). Subvolume deletion has to do a bunch of cleanup work in the
background (though it may be fairly quick if it was a snapshot and the
source subvolume hasn't changed much).


Indeed, while btrfs in general has taken a strategy of making
/creating/ snapshots and subvolumes fast, snapshot deletion in
particular can take some time[1].

And in that regard a question just occurred to me regarding this whole
very tough problem of a user being able to create but not delete
subvolumes and snapshots:

Given that at least snapshot deletion (not so sure about non-snapshot
subvolume deletion, tho I strongly suspect it would depend on the
number of cross-subvolume reflinks) is already a task that can take
some time, why /not/ just bite the bullet and make the behavior much
more like the directory deletion, given that subvolumes already behave
much like directories.  Yes, for non-root, that /does/ mean tracing the
entire subtree and checking permissions, and yes, that's going to take
time and lower performance somewhat, but subvolume and in particular
snapshot deletion is already an operation that takes time, so this
wouldn't be unduly changing the situation, and it would eliminate the
entire class of security issues that come with either asymmetrically
restricting deletion (but not creation) to root on the one hand,



or possible data loss due to allowing a user to delete a subvolume they
couldn't delete were it an ordinary directory due to not owning stuff
further down the tree.


But, this is also the very reason I'm for "sub del" instead of unlink().
Since snapshot creation won't check the permissions of the containing
files/dirs, it can copy a directory which cannot be deleted by the user.
Therefore if we won't allow "sub del" for the user, he couldn't remove
the snapshot.


Maybe snapshot creation /should/ check all that, in ordered to allow
permissions to allow deletion.

Tho that would unfortunately increase the creation time, and btrfs is
currently optimized for fast creation time.

Hmm... What about creating a "temporary" snapshot if not root, then
walking the tree to check perms and deleting it without ever showing it
to userspace if the perms wouldn't let the user delete it.  That would
retain fast creation logic, tho it wouldn't show up until the perms walk
was completed.

I would argue that it makes more sense to keep snapshot creation as is, 
keep the subvolume deletion command as is (with some proper permissions 
checks of course), and just make unlink() work for subvolumes like it 
does for directories.

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] Improve subvolume usability for a normal user

2017-12-06 Thread Qu Wenruo


On 2017年12月07日 15:32, Marat Khalili wrote:
> On 07/12/17 08:27, Qu Wenruo wrote:
>> When doing snapshot, btrfs only needs to increase reference of 2nd
>> highest level tree blocks of original snapshot, other than "walking the
>> tree".
>> (If tree root level is 2, then level 2 node is copied, while all
>> reference of level 1 tree blocks get increased)
> 
> Out of curiosity, how does it interacts with nocow files? Does every
> write to these files involves backref walk?

For details I need to check the code.
But at least, partial backref walk must be done.

For "partial" I mean it doesn't need to get every root referring to the
file extent, it only needs to check if the extent is only referred by
current inode and offset.

If any other inode (or even the same inode with different offset) is
found, then the walk is finished.
(The same walk we do to determine the FIEMAP_EXTENT_SHARED flag)

In contract, for qgroup and balance, the full backref walk is done,
where btrfs needs every root involved (for qgroup) or even every inode
and offset involved (for balance).

Thanks,
Qu

> 
> -- 
> 
> With Best Regards,
> Marat Khalili



signature.asc
Description: OpenPGP digital signature


Re: [RFC] Improve subvolume usability for a normal user

2017-12-06 Thread Marat Khalili

On 07/12/17 08:27, Qu Wenruo wrote:

When doing snapshot, btrfs only needs to increase reference of 2nd
highest level tree blocks of original snapshot, other than "walking the
tree".
(If tree root level is 2, then level 2 node is copied, while all
reference of level 1 tree blocks get increased)


Out of curiosity, how does it interacts with nocow files? Does every 
write to these files involves backref walk?


--

With Best Regards,
Marat Khalili
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] Improve subvolume usability for a normal user

2017-12-06 Thread Misono, Tomohiro
On 2017/12/07 11:56, Duncan wrote:
> Austin S. Hemmelgarn posted on Wed, 06 Dec 2017 07:39:56 -0500 as
> excerpted:
> 
>> Somewhat OT, but the only operation that's remotely 'instant' is
>> creating an empty subvolume.  Snapshot creation has to walk the tree in
>> the subvolume being snapshotted, which can take a long time (and as a
>> result of it's implementation, also means BTRFS snapshots are _not_
>> atomic). Subvolume deletion has to do a bunch of cleanup work in the
>> background (though it may be fairly quick if it was a snapshot and the
>> source subvolume hasn't changed much).
> 
> Indeed, while btrfs in general has taken a strategy of making /creating/ 
> snapshots and subvolumes fast, snapshot deletion in particular can take 
> some time[1].
> 
> And in that regard a question just occurred to me regarding this whole 
> very tough problem of a user being able to create but not delete 
> subvolumes and snapshots:
> 
> Given that at least snapshot deletion (not so sure about non-snapshot 
> subvolume deletion, tho I strongly suspect it would depend on the number 
> of cross-subvolume reflinks) is already a task that can take some time, 
> why /not/ just bite the bullet and make the behavior much more like the 
> directory deletion, given that subvolumes already behave much like 
> directories.  Yes, for non-root, that /does/ mean tracing the entire 
> subtree and checking permissions, and yes, that's going to take time and 
> lower performance somewhat, but subvolume and in particular snapshot 
> deletion is already an operation that takes time, so this wouldn't be 
> unduly changing the situation, and it would eliminate the entire class of 
> security issues that come with either asymmetrically restricting deletion 
> (but not creation) to root on the one hand,

> or possible data loss due to 
> allowing a user to delete a subvolume they couldn't delete were it an 
> ordinary directory due to not owning stuff further down the tree.

But, this is also the very reason I'm for "sub del" instead of unlink().
Since snapshot creation won't check the permissions of the containing 
files/dirs,
it can copy a directory which cannot be deleted by the user.
Therefore if we won't allow "sub del" for the user, he couldn't remove the 
snapshot.

> 
> ---
> [1] Based on comments and reports here.  My own use-case doesn't involve 
> snapshots/subvolumes.
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] Improve subvolume usability for a normal user

2017-12-06 Thread Qu Wenruo


On 2017年12月06日 20:39, Austin S. Hemmelgarn wrote:
> Somewhat OT, but the only operation that's remotely 'instant' is
> creating an empty subvolume.  Snapshot creation has to walk the tree in
> the subvolume being snapshotted, which can take a long time (and as a
> result of it's implementation, also means BTRFS snapshots are _not_
> atomic).

Nope, this is not true.

Btrfs from the very beginning is designed to speed up snapshot.
The most obvious evidence is its extent tree design.

When doing snapshot, btrfs only needs to increase reference of 2nd
highest level tree blocks of original snapshot, other than "walking the
tree".
(If tree root level is 2, then level 2 node is copied, while all
reference of level 1 tree blocks get increased)

This design hugely speeds up snapshot creation, making snapshot
obviously faster than any block level snapshot.

Although this design obviously slows down backref walk.
As btrfs must do a full walk until tree root, to see if any tree blocks
between leaf and root is shared with another snapshot.


And for btrfs snapshot atomic thing, it's just because btrfs delayed
snapshot creation to transaction commit time, and buffered write is not
triggered for snapshot creation, so it makes snapshot not so atomic.

Thanks,
Qu

>  Subvolume deletion has to do a bunch of cleanup work in the
> background (though it may be fairly quick if it was a snapshot and the
> source subvolume hasn't changed much).



signature.asc
Description: OpenPGP digital signature


Re: [RFC] Improve subvolume usability for a normal user

2017-12-06 Thread Duncan
Austin S. Hemmelgarn posted on Wed, 06 Dec 2017 07:39:56 -0500 as
excerpted:

> Somewhat OT, but the only operation that's remotely 'instant' is
> creating an empty subvolume.  Snapshot creation has to walk the tree in
> the subvolume being snapshotted, which can take a long time (and as a
> result of it's implementation, also means BTRFS snapshots are _not_
> atomic). Subvolume deletion has to do a bunch of cleanup work in the
> background (though it may be fairly quick if it was a snapshot and the
> source subvolume hasn't changed much).

Indeed, while btrfs in general has taken a strategy of making /creating/ 
snapshots and subvolumes fast, snapshot deletion in particular can take 
some time[1].

And in that regard a question just occurred to me regarding this whole 
very tough problem of a user being able to create but not delete 
subvolumes and snapshots:

Given that at least snapshot deletion (not so sure about non-snapshot 
subvolume deletion, tho I strongly suspect it would depend on the number 
of cross-subvolume reflinks) is already a task that can take some time, 
why /not/ just bite the bullet and make the behavior much more like the 
directory deletion, given that subvolumes already behave much like 
directories.  Yes, for non-root, that /does/ mean tracing the entire 
subtree and checking permissions, and yes, that's going to take time and 
lower performance somewhat, but subvolume and in particular snapshot 
deletion is already an operation that takes time, so this wouldn't be 
unduly changing the situation, and it would eliminate the entire class of 
security issues that come with either asymmetrically restricting deletion 
(but not creation) to root on the one hand, or possible data loss due to 
allowing a user to delete a subvolume they couldn't delete were it an 
ordinary directory due to not owning stuff further down the tree.

---
[1] Based on comments and reports here.  My own use-case doesn't involve 
snapshots/subvolumes.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] Improve subvolume usability for a normal user

2017-12-06 Thread Austin S. Hemmelgarn

On 2017-12-05 23:52, Misono, Tomohiro wrote:

On 2017/12/05 21:41, Austin S. Hemmelgarn wrote:

On 2017-12-05 03:43, Qu Wenruo wrote:



On 2017年12月05日 16:25, Misono, Tomohiro wrote:

Hello all,

I want to address some issues of subvolume usability for a normal user.
i.e. a user can create subvolumes, but
   - Cannot delete their own subvolume (by default)
   - Cannot tell subvolumes from directories (in a straightforward way)
   - Cannot check the quota limit when qgroup is enabled

Here I show the initial thoughts and approaches to this problem.
I want to check if this is a right approach or not before I start writing code.

Comments are welcome.
Tomohiro Misono

==
- Goal and current problem
The goal of this RFC is to give a normal user more control to their own 
subvolumes.
Currently the control to subvolumes for a normal user is restricted as below:

+-+--+--+
|   command   | root | user |
+-+--+--+
| sub create  | Y| Y|
| sub snap| Y| Y|
| sub del | Y| N|
| sub list| Y| N|
| sub show| Y| N|
| qgroup show | Y| N|
+-+--+--+

In short, I want to change this as below in order to improve user's usability:

+-+--++
|   command   | root | user   |
+-+--++
| sub create  | Y| Y  |
| sub snap| Y| Y  |
| sub del | Y| N -> Y |
| sub list| Y| N -> Y |
| sub show| Y| N -> Y |
| qgroup show | Y| N -> Y |
+-+--++

In words,
(1) allow deletion of subvolume if a user owns it, and
(2) allow getting subvolume/quota info if a user has read access to it
   (sub list/qgroup show just lists the subvolumes which are readable by the 
user)

I think other commands not listed above (qgroup limit, send/receive etc.) should
be done by root and not be allowed for a normal user.


- Outside the scope of this RFC
There is a qualitative problem to use qgroup for limiting user disk amount;
quota limit can easily be averted by creating a subvolume. I think that forcing
inheriting quota of parent subvolume is a solution, but I won't address nor
discuss this here.


- Proposal
   (1) deletion of subvolume

I want to change the default behavior to allow a user to delete their own
subvolumes.

This is not the same behavior as when user_subvol_rm_alowed mount option is

specified; in that case a user can delete the subvolume to which they have
write+exec right.

Since snapshot creation is already restricted to the subvolume owner, it is

consistent that only the owner of the subvolume (or root) can delete it.

The implementation should be straightforward.


Personally speaking, I prefer to do the complex owner check in user daemon.

And do the privilege in user daemon (call it btrfsd for example).

So btrfs-progs will works in 2 modes, if root calls it, do as it used to do.
If normal user calls it, proxy the request to btrfsd, and btrfsd does
the privilege checking and call ioctl (with root privilege).

Then no impact to kernel, all complex work is done in user space.

Exactly how hard is it to just check ownership of the root inode of a
subvolume from the ioctl context?  You could just as easily push all the
checking out to the VFS layer by taking an open fd for the subvolume
root (and probably implicitly closing it) instead of taking a path, and
that would give you all the benefits of ACL's and whatever security
modules the local system is using.  Alternatively, quit treating
subvolumes specially for `unlink()`, and make them behave like regular
directories (and thus avoid the one possible complication resulting from
home directories being subvolumes but needing to be owned by the users).


Thanks to all for the comments.
Let me explain my stance clearly:

I'm for limiting the subvolume operation to btrfs-progs commands since a
subvolume looks like a regular directory, but is different in many points.
This is the reason I want to fix not only "sub delete" but also "sub show/list" 
etc.
I agree that it needs to behave more sanely, but the approach of 
treating it differently from a directory is the entire cause of this 
issue in the first place.  If the original implementation had just 
treated them as regular directories once created (with the obvious 
caveat regarding read-only snapshots), we wouldn't be having this 
conversation right now, except possibly regarding fetching info about 
subvolumes.


Personally, I would really rather just let unlink() work the same on 
subvolumes as it does with directories, and only need special handling 
for creation and metadata retrieval (with the possible option of 'quick' 
deletion through a btrfs-progs command).


I think "sub delete" should be the counterpart of "sub create" and "sub snap".
Since "sub snap" can be used for the subvolume the user owns, "sub delete"
should be allowed to the suvolume the 

Re: [RFC] Improve subvolume usability for a normal user

2017-12-06 Thread Austin S. Hemmelgarn

On 2017-12-05 17:08, Goffredo Baroncelli wrote:

On 12/05/2017 09:17 PM, Austin S. Hemmelgarn wrote:

On 2017-12-05 14:09, Goffredo Baroncelli wrote:

On 12/05/2017 07:46 PM, Graham Cobb wrote:

On 05/12/17 18:01, Goffredo Baroncelli wrote:

On 12/05/2017 04:42 PM, Graham Cobb wrote:

[]

Then no impact to kernel, all complex work is done in user space.

Exactly how hard is it to just check ownership of the root inode of a
subvolume from the ioctl context?  You could just as easily push all the
checking out to the VFS layer by taking an open fd for the subvolume
root (and probably implicitly closing it) instead of taking a path, and
that would give you all the benefits of ACL's and whatever security
modules the local system is using.


+1 - stop inventing new access control rules for each different action!


A subvolume is like a directory; In all filesystems you cannot remove an inode 
if you don't have write access to the parent directory. I assume that this is a 
POSIX requirement; and if so this should be true even for BTRFS.
This means that in order to remove a subvolume and you are not root, you should 
check all the contained directories. It is not sufficient to check one inode.

In the past I create a patch [1][2] which extended the unlink (2) syscall to 
allow removing a subvolume if:
a) the user is root  or
b.1) if the subvolume is empty and
b.2) the user has the write capability to the parent directory


That is also OK, as it follows a logical and consistent model without
introducing new rules, but it has two disadvantages with snapshots the
user creates and then wants to delete:

i) they have to change the snapshot to readwrite to remove all the
contents -- how many users will know that that can even be done?


I think that this is an orthogonal question.  However the user should use btrfs 
set

Anyway I am not against to use something more specific like "btrfs sub del...", 
where it could be possible to implement a more specific checks. I am only highlight that 
destroy a subvolume without looking to its content could break the POSIX rules

No, it doesn't break POSIX rules, because you can't do it with POSIX defined 
interfaces (not counting ioctls, but ioctls are by definition system dependent).


user$ mkdir -p dir1/dir2
user$ touch dir1/dir2/file
user$ sudo chown foo:foo dir1/dir2

user$ rm -rf dir1/
rm: cannot remove 'dir1/dir2/file1': Permission denied

In a POSIX filesystem, you (as user) cannot remove file1.

But if instead of dir1 you have a subvolume (called sub1), and the check of 
removing the subvolume is performed only on the root of subvolume, you could 
remove it.

Is it a problem ? I think no, but I am not a security expert. But even if POSIX 
doesn't have the concept of subvolume, this is definitely a break of a POSIX 
rule.
Except it isn't a break of a POSIX rule, because deleting the subvolume 
doesn't go through regular POSIX API's (again, ioctl really doesn't 
count because it's designed for random extended stuff).  POSIX makes 
absolutely zero guarantees about the behavior of things not done through 
POSIX API's.  Yes, this doesn't use POSIX style permission checks, but 
it also doesn't violate POSIX standards either (because it's not defined 
in the POSIX standard, and things that aren't defined by the standards 
are explicitly stated to be implementation defined).


The data loss risk this entails is however the reason users can't delete 
subvolumes by default.


On the other side, if the user makes a snapshot of a subvolume containing a not 
owned and not empty directory, for the user it would be impossible to delete 
its snapshot (!).
I would argue that by default they shouldn't be able to make a snapshot 
because:
1. With the default behavior of users not being able to delete 
subvolumes, users can trivially cause a resource exhaustion situation 
that they can't resolve themself.

2. Whitelisting is usually better for security than blacklisting.







ii) if the snapshot contains a file the user cannot remove then the user
cannot remove the snapshot at all (even though they could create it).


It is the same for the other filesystem: if in a filesystem tree, there is a 
directory which is not changeable by the user, the user cannot remove all the 
tree. Again, I am only highlighting that there is a possible break of POSIX 
rules.

POSIX says nothing about non-permissions related reasons for not being able to 
remove a directory.

Nobody has say the opposite
Claiming that it's potentially POSIX incompatible behavior directly 
implies that POSIX has provisions regarding non-permissions related 
reasons for not being able to remove a directory.



   A subvolume is treated (too much in my opinion) like a mount point, 
regardless of whether or not it's explicitly mounted, and that seems to be most 
of why you can't remove it with unlink().




That is why I preferred Austin's first proposal. I suppose your proposal
could be extended:

a) the user is root  or

Re: [RFC] Improve subvolume usability for a normal user

2017-12-05 Thread Misono, Tomohiro
On 2017/12/05 21:41, Austin S. Hemmelgarn wrote:
> On 2017-12-05 03:43, Qu Wenruo wrote:
>>
>>
>> On 2017年12月05日 16:25, Misono, Tomohiro wrote:
>>> Hello all,
>>>
>>> I want to address some issues of subvolume usability for a normal user.
>>> i.e. a user can create subvolumes, but
>>>   - Cannot delete their own subvolume (by default)
>>>   - Cannot tell subvolumes from directories (in a straightforward way)
>>>   - Cannot check the quota limit when qgroup is enabled
>>>
>>> Here I show the initial thoughts and approaches to this problem.
>>> I want to check if this is a right approach or not before I start writing 
>>> code.
>>>
>>> Comments are welcome.
>>> Tomohiro Misono
>>>
>>> ==
>>> - Goal and current problem
>>> The goal of this RFC is to give a normal user more control to their own 
>>> subvolumes.
>>> Currently the control to subvolumes for a normal user is restricted as 
>>> below:
>>>
>>> +-+--+--+
>>> |   command   | root | user |
>>> +-+--+--+
>>> | sub create  | Y| Y|
>>> | sub snap| Y| Y|
>>> | sub del | Y| N|
>>> | sub list| Y| N|
>>> | sub show| Y| N|
>>> | qgroup show | Y| N|
>>> +-+--+--+
>>>
>>> In short, I want to change this as below in order to improve user's 
>>> usability:
>>>
>>> +-+--++
>>> |   command   | root | user   |
>>> +-+--++
>>> | sub create  | Y| Y  |
>>> | sub snap| Y| Y  |
>>> | sub del | Y| N -> Y |
>>> | sub list| Y| N -> Y |
>>> | sub show| Y| N -> Y |
>>> | qgroup show | Y| N -> Y |
>>> +-+--++
>>>
>>> In words,
>>> (1) allow deletion of subvolume if a user owns it, and
>>> (2) allow getting subvolume/quota info if a user has read access to it
>>>   (sub list/qgroup show just lists the subvolumes which are readable by the 
>>> user)
>>>
>>> I think other commands not listed above (qgroup limit, send/receive etc.) 
>>> should
>>> be done by root and not be allowed for a normal user.
>>>
>>>
>>> - Outside the scope of this RFC
>>> There is a qualitative problem to use qgroup for limiting user disk amount;
>>> quota limit can easily be averted by creating a subvolume. I think that 
>>> forcing
>>> inheriting quota of parent subvolume is a solution, but I won't address nor
>>> discuss this here.
>>>
>>>
>>> - Proposal
>>>   (1) deletion of subvolume
>>>
>>>I want to change the default behavior to allow a user to delete their own
>>>subvolumes.
>>>
>>>This is not the same behavior as when user_subvol_rm_alowed mount option 
>>> is
>>>specified; in that case a user can delete the subvolume to which they 
>>> have
>>>write+exec right.
>>>
>>>Since snapshot creation is already restricted to the subvolume owner, it 
>>> is
>>>consistent that only the owner of the subvolume (or root) can delete it.
>>>
>>>The implementation should be straightforward.
>>
>> Personally speaking, I prefer to do the complex owner check in user daemon.
>>
>> And do the privilege in user daemon (call it btrfsd for example).
>>
>> So btrfs-progs will works in 2 modes, if root calls it, do as it used to do.
>> If normal user calls it, proxy the request to btrfsd, and btrfsd does
>> the privilege checking and call ioctl (with root privilege).
>>
>> Then no impact to kernel, all complex work is done in user space.
> Exactly how hard is it to just check ownership of the root inode of a 
> subvolume from the ioctl context?  You could just as easily push all the 
> checking out to the VFS layer by taking an open fd for the subvolume 
> root (and probably implicitly closing it) instead of taking a path, and 
> that would give you all the benefits of ACL's and whatever security 
> modules the local system is using.  Alternatively, quit treating 
> subvolumes specially for `unlink()`, and make them behave like regular 
> directories (and thus avoid the one possible complication resulting from 
> home directories being subvolumes but needing to be owned by the users).

Thanks to all for the comments.
Let me explain my stance clearly:

I'm for limiting the subvolume operation to btrfs-progs commands since a
subvolume looks like a regular directory, but is different in many points.
This is the reason I want to fix not only "sub delete" but also "sub show/list" 
etc.

I think "sub delete" should be the counterpart of "sub create" and "sub snap". 
Since "sub snap" can be used for the subvolume the user owns, "sub delete" 
should be allowed to the suvolume the user owns too. Also, using these
commands enables an instant creation/deletion of a  subvolume/snapshot, which
is one of the features of btrfs.

Regards,
Tomohiro
> 
> 
> This whole 'just make it a daemon' thing that's becoming the standard 
> solution everywhere these days is crap in most cases (note that I'm not 
> just singling out this proposal here).  Just 

Re: [RFC] Improve subvolume usability for a normal user

2017-12-05 Thread Goffredo Baroncelli
On 12/05/2017 09:17 PM, Austin S. Hemmelgarn wrote:
> On 2017-12-05 14:09, Goffredo Baroncelli wrote:
>> On 12/05/2017 07:46 PM, Graham Cobb wrote:
>>> On 05/12/17 18:01, Goffredo Baroncelli wrote:
 On 12/05/2017 04:42 PM, Graham Cobb wrote:
>> []
>>> Then no impact to kernel, all complex work is done in user space.
>> Exactly how hard is it to just check ownership of the root inode of a
>> subvolume from the ioctl context?  You could just as easily push all the
>> checking out to the VFS layer by taking an open fd for the subvolume
>> root (and probably implicitly closing it) instead of taking a path, and
>> that would give you all the benefits of ACL's and whatever security
>> modules the local system is using.
>
> +1 - stop inventing new access control rules for each different action!

 A subvolume is like a directory; In all filesystems you cannot remove an 
 inode if you don't have write access to the parent directory. I assume 
 that this is a POSIX requirement; and if so this should be true even for 
 BTRFS.
 This means that in order to remove a subvolume and you are not root, you 
 should check all the contained directories. It is not sufficient to check 
 one inode.

 In the past I create a patch [1][2] which extended the unlink (2) syscall 
 to allow removing a subvolume if:
 a) the user is root  or
 b.1) if the subvolume is empty and
 b.2) the user has the write capability to the parent directory
>>>
>>> That is also OK, as it follows a logical and consistent model without
>>> introducing new rules, but it has two disadvantages with snapshots the
>>> user creates and then wants to delete:
>>>
>>> i) they have to change the snapshot to readwrite to remove all the
>>> contents -- how many users will know that that can even be done?
>>
>> I think that this is an orthogonal question.  However the user should use 
>> btrfs set
>>
>> Anyway I am not against to use something more specific like "btrfs sub 
>> del...", where it could be possible to implement a more specific checks. I 
>> am only highlight that destroy a subvolume without looking to its content 
>> could break the POSIX rules
> No, it doesn't break POSIX rules, because you can't do it with POSIX defined 
> interfaces (not counting ioctls, but ioctls are by definition system 
> dependent).

user$ mkdir -p dir1/dir2
user$ touch dir1/dir2/file
user$ sudo chown foo:foo dir1/dir2

user$ rm -rf dir1/
rm: cannot remove 'dir1/dir2/file1': Permission denied

In a POSIX filesystem, you (as user) cannot remove file1.

But if instead of dir1 you have a subvolume (called sub1), and the check of 
removing the subvolume is performed only on the root of subvolume, you could 
remove it.

Is it a problem ? I think no, but I am not a security expert. But even if POSIX 
doesn't have the concept of subvolume, this is definitely a break of a POSIX 
rule.

On the other side, if the user makes a snapshot of a subvolume containing a not 
owned and not empty directory, for the user it would be impossible to delete 
its snapshot (!).


>>
>>>
>>> ii) if the snapshot contains a file the user cannot remove then the user
>>> cannot remove the snapshot at all (even though they could create it).
>>
>> It is the same for the other filesystem: if in a filesystem tree, there is a 
>> directory which is not changeable by the user, the user cannot remove all 
>> the tree. Again, I am only highlighting that there is a possible break of 
>> POSIX rules.
> POSIX says nothing about non-permissions related reasons for not being able 
> to remove a directory. 
Nobody has say the opposite

>  A subvolume is treated (too much in my opinion) like a mount point, 
>regardless of whether or not it's explicitly mounted, and that seems to be 
>most of why you can't remove it with unlink().



>>
>>>
>>> That is why I preferred Austin's first proposal. I suppose your proposal
>>> could be extended:
>>>
>>> a) the user is root  or
>>> b.1) if the subvolume is empty and
>>> b.2) the user has the write capability to the parent directory, or
>>> c) the subvolume is readonly
>>
>> If a subvolume is marked read-only, the user has to change to RW via "btrfs 
>> property set"
> I actually do agree with this assessment to a certain extent, but the user 
> has to be able to change the subvolume to be RW to begin with (and that 
> _does_ need an in-kernel permissions check beyond just knowing whether you 
> can access the subvolume).
> -- 
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli 
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo 

Re: [RFC] Improve subvolume usability for a normal user

2017-12-05 Thread Austin S. Hemmelgarn

On 2017-12-05 14:09, Goffredo Baroncelli wrote:

On 12/05/2017 07:46 PM, Graham Cobb wrote:

On 05/12/17 18:01, Goffredo Baroncelli wrote:

On 12/05/2017 04:42 PM, Graham Cobb wrote:

[]

Then no impact to kernel, all complex work is done in user space.

Exactly how hard is it to just check ownership of the root inode of a
subvolume from the ioctl context?  You could just as easily push all the
checking out to the VFS layer by taking an open fd for the subvolume
root (and probably implicitly closing it) instead of taking a path, and
that would give you all the benefits of ACL's and whatever security
modules the local system is using.


+1 - stop inventing new access control rules for each different action!


A subvolume is like a directory; In all filesystems you cannot remove an inode 
if you don't have write access to the parent directory. I assume that this is a 
POSIX requirement; and if so this should be true even for BTRFS.
This means that in order to remove a subvolume and you are not root, you should 
check all the contained directories. It is not sufficient to check one inode.

In the past I create a patch [1][2] which extended the unlink (2) syscall to 
allow removing a subvolume if:
a) the user is root  or
b.1) if the subvolume is empty and
b.2) the user has the write capability to the parent directory


That is also OK, as it follows a logical and consistent model without
introducing new rules, but it has two disadvantages with snapshots the
user creates and then wants to delete:

i) they have to change the snapshot to readwrite to remove all the
contents -- how many users will know that that can even be done?


I think that this is an orthogonal question.  However the user should use btrfs 
set

Anyway I am not against to use something more specific like "btrfs sub del...", 
where it could be possible to implement a more specific checks. I am only highlight that 
destroy a subvolume without looking to its content could break the POSIX rules
No, it doesn't break POSIX rules, because you can't do it with POSIX 
defined interfaces (not counting ioctls, but ioctls are by definition 
system dependent).




ii) if the snapshot contains a file the user cannot remove then the user
cannot remove the snapshot at all (even though they could create it).


It is the same for the other filesystem: if in a filesystem tree, there is a 
directory which is not changeable by the user, the user cannot remove all the 
tree. Again, I am only highlighting that there is a possible break of POSIX 
rules.
POSIX says nothing about non-permissions related reasons for not being 
able to remove a directory.   A subvolume is treated (too much in my 
opinion) like a mount point, regardless of whether or not it's 
explicitly mounted, and that seems to be most of why you can't remove it 
with unlink().




That is why I preferred Austin's first proposal. I suppose your proposal
could be extended:

a) the user is root  or
b.1) if the subvolume is empty and
b.2) the user has the write capability to the parent directory, or
c) the subvolume is readonly


If a subvolume is marked read-only, the user has to change to RW via "btrfs property 
set"
I actually do agree with this assessment to a certain extent, but the 
user has to be able to change the subvolume to be RW to begin with (and 
that _does_ need an in-kernel permissions check beyond just knowing 
whether you can access the subvolume).

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] Improve subvolume usability for a normal user

2017-12-05 Thread Goffredo Baroncelli
On 12/05/2017 07:46 PM, Graham Cobb wrote:
> On 05/12/17 18:01, Goffredo Baroncelli wrote:
>> On 12/05/2017 04:42 PM, Graham Cobb wrote:
[]
> Then no impact to kernel, all complex work is done in user space.
 Exactly how hard is it to just check ownership of the root inode of a
 subvolume from the ioctl context?  You could just as easily push all the
 checking out to the VFS layer by taking an open fd for the subvolume
 root (and probably implicitly closing it) instead of taking a path, and
 that would give you all the benefits of ACL's and whatever security
 modules the local system is using.  
>>>
>>> +1 - stop inventing new access control rules for each different action!
>>
>> A subvolume is like a directory; In all filesystems you cannot remove an 
>> inode if you don't have write access to the parent directory. I assume that 
>> this is a POSIX requirement; and if so this should be true even for BTRFS.
>> This means that in order to remove a subvolume and you are not root, you 
>> should check all the contained directories. It is not sufficient to check 
>> one inode.
>>
>> In the past I create a patch [1][2] which extended the unlink (2) syscall to 
>> allow removing a subvolume if:
>> a) the user is root  or
>> b.1) if the subvolume is empty and
>> b.2) the user has the write capability to the parent directory
> 
> That is also OK, as it follows a logical and consistent model without
> introducing new rules, but it has two disadvantages with snapshots the
> user creates and then wants to delete:
> 
> i) they have to change the snapshot to readwrite to remove all the
> contents -- how many users will know that that can even be done?

I think that this is an orthogonal question.  However the user should use btrfs 
set

Anyway I am not against to use something more specific like "btrfs sub del...", 
where it could be possible to implement a more specific checks. I am only 
highlight that destroy a subvolume without looking to its content could break 
the POSIX rules

> 
> ii) if the snapshot contains a file the user cannot remove then the user
> cannot remove the snapshot at all (even though they could create it).

It is the same for the other filesystem: if in a filesystem tree, there is a 
directory which is not changeable by the user, the user cannot remove all the 
tree. Again, I am only highlighting that there is a possible break of POSIX 
rules.

> 
> That is why I preferred Austin's first proposal. I suppose your proposal
> could be extended:
> 
> a) the user is root  or
> b.1) if the subvolume is empty and
> b.2) the user has the write capability to the parent directory, or
> c) the subvolume is readonly

If a subvolume is marked read-only, the user has to change to RW via "btrfs 
property set"

> 
> However it is done, I think it is important that all normal VFS access
> rules (such as ACLs) are followed for the subvolume itself.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli 
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] Improve subvolume usability for a normal user

2017-12-05 Thread Graham Cobb
On 05/12/17 18:01, Goffredo Baroncelli wrote:
> On 12/05/2017 04:42 PM, Graham Cobb wrote:
>> On 05/12/17 12:41, Austin S. Hemmelgarn wrote:
>>> On 2017-12-05 03:43, Qu Wenruo wrote:


 On 2017年12月05日 16:25, Misono, Tomohiro wrote:
> Hello all,
>
> I want to address some issues of subvolume usability for a normal user.
> i.e. a user can create subvolumes, but
>   - Cannot delete their own subvolume (by default)
>   - Cannot tell subvolumes from directories (in a straightforward way)
>   - Cannot check the quota limit when qgroup is enabled
>
> Here I show the initial thoughts and approaches to this problem.
> I want to check if this is a right approach or not before I start
> writing code.
>
> Comments are welcome.
> Tomohiro Misono
>
> ==
> - Goal and current problem
> The goal of this RFC is to give a normal user more control to their
> own subvolumes.
> Currently the control to subvolumes for a normal user is restricted
> as below:
>
> +-+--+--+
> |   command   | root | user |
> +-+--+--+
> | sub create  | Y    | Y    |
> | sub snap    | Y    | Y    |
> | sub del | Y    | N    |
> | sub list    | Y    | N    |
> | sub show    | Y    | N    |
> | qgroup show | Y    | N    |
> +-+--+--+
>
> In short, I want to change this as below in order to improve user's
> usability:
>
> +-+--++
> |   command   | root | user   |
> +-+--++
> | sub create  | Y    | Y  |
> | sub snap    | Y    | Y  |
> | sub del | Y    | N -> Y |
> | sub list    | Y    | N -> Y |
> | sub show    | Y    | N -> Y |
> | qgroup show | Y    | N -> Y |
> +-+--++
>
> In words,
> (1) allow deletion of subvolume if a user owns it, and
> (2) allow getting subvolume/quota info if a user has read access to it
>   (sub list/qgroup show just lists the subvolumes which are readable
> by the user)
>
> I think other commands not listed above (qgroup limit, send/receive
> etc.) should
> be done by root and not be allowed for a normal user.
>
>
> - Outside the scope of this RFC
> There is a qualitative problem to use qgroup for limiting user disk
> amount;
> quota limit can easily be averted by creating a subvolume. I think
> that forcing
> inheriting quota of parent subvolume is a solution, but I won't
> address nor
> discuss this here.
>
>
> - Proposal
>   (1) deletion of subvolume
>
>    I want to change the default behavior to allow a user to delete
> their own
>    subvolumes.
>       This is not the same behavior as when user_subvol_rm_alowed
> mount option is
>    specified; in that case a user can delete the subvolume to which
> they have
>    write+exec right.
>       Since snapshot creation is already restricted to the subvolume
> owner, it is
>    consistent that only the owner of the subvolume (or root) can
> delete it.
>       The implementation should be straightforward.

 Personally speaking, I prefer to do the complex owner check in user
 daemon.

 And do the privilege in user daemon (call it btrfsd for example).

 So btrfs-progs will works in 2 modes, if root calls it, do as it used
 to do.
 If normal user calls it, proxy the request to btrfsd, and btrfsd does
 the privilege checking and call ioctl (with root privilege).

 Then no impact to kernel, all complex work is done in user space.
>>> Exactly how hard is it to just check ownership of the root inode of a
>>> subvolume from the ioctl context?  You could just as easily push all the
>>> checking out to the VFS layer by taking an open fd for the subvolume
>>> root (and probably implicitly closing it) instead of taking a path, and
>>> that would give you all the benefits of ACL's and whatever security
>>> modules the local system is using.  
>>
>> +1 - stop inventing new access control rules for each different action!
> 
> A subvolume is like a directory; In all filesystems you cannot remove an 
> inode if you don't have write access to the parent directory. I assume that 
> this is a POSIX requirement; and if so this should be true even for BTRFS.
> This means that in order to remove a subvolume and you are not root, you 
> should check all the contained directories. It is not sufficient to check one 
> inode.
> 
> In the past I create a patch [1][2] which extended the unlink (2) syscall to 
> allow removing a subvolume if:
> a) the user is root  or
> b.1) if the subvolume is empty and
> b.2) the user has the write capability to the parent directory

That is also OK, as it follows a logical and consistent model without
introducing new rules, but 

Re: [RFC] Improve subvolume usability for a normal user

2017-12-05 Thread Goffredo Baroncelli
On 12/05/2017 09:25 AM, Misono, Tomohiro wrote:
> Hello all,
> 
> I want to address some issues of subvolume usability for a normal user.
> i.e. a user can create subvolumes, but
>  - Cannot delete their own subvolume (by default)
>  - Cannot tell subvolumes from directories (in a straightforward way)
>  - Cannot check the quota limit when qgroup is enabled
> 
> Here I show the initial thoughts and approaches to this problem.
> I want to check if this is a right approach or not before I start writing 
> code.
> 
> Comments are welcome.
> Tomohiro Misono
> 
> ==
> - Goal and current problem
> The goal of this RFC is to give a normal user more control to their own 
> subvolumes.
> Currently the control to subvolumes for a normal user is restricted as below: 
> 
> +-+--+--+
> |   command   | root | user |
> +-+--+--+
> | sub create  | Y| Y|
> | sub snap| Y| Y|
> | sub del | Y| N|
> | sub list| Y| N|
> | sub show| Y| N|
> | qgroup show | Y| N|
> +-+--+--+
> 

I suggest to consider also 

   btrfs fi usage

see my patches [1] and [2]

> In short, I want to change this as below in order to improve user's usability:
> 
> +-+--++
> |   command   | root | user   |
> +-+--++
> | sub create  | Y| Y  |
> | sub snap| Y| Y  |
> | sub del | Y| N -> Y |
> | sub list| Y| N -> Y |
> | sub show| Y| N -> Y |
> | qgroup show | Y| N -> Y |
> +-+--++
> 
> In words,
> (1) allow deletion of subvolume if a user owns it, and 
I commented this in another part of this thread. My suggestion is to extend 
unlink(2) to remove a subvolume if it is empty, and if the user has needed 
privileges. This would be fully POSIX compliant.

[]

> 
>  (2) getting subvolume/quota info
> 
>   TREE_SEARCH ioctl is used to retrieve the subvolume/quota info by 
> btrfs-progs
>   (sub show/list, qgroup show etc.). This requires the root permission.
>   
>   The easiest way to allow a user to get subvolume/quota info is just relaxing
>   the permission of TREE_SEARCH. However, since all the tree information (inc.
>   file name) will be exposed, this poses a sequrity risk and is not 
> acceptable.
>   
>   So, I want to introduce 2 ioctls to get subvolume/quota info respectively 
> for
>   a normal user, which return only the info readable by the user:
>   
>[subvolume info]
> Mainly search ROOT tree for ROOT_ITEM/ROOT_BACKREF, but check its read
>right by searching FS/FILE tree and comparing it with caller's uid.

Is it POSIX compliant ? Currently I can't access an entry if its parent 
directory doesn't have the 'x' bit enabled.
>From another point of view, a subvolume is more like a "mount --bind". And I 
>can know which and where devices are mounted looking at /proc/self/mountinfo.

I fear that we should build the path at the kernel level, then check (again at 
the kernel level) if all path elements are accessible by the user, and then 
returning the full path to the user. I am incline to think that accessing 
element by element (on the basis of the parent directory access bit), would be 
a nightmare from posix compliance POV.

A more simple approach would be traversing all the filesystem tree (via 
opendir/readdir) and return all the idem with inode == 256 (something like 
'find / -inum 256'). Even tough it would be very inefficient


>   
>[quota info]
> Same as above, but mainly search QUOTA tree and only returns level 0 info.
>   
>   Also, in order to construct subvolume path, permission of INO_LOOKUP ioctl
>   needs to be relaxed for the user who has read access to the subvolume.
> 
> 
> - Summary of Proposal
>  - Change the default behavior to allow a user to delete their own subvolumes
>  - Add 2 new non-root ioctls to get subvolume/quota info for accessible 
> subvolumes
> ==
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

[1] https://www.spinics.net/lists/linux-btrfs/msg69564.html
[2] https://www.spinics.net/lists/linux-btrfs/msg69566.html

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli 
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] Improve subvolume usability for a normal user

2017-12-05 Thread Goffredo Baroncelli
On 12/05/2017 04:42 PM, Graham Cobb wrote:
> On 05/12/17 12:41, Austin S. Hemmelgarn wrote:
>> On 2017-12-05 03:43, Qu Wenruo wrote:
>>>
>>>
>>> On 2017年12月05日 16:25, Misono, Tomohiro wrote:
 Hello all,

 I want to address some issues of subvolume usability for a normal user.
 i.e. a user can create subvolumes, but
   - Cannot delete their own subvolume (by default)
   - Cannot tell subvolumes from directories (in a straightforward way)
   - Cannot check the quota limit when qgroup is enabled

 Here I show the initial thoughts and approaches to this problem.
 I want to check if this is a right approach or not before I start
 writing code.

 Comments are welcome.
 Tomohiro Misono

 ==
 - Goal and current problem
 The goal of this RFC is to give a normal user more control to their
 own subvolumes.
 Currently the control to subvolumes for a normal user is restricted
 as below:

 +-+--+--+
 |   command   | root | user |
 +-+--+--+
 | sub create  | Y    | Y    |
 | sub snap    | Y    | Y    |
 | sub del | Y    | N    |
 | sub list    | Y    | N    |
 | sub show    | Y    | N    |
 | qgroup show | Y    | N    |
 +-+--+--+

 In short, I want to change this as below in order to improve user's
 usability:

 +-+--++
 |   command   | root | user   |
 +-+--++
 | sub create  | Y    | Y  |
 | sub snap    | Y    | Y  |
 | sub del | Y    | N -> Y |
 | sub list    | Y    | N -> Y |
 | sub show    | Y    | N -> Y |
 | qgroup show | Y    | N -> Y |
 +-+--++

 In words,
 (1) allow deletion of subvolume if a user owns it, and
 (2) allow getting subvolume/quota info if a user has read access to it
   (sub list/qgroup show just lists the subvolumes which are readable
 by the user)

 I think other commands not listed above (qgroup limit, send/receive
 etc.) should
 be done by root and not be allowed for a normal user.


 - Outside the scope of this RFC
 There is a qualitative problem to use qgroup for limiting user disk
 amount;
 quota limit can easily be averted by creating a subvolume. I think
 that forcing
 inheriting quota of parent subvolume is a solution, but I won't
 address nor
 discuss this here.


 - Proposal
   (1) deletion of subvolume

    I want to change the default behavior to allow a user to delete
 their own
    subvolumes.
       This is not the same behavior as when user_subvol_rm_alowed
 mount option is
    specified; in that case a user can delete the subvolume to which
 they have
    write+exec right.
       Since snapshot creation is already restricted to the subvolume
 owner, it is
    consistent that only the owner of the subvolume (or root) can
 delete it.
       The implementation should be straightforward.
>>>
>>> Personally speaking, I prefer to do the complex owner check in user
>>> daemon.
>>>
>>> And do the privilege in user daemon (call it btrfsd for example).
>>>
>>> So btrfs-progs will works in 2 modes, if root calls it, do as it used
>>> to do.
>>> If normal user calls it, proxy the request to btrfsd, and btrfsd does
>>> the privilege checking and call ioctl (with root privilege).
>>>
>>> Then no impact to kernel, all complex work is done in user space.
>> Exactly how hard is it to just check ownership of the root inode of a
>> subvolume from the ioctl context?  You could just as easily push all the
>> checking out to the VFS layer by taking an open fd for the subvolume
>> root (and probably implicitly closing it) instead of taking a path, and
>> that would give you all the benefits of ACL's and whatever security
>> modules the local system is using.  
> 
> +1 - stop inventing new access control rules for each different action!

A subvolume is like a directory; In all filesystems you cannot remove an inode 
if you don't have write access to the parent directory. I assume that this is a 
POSIX requirement; and if so this should be true even for BTRFS.
This means that in order to remove a subvolume and you are not root, you should 
check all the contained directories. It is not sufficient to check one inode.

In the past I create a patch [1][2] which extended the unlink (2) syscall to 
allow removing a subvolume if:
a) the user is root  or
b.1) if the subvolume is empty and
b.2) the user has the write capability to the parent directory

This will solve all the problems:
a) removing a subvolume is like removing a directory; no different check is 
performed; no new (and strange) rule
b) an ordinary user can remove a subvolume, like a directory
c) is quite easy to implement

Root still have another way (more 

Re: [RFC] Improve subvolume usability for a normal user

2017-12-05 Thread Graham Cobb
On 05/12/17 12:41, Austin S. Hemmelgarn wrote:
> On 2017-12-05 03:43, Qu Wenruo wrote:
>>
>>
>> On 2017年12月05日 16:25, Misono, Tomohiro wrote:
>>> Hello all,
>>>
>>> I want to address some issues of subvolume usability for a normal user.
>>> i.e. a user can create subvolumes, but
>>>   - Cannot delete their own subvolume (by default)
>>>   - Cannot tell subvolumes from directories (in a straightforward way)
>>>   - Cannot check the quota limit when qgroup is enabled
>>>
>>> Here I show the initial thoughts and approaches to this problem.
>>> I want to check if this is a right approach or not before I start
>>> writing code.
>>>
>>> Comments are welcome.
>>> Tomohiro Misono
>>>
>>> ==
>>> - Goal and current problem
>>> The goal of this RFC is to give a normal user more control to their
>>> own subvolumes.
>>> Currently the control to subvolumes for a normal user is restricted
>>> as below:
>>>
>>> +-+--+--+
>>> |   command   | root | user |
>>> +-+--+--+
>>> | sub create  | Y    | Y    |
>>> | sub snap    | Y    | Y    |
>>> | sub del | Y    | N    |
>>> | sub list    | Y    | N    |
>>> | sub show    | Y    | N    |
>>> | qgroup show | Y    | N    |
>>> +-+--+--+
>>>
>>> In short, I want to change this as below in order to improve user's
>>> usability:
>>>
>>> +-+--++
>>> |   command   | root | user   |
>>> +-+--++
>>> | sub create  | Y    | Y  |
>>> | sub snap    | Y    | Y  |
>>> | sub del | Y    | N -> Y |
>>> | sub list    | Y    | N -> Y |
>>> | sub show    | Y    | N -> Y |
>>> | qgroup show | Y    | N -> Y |
>>> +-+--++
>>>
>>> In words,
>>> (1) allow deletion of subvolume if a user owns it, and
>>> (2) allow getting subvolume/quota info if a user has read access to it
>>>   (sub list/qgroup show just lists the subvolumes which are readable
>>> by the user)
>>>
>>> I think other commands not listed above (qgroup limit, send/receive
>>> etc.) should
>>> be done by root and not be allowed for a normal user.
>>>
>>>
>>> - Outside the scope of this RFC
>>> There is a qualitative problem to use qgroup for limiting user disk
>>> amount;
>>> quota limit can easily be averted by creating a subvolume. I think
>>> that forcing
>>> inheriting quota of parent subvolume is a solution, but I won't
>>> address nor
>>> discuss this here.
>>>
>>>
>>> - Proposal
>>>   (1) deletion of subvolume
>>>
>>>    I want to change the default behavior to allow a user to delete
>>> their own
>>>    subvolumes.
>>>       This is not the same behavior as when user_subvol_rm_alowed
>>> mount option is
>>>    specified; in that case a user can delete the subvolume to which
>>> they have
>>>    write+exec right.
>>>       Since snapshot creation is already restricted to the subvolume
>>> owner, it is
>>>    consistent that only the owner of the subvolume (or root) can
>>> delete it.
>>>       The implementation should be straightforward.
>>
>> Personally speaking, I prefer to do the complex owner check in user
>> daemon.
>>
>> And do the privilege in user daemon (call it btrfsd for example).
>>
>> So btrfs-progs will works in 2 modes, if root calls it, do as it used
>> to do.
>> If normal user calls it, proxy the request to btrfsd, and btrfsd does
>> the privilege checking and call ioctl (with root privilege).
>>
>> Then no impact to kernel, all complex work is done in user space.
> Exactly how hard is it to just check ownership of the root inode of a
> subvolume from the ioctl context?  You could just as easily push all the
> checking out to the VFS layer by taking an open fd for the subvolume
> root (and probably implicitly closing it) instead of taking a path, and
> that would give you all the benefits of ACL's and whatever security
> modules the local system is using.  

+1 - stop inventing new access control rules for each different action!


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] Improve subvolume usability for a normal user

2017-12-05 Thread Austin S. Hemmelgarn

On 2017-12-05 03:43, Qu Wenruo wrote:



On 2017年12月05日 16:25, Misono, Tomohiro wrote:

Hello all,

I want to address some issues of subvolume usability for a normal user.
i.e. a user can create subvolumes, but
  - Cannot delete their own subvolume (by default)
  - Cannot tell subvolumes from directories (in a straightforward way)
  - Cannot check the quota limit when qgroup is enabled

Here I show the initial thoughts and approaches to this problem.
I want to check if this is a right approach or not before I start writing code.

Comments are welcome.
Tomohiro Misono

==
- Goal and current problem
The goal of this RFC is to give a normal user more control to their own 
subvolumes.
Currently the control to subvolumes for a normal user is restricted as below:

+-+--+--+
|   command   | root | user |
+-+--+--+
| sub create  | Y| Y|
| sub snap| Y| Y|
| sub del | Y| N|
| sub list| Y| N|
| sub show| Y| N|
| qgroup show | Y| N|
+-+--+--+

In short, I want to change this as below in order to improve user's usability:

+-+--++
|   command   | root | user   |
+-+--++
| sub create  | Y| Y  |
| sub snap| Y| Y  |
| sub del | Y| N -> Y |
| sub list| Y| N -> Y |
| sub show| Y| N -> Y |
| qgroup show | Y| N -> Y |
+-+--++

In words,
(1) allow deletion of subvolume if a user owns it, and
(2) allow getting subvolume/quota info if a user has read access to it
  (sub list/qgroup show just lists the subvolumes which are readable by the 
user)

I think other commands not listed above (qgroup limit, send/receive etc.) should
be done by root and not be allowed for a normal user.


- Outside the scope of this RFC
There is a qualitative problem to use qgroup for limiting user disk amount;
quota limit can easily be averted by creating a subvolume. I think that forcing
inheriting quota of parent subvolume is a solution, but I won't address nor
discuss this here.


- Proposal
  (1) deletion of subvolume

   I want to change the default behavior to allow a user to delete their own
   subvolumes.
   
   This is not the same behavior as when user_subvol_rm_alowed mount option is

   specified; in that case a user can delete the subvolume to which they have
   write+exec right.
   
   Since snapshot creation is already restricted to the subvolume owner, it is

   consistent that only the owner of the subvolume (or root) can delete it.
   
   The implementation should be straightforward.


Personally speaking, I prefer to do the complex owner check in user daemon.

And do the privilege in user daemon (call it btrfsd for example).

So btrfs-progs will works in 2 modes, if root calls it, do as it used to do.
If normal user calls it, proxy the request to btrfsd, and btrfsd does
the privilege checking and call ioctl (with root privilege).

Then no impact to kernel, all complex work is done in user space.
Exactly how hard is it to just check ownership of the root inode of a 
subvolume from the ioctl context?  You could just as easily push all the 
checking out to the VFS layer by taking an open fd for the subvolume 
root (and probably implicitly closing it) instead of taking a path, and 
that would give you all the benefits of ACL's and whatever security 
modules the local system is using.  Alternatively, quit treating 
subvolumes specially for `unlink()`, and make them behave like regular 
directories (and thus avoid the one possible complication resulting from 
home directories being subvolumes but needing to be owned by the users).



This whole 'just make it a daemon' thing that's becoming the standard 
solution everywhere these days is crap in most cases (note that I'm not 
just singling out this proposal here).  Just because systemd does it 
doesn't make it a good idea, especially when it makes things like this 
far more opaque than they really need to be, which in turn makes it a 
pain in the arse to debug _anything_ to do with permissions beyond basic 
file access.



In almost all cases, the desired check is to allow root, possibly admin 
users who have filesystem management permissions (usually exactly one 
group), and the owner.  That is trivial to implement with a single ACL 
entry and basic filesystem permissions without needing some special 
daemon to check them provided we route things through the VFS layer 
somehow.  In the rare cases that that's not what's wanted, then the 
system is almost certainly using an LSM, and the checks can be made there.




  (2) getting subvolume/quota info

   TREE_SEARCH ioctl is used to retrieve the subvolume/quota info by btrfs-progs
   (sub show/list, qgroup show etc.). This requires the root permission.
   
   The easiest way to allow a user to get subvolume/quota info is just relaxing

   the permission of TREE_SEARCH. However, since all the 

Re: [RFC] Improve subvolume usability for a normal user

2017-12-05 Thread Qu Wenruo


On 2017年12月05日 16:25, Misono, Tomohiro wrote:
> Hello all,
> 
> I want to address some issues of subvolume usability for a normal user.
> i.e. a user can create subvolumes, but
>  - Cannot delete their own subvolume (by default)
>  - Cannot tell subvolumes from directories (in a straightforward way)
>  - Cannot check the quota limit when qgroup is enabled
> 
> Here I show the initial thoughts and approaches to this problem.
> I want to check if this is a right approach or not before I start writing 
> code.
> 
> Comments are welcome.
> Tomohiro Misono
> 
> ==
> - Goal and current problem
> The goal of this RFC is to give a normal user more control to their own 
> subvolumes.
> Currently the control to subvolumes for a normal user is restricted as below: 
> 
> +-+--+--+
> |   command   | root | user |
> +-+--+--+
> | sub create  | Y| Y|
> | sub snap| Y| Y|
> | sub del | Y| N|
> | sub list| Y| N|
> | sub show| Y| N|
> | qgroup show | Y| N|
> +-+--+--+
> 
> In short, I want to change this as below in order to improve user's usability:
> 
> +-+--++
> |   command   | root | user   |
> +-+--++
> | sub create  | Y| Y  |
> | sub snap| Y| Y  |
> | sub del | Y| N -> Y |
> | sub list| Y| N -> Y |
> | sub show| Y| N -> Y |
> | qgroup show | Y| N -> Y |
> +-+--++
> 
> In words,
> (1) allow deletion of subvolume if a user owns it, and 
> (2) allow getting subvolume/quota info if a user has read access to it
>  (sub list/qgroup show just lists the subvolumes which are readable by the 
> user)
> 
> I think other commands not listed above (qgroup limit, send/receive etc.) 
> should
> be done by root and not be allowed for a normal user.
> 
> 
> - Outside the scope of this RFC
> There is a qualitative problem to use qgroup for limiting user disk amount;
> quota limit can easily be averted by creating a subvolume. I think that 
> forcing
> inheriting quota of parent subvolume is a solution, but I won't address nor
> discuss this here. 
> 
> 
> - Proposal
>  (1) deletion of subvolume
> 
>   I want to change the default behavior to allow a user to delete their own
>   subvolumes.
>   
>   This is not the same behavior as when user_subvol_rm_alowed mount option is
>   specified; in that case a user can delete the subvolume to which they have
>   write+exec right.
>   
>   Since snapshot creation is already restricted to the subvolume owner, it is
>   consistent that only the owner of the subvolume (or root) can delete it.
>   
>   The implementation should be straightforward.

Personally speaking, I prefer to do the complex owner check in user daemon.

And do the privilege in user daemon (call it btrfsd for example).

So btrfs-progs will works in 2 modes, if root calls it, do as it used to do.
If normal user calls it, proxy the request to btrfsd, and btrfsd does
the privilege checking and call ioctl (with root privilege).

Then no impact to kernel, all complex work is done in user space.

> 
>  (2) getting subvolume/quota info
> 
>   TREE_SEARCH ioctl is used to retrieve the subvolume/quota info by 
> btrfs-progs
>   (sub show/list, qgroup show etc.). This requires the root permission.
>   
>   The easiest way to allow a user to get subvolume/quota info is just relaxing
>   the permission of TREE_SEARCH. However, since all the tree information (inc.
>   file name) will be exposed, this poses a sequrity risk and is not 
> acceptable.
>   
>   So, I want to introduce 2 ioctls to get subvolume/quota info respectively 
> for
>   a normal user, which return only the info readable by the user:
>   
>[subvolume info]
> Mainly search ROOT tree for ROOT_ITEM/ROOT_BACKREF, but check its read
>right by searching FS/FILE tree and comparing it with caller's uid.
>   
>[quota info]
> Same as above, but mainly search QUOTA tree and only returns level 0 info.
>   
>   Also, in order to construct subvolume path, permission of INO_LOOKUP ioctl
>   needs to be relaxed for the user who has read access to the subvolume.

Same method can apply to this either.

Thanks,
Qu
> 
> 
> - Summary of Proposal
>  - Change the default behavior to allow a user to delete their own subvolumes
>  - Add 2 new non-root ioctls to get subvolume/quota info for accessible 
> subvolumes
> ==
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 



signature.asc
Description: OpenPGP digital signature