Re: btrfs and containers

Austin S. Hemmelgarn Thu, 10 Mar 2016 09:04:48 -0800

On 2016-03-09 21:55, Duncan wrote:

Austin S. Hemmelgarn posted on Wed, 09 Mar 2016 07:15:36 -0500 as
excerpted:

On 2016-03-08 16:28, Chris Murphy wrote:

Yes, it's a bit peculiar I can create subvolumes and snapshot them, but
can't 'btrfs sub list/show'

It's an open question why the user needs a subvolume, but I'm not
thinking of a human user necessarily but rather some service, maybe
it's httpd. Or maybe with the xdg-app stuff the Gnome folks are working
on it makes sense to encapsulate applications and their updates in
their own subvolume. *shrug*  I'm open to the idea that the use case
needs to be more compelling and detailed in order to get the
implementation right.

It's probably worth tossing out there that I use them on a regular basis
as a normal user (not root or some service) for:
1. Local copies of VCS repositories.
2. Build directories.
3. Staging areas for a variety of things.
4. Specifically isolating certain parts of my home directory from
backups.

1-3 are mostly because of the fact that deleting a subvolume is insanely
fast compared to recursive deletion of a directory, although 4 is
somewhat significant for those as well.
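
For anyone wanting to try the throwaway-build-directory pattern from
items 1-3, here is a minimal sketch in Python. It only assembles the
`btrfs subvolume create` and `btrfs subvolume delete` command lines and
prints them unless told to execute. The helper names and the example path
are hypothetical, and deleting as a normal user will only work if the
filesystem is mounted in a way that permits it.

```python
import shlex
import subprocess


def build_subvolume_cmds(path):
    """Return the (create, delete) argv lists for a throwaway subvolume."""
    create = ["btrfs", "subvolume", "create", path]
    delete = ["btrfs", "subvolume", "delete", path]
    return create, delete


def run(argv, dry_run=True):
    """Print the command line; execute it only when dry_run is False."""
    line = shlex.join(argv)
    print(line)
    if not dry_run:
        subprocess.run(argv, check=True)
    return line


if __name__ == "__main__":
    # Hypothetical build directory; it must live on a btrfs filesystem.
    create, delete = build_subvolume_cmds("/home/user/build/llvm")
    run(create)  # make the build area as its own subvolume
    run(delete)  # drop it in one operation instead of a recursive rm
```

Calling `run(create, dry_run=False)` would actually invoke btrfs; the
default only prints, so the sketch is safe to try anywhere.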


For #2 and possibly #3, depending on what's being staged and why, tmpfs
works well, and deleting should be even faster (AFAIK, subvolume deletion
returns immediately but the work continues in the background, so if
you're running other IO-bound jobs they'll still be affected even tho the
subvolume deletion command has returned... if it's all in memory as is
tmpfs, that problem's eliminated too), tho of course you need enough
memory so that tmpfs doesn't trigger swap-thrashing.

Yeah, most of the time I use subvolumes for item 2 or 3 it's either
dealing with stuff that I specifically want persistent across reboots
(for example, the build directory I keep in /usr/src for the kernel, or
staging directories for audio recordings), or things that are big enough
I really want to avoid the memory consumption from working on tmpfs (as
of right now, the only package I have installed on any of my systems
that fits this is LLVM/clang, I used to do this for some other software
like LibreOffice, webkit-gtk, and icedtea as well though).


But #1 and #4 of course don't work as well on tmpfs as you'll likely want
them around longer, and all four cases definitely make use of the
fact that nested subvolumes wall off snapshotting and thus btrfs send,
for backup purposes.  And of course if you're on a limited-memory machine
and thus can't easily use tmpfs for building and other staging, and don't
need to care about the ongoing background IO, using subvolumes for #2 and
#3 remains useful, as well.

In general I can see them being useful for any number of things from a
service perspective, although I feel that snapshots are likely more
useful there (the ability to atomically save the state of a set of files
is extremely useful for a lot of things).
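
As a rough sketch of the snapshot case for a service, the Python below
builds the command line for a timestamped read-only snapshot using
`btrfs subvolume snapshot -r`. The function name, the paths, and the
timestamp format are assumptions made for illustration; none of them
come from this thread.

```python
from datetime import datetime


def snapshot_cmd(source, snap_dir, now=None):
    """Return argv for a read-only snapshot of `source` under `snap_dir`.

    The snapshot is named after the current time so that successive
    snapshots do not collide (a naming scheme assumed for this sketch).
    """
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d-%H%M%S")
    dest = f"{snap_dir.rstrip('/')}/{stamp}"
    # -r makes the snapshot read-only, which btrfs send also requires.
    return ["btrfs", "subvolume", "snapshot", "-r", source, dest]


if __name__ == "__main__":
    # Hypothetical service data subvolume and snapshot directory.
    print(" ".join(snapshot_cmd("/srv/www", "/srv/.snapshots")))
```

The command itself still has to be run by a user allowed to snapshot
that subvolume; the sketch only shows the shape of the call.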


I consider the current situation somewhat of a security (DoS) issue,
since users (or runaway scripts or malware) can create unlimited
subvolumes as an ordinary user, with that user then not being able to
delete them, requiring admin intervention to do so.  Of course as long as
it's a single-human-user with an admin-rights alter-ego login, it's not
/that/ much of a security issue, but I could see it being one for human
users who do not have that admin-rights alter-ego login.  So were I to be
running in such a situation, I'd probably use the mount option to let the
users delete their own subvolumes, unless of course that opens up other
security issues I'm not aware of.
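
The mount option is not named above; assuming it is
`user_subvol_rm_allowed`, the following Python sketch checks whether a
given btrfs mount point carries it by parsing text in the format of
/proc/self/mounts. The function name and the example mount point are
hypothetical, and the check assumes the option is reported in the mount
options field.

```python
def user_delete_allowed(mounts_text, mountpoint):
    """Return True if `mountpoint` is btrfs with user_subvol_rm_allowed.

    `mounts_text` is expected in /proc/self/mounts format:
    device, mount point, fs type, comma-separated options, dump, pass.
    """
    allowed = False
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, target, fstype, options = fields[:4]
        if target == mountpoint and fstype == "btrfs":
            # Later entries override earlier ones for the same mount point.
            allowed = "user_subvol_rm_allowed" in options.split(",")
    return allowed


if __name__ == "__main__":
    with open("/proc/self/mounts") as f:
        # "/home" is only an example mount point.
        print(user_delete_allowed(f.read(), "/home"))
```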

IMO before btrfs can really be considered stable, this possible DoS needs
resolved by making the list/delete set the exact same as the create set,
either by giving users some way to deal with (only) their own subvolumes
just as they can their own directories, or by reserving subvolume
creation to superuser, because that's what's needed for listing and
deletion.  Because if not, I fear someone's going to take advantage of it
in some way, perhaps, as with many DoS vulns, using it to deny critical
resources as a way to simplify some other more critical attack, and it'll
be in the headlines as an attack that worked and a zero-day that still
works.

The part that makes this tricky is that the list ioctl can be considered
a potential information leak (as evidenced by the issue that started
this thread), so IMHO what really needs to happen is for the mount
option to be 'user_subvolume_ops', and control all three operations (or
better yet, do something with ACLs in the btrfs xattr namespace to
control it on a per-subvolume basis).


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
