`MFD_NOEXEC_SEAL` should remove the executable bits and set `F_SEAL_EXEC`
to prevent further modifications to the executable bits as per the comment
in the uapi header file:

  not executable and sealed to prevent changing to executable

However, commit 105ff5339f498a ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC")
that introduced this feature made it so that `MFD_NOEXEC_SEAL` unsets
`F_SEAL_SEAL`, essentially acting as a superset of `MFD_ALLOW_SEALING`.

Nothing implies that it should be so, and indeed up until the second version
of the of the patchset[0] that introduced `MFD_EXEC` and `MFD_NOEXEC_SEAL`,
`F_SEAL_SEAL` was not removed, however, it was changed in the third revision
of the patchset[1] without a clear explanation.

This behaviour is surprising for application developers, there is no
documentation that would reveal that `MFD_NOEXEC_SEAL` has the additional
effect of `MFD_ALLOW_SEALING`. Additionally, combined with `vm.memfd_noexec=2`
it has the effect of making all memfds initially sealable.

So do not remove `F_SEAL_SEAL` when `MFD_NOEXEC_SEAL` is requested,
thereby returning to the pre-Linux 6.3 behaviour of only allowing
sealing when `MFD_ALLOW_SEALING` is specified.

Now, this is technically a uapi break. However, the damage is expected
to be minimal. To trigger user visible change, a program has to do the
following steps:

 - create memfd:
   - with `MFD_NOEXEC_SEAL`,
   - without `MFD_ALLOW_SEALING`;
 - try to add seals / check the seals.

But that seems unlikely to happen intentionally since this change
essentially reverts the kernel's behaviour to that of Linux <6.3,
so if a program worked correctly on those older kernels, it will
likely work correctly after this change.

I have used Debian Code Search and GitHub to try to find potential
breakages, and I could only find a single one. dbus-broker's
memfd_create() wrapper is aware of this implicit `MFD_ALLOW_SEALING`
behaviour, and tries to work around it[2]. This workaround will
break. Luckily, this only affects the test suite, it does not affect
the normal operations of dbus-broker. There is a PR with a fix[3].

I also carried out a smoke test by building a kernel with this change
and booting an Arch Linux system into GNOME and Plasma sessions.

There was also a previous attempt to address this peculiarity by
introducing a new flag[4].

[0]: https://lore.kernel.org/lkml/[email protected]/
[1]: https://lore.kernel.org/lkml/[email protected]/
[2]: 
https://github.com/bus1/dbus-broker/blob/9eb0b7e5826fc76cad7b025bc46f267d4a8784cb/src/util/misc.c#L114
[3]: https://github.com/bus1/dbus-broker/pull/366
[4]: https://lore.kernel.org/lkml/[email protected]/

Cc: [email protected]
Signed-off-by: Barnabás Pőcze <[email protected]>
---

* v3: 
https://lore.kernel.org/linux-mm/[email protected]/
* v2: 
https://lore.kernel.org/linux-mm/[email protected]/
* v1: 
https://lore.kernel.org/linux-mm/[email protected]/

This fourth version returns to removing the inconsistency as opposed to 
documenting
its existence, with the same code change as v1 but with a somewhat extended 
commit
message. This is sent because I believe it is worth at least a try; it can be 
easily
reverted if bigger application breakages are discovered than initially imagined.

---
 mm/memfd.c                                 | 9 ++++-----
 tools/testing/selftests/memfd/memfd_test.c | 2 +-
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/mm/memfd.c b/mm/memfd.c
index 7d8d3ab3fa37..8b7f6afee21d 100644
--- a/mm/memfd.c
+++ b/mm/memfd.c
@@ -356,12 +356,11 @@ SYSCALL_DEFINE2(memfd_create,
 
                inode->i_mode &= ~0111;
                file_seals = memfd_file_seals_ptr(file);
-               if (file_seals) {
-                       *file_seals &= ~F_SEAL_SEAL;
+               if (file_seals)
                        *file_seals |= F_SEAL_EXEC;
-               }
-       } else if (flags & MFD_ALLOW_SEALING) {
-               /* MFD_EXEC and MFD_ALLOW_SEALING are set */
+       }
+
+       if (flags & MFD_ALLOW_SEALING) {
                file_seals = memfd_file_seals_ptr(file);
                if (file_seals)
                        *file_seals &= ~F_SEAL_SEAL;
diff --git a/tools/testing/selftests/memfd/memfd_test.c 
b/tools/testing/selftests/memfd/memfd_test.c
index 95af2d78fd31..7b78329f65b6 100644
--- a/tools/testing/selftests/memfd/memfd_test.c
+++ b/tools/testing/selftests/memfd/memfd_test.c
@@ -1151,7 +1151,7 @@ static void test_noexec_seal(void)
                            mfd_def_size,
                            MFD_CLOEXEC | MFD_NOEXEC_SEAL);
        mfd_assert_mode(fd, 0666);
-       mfd_assert_has_seals(fd, F_SEAL_EXEC);
+       mfd_assert_has_seals(fd, F_SEAL_SEAL | F_SEAL_EXEC);
        mfd_fail_chmod(fd, 0777);
        close(fd);
 }
-- 
2.45.2



Reply via email to