Hello.

I have been spending my time on a thorny issue caused by the behavior of
the Linux kernel's memory allocator. In short, a local unprivileged user
can trivially lock up a Linux system unless memory cgroups are used
appropriately. A very, very long summary (an index of LWN.net articles) is at
http://marc.info/?l=linux-kernel&m=143239201905479 .

When I was working at NTT Open Source Software Center, I was mainly in charge
of troubles caused by Linux kernels, especially kernel panics, unexpected
reboots and hangs. Since this issue popped up, I have suspected that some of
those unexpected hangs were caused by it. (Unfortunately, I was not in time
to tell customers how to collect debugging information before they pressed
the reset button on their servers. Thus, I have no evidence that they were
actually hitting this issue.)

Currently, Michal Hocko is trying to reduce the possibility of hitting this
issue by allowing memory allocation requests without the __GFP_FS flag to
fail. But this approach needs to be tested very carefully, because a cleanup
patch which unintentionally allowed memory allocation requests without the
__GFP_FS flag to fail resulted in unstable systems. To me, Michal's approach
will come too late for customers to apply fixes for unexpected fallout,
because they want to use a specific kernel version for as long as possible.
(Well, that is part of why I was in charge of kernel panics and unexpected
reboots.) I think that introducing a proactive countermeasure (like "use
in-kernel access restriction mechanisms such as SELinux") is an approach
which customers can choose before they decide which kernel version to use.
(That's the abovementioned patch.)

Speaking of memory allocation requests without the __GFP_FS flag, in-kernel
access restriction mechanisms (including TOMOYO/AKARI/CaitSith) use them.
This means that if we go with Michal's approach, access requests from user
space will start failing with an ENOMEM error when memory is tight. It is
unfortunate if access requests by critical processes fail because of an
inconsequential process's memory consumption (whereas
/proc/$pid/oom_score_adj can protect critical processes from inconsequential
processes). Isolating all processes into appropriate memory cgroups would be
something like restricting all processes with SELinux, which is not easy
(impossible for most systems).
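
For readers who have not used /proc/$pid/oom_score_adj, the following is a
minimal user-space sketch of how a value could be written to such a file.
The helper name write_oom_score_adj() is hypothetical (it is not part of any
library), and the path is passed in as a parameter so that the sketch can be
tried against an ordinary file. Note that this only influences which process
the OOM killer selects; it does nothing about allocation requests failing
with ENOMEM.

```c
#include <assert.h>
#include <stdio.h>

/* Valid range of values accepted by /proc/<pid>/oom_score_adj. */
#define OOM_SCORE_ADJ_MIN (-1000)
#define OOM_SCORE_ADJ_MAX 1000

/*
 * Hypothetical helper (an assumption for illustration only):
 * write `value` to the oom_score_adj-style file at `path`.
 * Returns 0 on success, -1 on an out-of-range value or an I/O error.
 */
static int write_oom_score_adj(const char *path, int value)
{
    FILE *fp;
    int ok;

    assert(path != NULL);

    /* Reject values outside the documented range before touching the file. */
    if (value < OOM_SCORE_ADJ_MIN || value > OOM_SCORE_ADJ_MAX)
        return -1;

    fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    /* The kernel parses a decimal integer; a trailing newline is harmless. */
    ok = fprintf(fp, "%d\n", value) > 0;

    /* Write errors may only surface when the stream is flushed on close. */
    if (fclose(fp) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

For example, a critical daemon could call
write_oom_score_adj("/proc/self/oom_score_adj", -1000) to exempt itself from
the OOM killer. Lowering the value normally requires CAP_SYS_RESOURCE, while
raising it does not.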

I prefer fixing callers (adding __GFP_NORETRY to callers) step by step after
adding a proactive countermeasure, rather than changing the default behavior
(implicitly applying __GFP_NORETRY inside the allocator).

I have no idea how this story is going to end...

_______________________________________________
tomoyo-users-en mailing list
tomoyo-users-en@lists.osdn.me
http://lists.osdn.me/mailman/listinfo/tomoyo-users-en
