On Sun, Jul 28, 2019 at 6:59 PM Daniel Axtens <d...@axtens.net> wrote:
>
> Currently, when a kernel stack overflow is detected via VMAP_STACK,
> the task is killed with die().
>
> This isn't safe, because we don't know how that process has affected
> kernel state. In particular, we don't know what locks have been taken.
> For example, we can hit a case with lkdtm where a thread takes a
> stack overflow in printk() after taking the logbuf_lock. In that case,
> we deadlock when the kernel next does a printk.
>
> Do not attempt to kill the process when a kernel stack overflow is
> detected. The system state is unknown, the only safe thing to do is
> panic(). (panic() also prints without taking locks so a useful debug
> splat is printed even when logbuf_lock is held.)

The thing I don't like about this is that it reduces the chance that we
successfully log anything to disk.

PeterZ, do you have any useful input here? I wonder if we could do
something like printk_oh_crap() that is just printk() except that it
panics if it fails to return after a few seconds.

--Andy
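
The printk_oh_crap() idea above (a log call that gives up and panics if it
does not return in time) can be illustrated with a rough userspace analogy.
This is only a sketch, not kernel code and not a proposed implementation:
log_or_abort(), log_watchdog_expired() and LOG_TIMEOUT_SECS are hypothetical
names, alarm()/SIGALRM stand in for whatever timer or watchdog the kernel
would actually need, and abort() stands in for panic().

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical timeout; the mail only says "a few seconds". */
#define LOG_TIMEOUT_SECS 3

/*
 * Runs in signal context if the log call has not returned in time.
 * Only async-signal-safe functions (write, abort) are used here.
 */
static void log_watchdog_expired(int sig)
{
	static const char msg[] = "log call stuck, giving up\n";
	ssize_t n;

	(void)sig;
	n = write(STDERR_FILENO, msg, sizeof(msg) - 1);
	(void)n;
	abort();	/* userspace stand-in for panic() */
}

/*
 * Log msg to stderr, but abort the process if the call does not
 * return within LOG_TIMEOUT_SECS. Returns 0 on success, -1 on error.
 */
int log_or_abort(const char *msg)
{
	struct sigaction sa, old;
	int ret;

	assert(msg != NULL);

	sa.sa_handler = log_watchdog_expired;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = 0;
	if (sigaction(SIGALRM, &sa, &old) != 0)
		return -1;

	alarm(LOG_TIMEOUT_SECS);	/* arm the watchdog */
	ret = fputs(msg, stderr);	/* the call that might never return */
	alarm(0);			/* it returned: disarm the watchdog */

	sigaction(SIGALRM, &old, NULL);	/* restore the previous handler */
	return ret < 0 ? -1 : 0;
}
```

In the normal case the log call returns, the alarm is cancelled, and the
caller carries on, so the attempt to log (and, in the kernel case, to get
something to disk) is still made first. Only a call that hangs, for example
on a lock that will never be released, ends in the abort path.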