Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.

2018-05-02 Thread Oleg Nesterov
On 04/30, Andrey Grodzovsky wrote:
>
> On 04/30/2018 12:00 PM, Oleg Nesterov wrote:
> >On 04/30, Andrey Grodzovsky wrote:
> >>What about changing PF_SIGNALED to PF_EXITING in
> >>drm_sched_entity_do_release
> >>
> >>-   if ((current->fl

Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.

2018-05-01 Thread Oleg Nesterov
On 04/30, Christian König wrote:
>
> Well when the process is killed we don't care about correctness any more, we
> just want to get rid of it as quickly as possible (OOM situation etc...).

OK,

> But it is perfectly possible that a process submits some render commands and
> then calls exit() or

Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.

2018-05-01 Thread Oleg Nesterov
On 04/30, Andrey Grodzovsky wrote:
>
> What about changing PF_SIGNALED to PF_EXITING in
> drm_sched_entity_do_release
>
> -   if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
> +   if ((current->flags & PF_EXITING) && current->exit_code == SIGKILL)

let me repeat,

Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.

2018-04-25 Thread Oleg Nesterov
On 04/25, Andrey Grodzovsky wrote:
>
> here (drm_sched_entity_fini) is also a bad idea, but we still want to be
> able to exit immediately
> and not wait for GPU jobs completion when the reason for reaching this code
> is because of KILL
> signal to the user process who opened the device file.

Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.

2018-04-25 Thread Oleg Nesterov
On 04/25, Daniel Vetter wrote:
>
> On Wed, Apr 25, 2018 at 3:22 PM, Oleg Nesterov <o...@redhat.com> wrote:
> > On 04/24, Daniel Vetter wrote:
> >>
> >> wait_event_killable doesn't check for fatal_signal_pending before calling
> >> schedule, so definitel

Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.

2018-04-25 Thread Oleg Nesterov
On 04/24, Daniel Vetter wrote:
>
> wait_event_killable doesn't check for fatal_signal_pending before calling
> schedule, so definitely has a nice race there.

This is fine. See the signal_pending_state() check in __schedule(). And this doesn't differ from wait_event_interruptible(), it too

Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.

2018-04-25 Thread Oleg Nesterov
On 04/24, Eric W. Biederman wrote:
>
> Let me respectfully suggest that the wait_event_killable on that code
> path is wrong.

I tend to agree even if I don't know this code. But if it can be called from f_op->release() then any usage of "current" or signals looks suspicious. Simply because

Re: [PATCH 1/3] signals: Allow generation of SIGKILL to exiting task.

2018-04-25 Thread Oleg Nesterov
On 04/24, Andrey Grodzovsky wrote:
>
> Currently calling wait_event_killable as part of exiting process
> will stall forever since SIGKILL generation is suppressed by PF_EXITING.

See my reply to 2/3,

> In our particular case AMDGPU driver wants to flush all GPU jobs in
> flight before shutting

Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.

2018-04-25 Thread Oleg Nesterov
On 04/24, Andrey Grodzovsky wrote:
>
> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> @@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
>  return;
>  /**
>   * The client will not