On 04/30, Andrey Grodzovsky wrote:
>
> On 04/30/2018 12:00 PM, Oleg Nesterov wrote:
> >On 04/30, Andrey Grodzovsky wrote:
> >>What about changing PF_SIGNALED to PF_EXITING in
> >>drm_sched_entity_do_release
> >>
> >>- if ((current->fl
On 04/30, Christian König wrote:
>
> Well when the process is killed we don't care about correctness any more, we
> just want to get rid of it as quickly as possible (OOM situation etc...).
OK,
> But it is perfectly possible that a process submits some render commands and
> then calls exit() or
On 04/30, Andrey Grodzovsky wrote:
>
> What about changing PF_SIGNALED to PF_EXITING in
> drm_sched_entity_do_release
>
> - if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
> + if ((current->flags & PF_EXITING) && current->exit_code == SIGKILL)
let me repeat,
On 04/25, Andrey Grodzovsky wrote:
>
> here (drm_sched_entity_fini) is also a bad idea, but we still want to be
> able to exit immediately and not wait for GPU jobs completion when the
> reason for reaching this code is a KILL signal to the user process that
> opened the device file.
On 04/25, Daniel Vetter wrote:
>
> On Wed, Apr 25, 2018 at 3:22 PM, Oleg Nesterov <o...@redhat.com> wrote:
> > On 04/24, Daniel Vetter wrote:
> >>
> > >> wait_event_killable doesn't check for fatal_signal_pending before calling
> >> schedule, so definitel
On 04/24, Daniel Vetter wrote:
>
> wait_event_killable doesn't check for fatal_signal_pending before calling
> schedule, so definitely has a nice race there.
This is fine. See the signal_pending_state() check in __schedule().
And this doesn't differ from wait_event_interruptible(), it too
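
For readers following the thread, the check mentioned above can be modeled in
user space roughly as follows. This is a simplified sketch under stated
assumptions, not the kernel source: struct model_task and the model_* names
are hypothetical, and the flag values are illustrative only. It shows why a
task that is about to sleep in a killable state with a fatal signal already
pending is not actually descheduled.

```c
#include <stdbool.h>

/* Task-state bits. Names follow the kernel; values are illustrative. */
#define TASK_INTERRUPTIBLE   0x0001
#define TASK_UNINTERRUPTIBLE 0x0002
#define TASK_WAKEKILL        0x0100
#define TASK_KILLABLE        (TASK_WAKEKILL | TASK_UNINTERRUPTIBLE)

/* Hypothetical stand-in for the signal-related parts of a task. */
struct model_task {
    bool signal_pending;        /* any signal is pending */
    bool fatal_signal_pending;  /* a fatal signal (SIGKILL) is pending */
};

/*
 * Model of the signal_pending_state() logic: returns nonzero if a pending
 * signal should prevent the task from sleeping in the given state.
 */
static int model_signal_pending_state(long state, const struct model_task *p)
{
    /* Plain uninterruptible sleeps ignore signals entirely. */
    if (!(state & (TASK_INTERRUPTIBLE | TASK_WAKEKILL)))
        return 0;
    if (!p->signal_pending)
        return 0;
    /* Interruptible: any signal counts. Killable: only fatal ones. */
    return (state & TASK_INTERRUPTIBLE) || p->fatal_signal_pending;
}

/* Model of the scheduler decision: would the task really go to sleep? */
static bool model_would_sleep(long state, const struct model_task *p)
{
    return !model_signal_pending_state(state, p);
}
```

In this model the answer is the same whether the signal arrived before or
after the wait condition was tested, which is the reason the absence of an
explicit fatal_signal_pending() test before schedule() is not a race.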
On 04/24, Eric W. Biederman wrote:
>
> Let me respectfully suggest that the wait_event_killable on that code
> path is wrong.
I tend to agree even if I don't know this code.
But if it can be called from f_op->release() then any usage of "current" or
signals looks suspicious. Simply because
On 04/24, Andrey Grodzovsky wrote:
>
> Currently calling wait_event_killable as part of exiting process
> will stall forever since SIGKILL generation is suppressed by PF_EXITING.
See my reply to 2/3,
> In our particular case AMDGPU driver wants to flush all GPU jobs in
> flight before shutting
On 04/24, Andrey Grodzovsky wrote:
>
> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> @@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct
> drm_gpu_scheduler *sched,
> return;
> /**
>* The client will not