Re: [PATCH RFC 05/24] Revert "drm: Nerf the preclose callback for modern drivers"

2018-05-31 Thread Qiang Yu
On Fri, Jun 1, 2018 at 2:04 AM, Keith Packard  wrote:
> Eric Anholt  writes:
>
>> Just wait for all tasks to complete when any object is freed?  That's
>> going to be bad for performance.  Or are you saying that you already
>> have the connection between the task and its objects (and, if so, why
>> aren't you just doing refcounting correctly through that path?)
>
> How about waiting on close of the DRM device?
Yeah, that's what this patch is for: bring preclose back and wait for
tasks in it before freeing the buffers. I still think this is the best
way in my case (and maybe for other drivers too).
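
For illustration, a minimal sketch of what that preclose wait could look
like, assuming a lima-style per-file private struct (lima_drm_priv and
lima_ctx_mgr_wait_all are hypothetical names, not actual lima code):

static void lima_drm_driver_preclose(struct drm_device *dev,
                                     struct drm_file *file)
{
        struct lima_drm_priv *priv = file->driver_priv;

        /* Block until every task submitted through this file has
         * signaled; the buffers unmapped/freed later in drm_release()
         * are then guaranteed to be idle. */
        lima_ctx_mgr_wait_all(&priv->ctx_mgr);
}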


Re: [PATCH RFC 05/24] Revert "drm: Nerf the preclose callback for modern drivers"

2018-05-31 Thread Qiang Yu
On Fri, Jun 1, 2018 at 1:51 AM, Eric Anholt  wrote:
> Qiang Yu  writes:
>

It is OK in my implementation if an evil user frees/unmaps a buffer
while its task is not done. It will generate an MMU fault in that case
and the kernel driver will do recovery.

So does the Ctrl+C case: if we don't deal with it, we just get some
noisy MMU fault warnings and an HW reset recovery.
>>>
>>> How about an app rendering to shared buffers, which glFlush()es and
>>> exits cleanly but doesn't close the DRI screen?  What would cause that
>>> app's rendering to get completed successfully instead of faulting to
>>> death?
>> Do you mean the same case as Ctrl+C, where an app exits without
>> waiting for all its tasks to finish in userspace?
>
> Basically the same, but I'm saying that the app is doing everything
> right and terminating successfully, rather than being interrupted
> (which you might otherwise use to justify its rendering failing).
I won't use Ctrl+C as a justification. In fact I think it's also a valid
case which should not get an MMU fault and GPU reset, because it happens
even when the userspace driver is correct. I only think an incorrect
userspace driver deserves an MMU fault, e.g. buggy/evil code which
frees/unmaps a BO before its task is done. I think this is also the
difference between a user freeing a BO and a DRM-close freeing a BO in
my case.

>
>>> You really do need to refcount the buffers used in a rendering job so
>>> they don't get freed early.
>> Do you mean refcounting the buffers in the kernel-space job? This is
>> OK but not necessary; I can wait for task completion in
>> gem_close_object, which is also called by drm_release for each buffer
>> (I still think it's better to wait once in preclose, but it's gone).
>
> Just wait for all tasks to complete when any object is freed?  That's
> going to be bad for performance.
In my case this doesn't affect performance. In my implementation the
userspace driver records a task's buffers and frees them only when the
task is done. So in normal usage, when a buffer is freed there should be
no task from this process still using it. The wait should only happen in
the Ctrl+C and not-closing-the-screen cases.

> Or are you saying that you already
> have the connection between the task and its objects (and, if so, why
> aren't you just doing refcounting correctly through that path?)
This is done through the buffer's reservation object; I can wait on the
fences in the reservation object for the tasks to finish when the buffer
is freed.
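
As a rough sketch of that per-buffer wait, assuming the 2018-era
reservation_object API (struct lima_bo and lima_gem_free_object are
hypothetical driver names):

#include <linux/reservation.h>
#include <drm/drm_gem.h>

struct lima_bo {
        struct drm_gem_object base;
        struct reservation_object *resv;
};

static void lima_gem_free_object(struct drm_gem_object *obj)
{
        struct lima_bo *bo = container_of(obj, struct lima_bo, base);

        /* Block until every fence on the BO's reservation object has
         * signaled, i.e. until no pending task still uses the buffer. */
        reservation_object_wait_timeout_rcu(bo->resv, true, false,
                                            MAX_SCHEDULE_TIMEOUT);

        /* ... now it is safe to unmap the BO from the VM and free it ... */
}

Note that this waits on fences from every process sharing the buffer,
which is exactly the over-waiting discussed above.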


Re: [PATCH RFC 05/24] Revert "drm: Nerf the preclose callback for modern drivers"

2018-05-31 Thread Keith Packard
Eric Anholt  writes:

> Just wait for all tasks to complete when any object is freed?  That's
> going to be bad for performance.  Or are you saying that you already
> have the connection between the task and its objects (and, if so, why
> aren't you just doing refcounting correctly through that path?)

How about waiting on close of the DRM device?
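
As a sketch, that wait could live in the postclose hook, which still
exists (lima_drm_priv and lima_ctx_mgr_wait_all are assumed names, not
actual lima code):

static void lima_drm_driver_postclose(struct drm_device *dev,
                                      struct drm_file *file)
{
        struct lima_drm_priv *priv = file->driver_priv;

        /* Wait for all of this file's tasks before tearing down the
         * per-file state. Caveat discussed in this thread: postclose
         * runs after drm_release() has already dropped the file's GEM
         * handles. */
        lima_ctx_mgr_wait_all(&priv->ctx_mgr);
        kfree(priv);
}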

-- 
-keith




Re: [PATCH RFC 05/24] Revert "drm: Nerf the preclose callback for modern drivers"

2018-05-31 Thread Eric Anholt
Qiang Yu  writes:

>>>
>>> It is OK in my implementation if an evil user frees/unmaps a buffer
>>> while its task is not done. It will generate an MMU fault in that
>>> case and the kernel driver will do recovery.
>>>
>>> So does the Ctrl+C case: if we don't deal with it, we just get some
>>> noisy MMU fault warnings and an HW reset recovery.
>>
>> How about an app rendering to shared buffers, which glFlush()es and
>> exits cleanly but doesn't close the DRI screen?  What would cause that
>> app's rendering to get completed successfully instead of faulting to
>> death?
> Do you mean the same case as Ctrl+C, where an app exits without
> waiting for all its tasks to finish in userspace?

Basically the same, but I'm saying that the app is doing everything
right and terminating successfully, rather than being interrupted (which
you might otherwise use to justify its rendering failing).

>> You really do need to refcount the buffers used in a rendering job so
>> they don't get freed early.
> Do you mean refcounting the buffers in the kernel-space job? This is
> OK but not necessary; I can wait for task completion in
> gem_close_object, which is also called by drm_release for each buffer
> (I still think it's better to wait once in preclose, but it's gone).

Just wait for all tasks to complete when any object is freed?  That's
going to be bad for performance.  Or are you saying that you already
have the connection between the task and its objects (and, if so, why
aren't you just doing refcounting correctly through that path?)




Re: [PATCH RFC 05/24] Revert "drm: Nerf the preclose callback for modern drivers"

2018-05-31 Thread Qiang Yu
>>
>> It is OK in my implementation if an evil user frees/unmaps a buffer
>> while its task is not done. It will generate an MMU fault in that case
>> and the kernel driver will do recovery.
>>
>> So does the Ctrl+C case: if we don't deal with it, we just get some
>> noisy MMU fault warnings and an HW reset recovery.
>
> How about an app rendering to shared buffers, which glFlush()es and
> exits cleanly but doesn't close the DRI screen?  What would cause that
> app's rendering to get completed successfully instead of faulting to
> death?
Do you mean the same case as Ctrl+C, where an app exits without waiting
for all its tasks to finish in userspace?

>
> You really do need to refcount the buffers used in a rendering job so
> they don't get freed early.
Do you mean refcounting the buffers in the kernel-space job? This is OK
but not necessary; I can wait for task completion in gem_close_object,
which is also called by drm_release for each buffer (I still think it's
better to wait once in preclose, but it's gone).


Re: [PATCH RFC 05/24] Revert "drm: Nerf the preclose callback for modern drivers"

2018-05-30 Thread Eric Anholt
Qiang Yu  writes:

> On Thu, May 24, 2018 at 3:51 PM, Daniel Vetter  wrote:
>> On Thu, May 24, 2018 at 09:18:04AM +0800, Qiang Yu wrote:
>>> On Thu, May 24, 2018 at 4:31 AM, Daniel Vetter  wrote:
>>> > On Wed, May 23, 2018 at 2:59 PM, Qiang Yu  wrote:
>>> >> On Wed, May 23, 2018 at 5:04 PM, Daniel Vetter  wrote:
>>> >>> On Tue, May 22, 2018 at 09:04:17AM +0800, Qiang Yu wrote:
>>>  On Tue, May 22, 2018 at 3:37 AM, Eric Anholt  wrote:
>>>  > Qiang Yu  writes:
>>>  >
>>>  >> This reverts commit 45c3d213a400c952ab7119f394c5293bb6877e6b.
>>>  >>
>>>  >> The lima driver needs preclose to wait for all tasks in the
>>>  >> context created within the closing file to finish before freeing
>>>  >> all the buffer objects. Otherwise pending tasks may fail and
>>>  >> produce noisy MMU fault messages.
>>>  >>
>>>  >> Moving this wait to each buffer object's free function could
>>>  >> achieve the same result, but some buffer objects are shared with
>>>  >> other file contexts and we only want to wait for the closing file
>>>  >> context's tasks. So the implementation is not as straightforward
>>>  >> as the preclose one.
>>>  >
>>>  > You should just separate your MMU structures from drm_file, and
>>>  > have drm_file and the jobs using it keep a reference on them.
>>>  > This is what I've done in V3D as well.
>>> 
>>>  It's not the VM/MMU struct that causes this problem, it's each
>>>  buffer object getting freed before its task is done (postclose runs
>>>  after buffer free). If you mean I should keep references to all of a
>>>  task's buffers, that's not as simple as just waiting for the tasks
>>>  to finish before freeing the buffers.
>>> >>>
>>> >>> Why can't you do that waiting in the postclose hook? If it's the lack of
>>> >>> reference-counting in your driver for gem bo, then I'd say you need to
>>> >>> roll out some reference counting. Relying on the implicit reference
>>> >>> provided by the core is kinda not so great (which was the reason I've
>>> >>> thrown out the preclose hook). There's also per-bo open/close hooks.
>>> >>
>>> >> It's possible to not use preclose, but the implementation is not
>>> >> as simple and straightforward as the preclose one, I think. There
>>> >> are two methods I can think of:
>>> >> 1. do the wait when the free-buffers callback unmaps the buffer
>>> >> from this process's lima VM (wait on the buffer reservation
>>> >> object); this is fine and simple, but there are cases where the
>>> >> buffer is shared between two processes, so the best way would be
>>> >> to wait only for fences from this process, so we'd better record
>>> >> fences for a "perfect wait"
>>> >> 2. keep a reference to the involved buffers for a task and
>>> >> unreference them when the task is done; also keep a reference to
>>> >> the buffer mapping in this process's lima VM (this is more
>>> >> complicated to implement)
>>> >>
>>> >> But if there's a preclose, we just wait for all this process's
>>> >> tasks to be done, then unmap/free the buffers; it's simple and
>>> >> straightforward. I'd like to hear if there's a better way that
>>> >> uses only postclose.
>>> >
>>> > Refcount your buffers. Borrowing references from other places tends to
>>> > result in a maintenance headache with no end. So solution 2.
>>>
>>> In the current lima implementation, refcounting the buffers involved
>>> in a task is done in user space, so the kernel's task object doesn't
>>> keep references. The userspace driver is responsible for not
>>> unmapping/freeing a buffer before its task is complete. This works
>>> simply and fine except for the case where the user presses Ctrl+C to
>>> terminate the application, which forces the drm fd to close. I really
>>> don't think adding buffer refcounting for tasks in the kernel just
>>> for this case is worthwhile, because it has no benefit for the normal
>>> case but adds some extra load.
>>
>> If kernel correctness relies on refcounting you have a gigantic
>> security problem. You need to fix that. The kernel _must_ assume that
>> userspace is evil and trying to pull it over the table.
>
> It is OK in my implementation if an evil user frees/unmaps a buffer
> while its task is not done. It will generate an MMU fault in that case
> and the kernel driver will do recovery.
>
> So does the Ctrl+C case: if we don't deal with it, we just get some
> noisy MMU fault warnings and an HW reset recovery.

How about an app rendering to shared buffers, which glFlush()es and
exits cleanly but doesn't close the DRI screen?  What would cause that
app's rendering to get completed successfully instead of faulting to
death?

You really do need to refcount the buffers used in a rendering job so
they don't get freed early.
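
A rough sketch of that kernel-side refcounting, with hypothetical
lima-style names (the essential part is just a drm_gem_object_get()/put()
pair per job):

#include <drm/drm_gem.h>

struct lima_sched_task {
        struct drm_gem_object **bos;    /* BOs used by this job */
        int num_bos;
};

/* At submit time: take a reference on every BO the job uses, so a
 * userspace close/free cannot release the memory while the GPU may
 * still access it. */
static void lima_sched_task_get_bos(struct lima_sched_task *task)
{
        int i;

        for (i = 0; i < task->num_bos; i++)
                drm_gem_object_get(task->bos[i]);
}

/* Once the job's fence has signaled: drop the references so the BOs
 * can actually be freed. */
static void lima_sched_task_put_bos(struct lima_sched_task *task)
{
        int i;

        for (i = 0; i < task->num_bos; i++)
                drm_gem_object_put_unlocked(task->bos[i]);
}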




Re: [PATCH RFC 05/24] Revert "drm: Nerf the preclose callback for modern drivers"

2018-05-24 Thread Qiang Yu
On Thu, May 24, 2018 at 5:41 PM, Christian König
 wrote:
> On 24.05.2018 11:24, Qiang Yu wrote:
>>
>> On Thu, May 24, 2018 at 2:46 PM, Christian König
>>  wrote:
>> [SNIP]
>>>
>>> Because of this we have separate tracking in amdgpu, so that we not
>>> only know who is using which BO but also who is using which VM.
>>
>> amdgpu's VM implementation seems too complicated for this simple mali GPU,
>> but I may investigate it more to see if I can make it better.
>
>
> Yeah, completely agree.
>
> The VM handling in amdgpu is really complicated because we had to tune
> it for multiple use cases, e.g. partially resident textures, delayed
> updates, etc.
>
> But you should at least be able to take the lessons learned we had with that
> VM code and not make the same mistakes again.
>
>>> We intentionally removed the preclose callback to prevent certain use
>>> cases,
>>> bringing it back to allow your use case looks rather fishy to me.
>>
>> It seems other drivers use either the defer or the wait approach to
>> adapt to the removal of preclose. I can do the same as you suggested,
>> but I just don't understand why we make our life harder. Can I know
>> what case you want to prevent?
>
>
> I think what matters most for your case is that drivers should handle
> closing a BO because userspace said so in the same way they handle
> closing a BO because of process termination, but see below.
>
>>> BTW: What exactly is the issue with using the postclose callback?
>>
>> The issue is that when Ctrl+C terminates an application, if there is
>> no wait or deferred unmap, the buffer just gets unmapped before the
>> task is done, so the kernel driver gets an MMU fault and an HW reset
>> to recover the GPU.
>
>
> Yeah, that sounds like exactly one of the reasons we had the callback
> in the first place and worked on removing it.
>
> See the intention is to have reliable handling, e.g. use the same code path
> for closing a BO because of an IOCTL and closing a BO because of process
> termination.
>
> In other words what happens when userspace closes a BO while the GPU is
> still using it? Would you then run into a GPU reset as well?

Yes, there is also an MMU fault and GPU reset when the userspace driver
misuses buffers like this. I think I don't need to avoid this case,
because erroneous usage deserves a GPU reset while process termination
does not. But you remind me that they would indeed share the same code
path if preclose is removed now.

Regards,
Qiang

>
> I mean it's your driver stack, so I'm not against it as long as you can live
> with it. But it's exactly the thing we wanted to avoid here.


>
> Regards,
> Christian.


Re: [PATCH RFC 05/24] Revert "drm: Nerf the preclose callback for modern drivers"

2018-05-24 Thread Qiang Yu
On Thu, May 24, 2018 at 3:51 PM, Daniel Vetter  wrote:
> On Thu, May 24, 2018 at 09:18:04AM +0800, Qiang Yu wrote:
>> On Thu, May 24, 2018 at 4:31 AM, Daniel Vetter  wrote:
>> > On Wed, May 23, 2018 at 2:59 PM, Qiang Yu  wrote:
>> >> On Wed, May 23, 2018 at 5:04 PM, Daniel Vetter  wrote:
>> >>> On Tue, May 22, 2018 at 09:04:17AM +0800, Qiang Yu wrote:
>>  On Tue, May 22, 2018 at 3:37 AM, Eric Anholt  wrote:
>>  > Qiang Yu  writes:
>>  >
>>  >> This reverts commit 45c3d213a400c952ab7119f394c5293bb6877e6b.
>>  >>
>>  >> The lima driver needs preclose to wait for all tasks in the
>>  >> context created within the closing file to finish before freeing
>>  >> all the buffer objects. Otherwise pending tasks may fail and
>>  >> produce noisy MMU fault messages.
>>  >>
>>  >> Moving this wait to each buffer object's free function could
>>  >> achieve the same result, but some buffer objects are shared with
>>  >> other file contexts and we only want to wait for the closing file
>>  >> context's tasks. So the implementation is not as straightforward
>>  >> as the preclose one.
>>  >
>>  > You should just separate your MMU structures from drm_file, and have
>>  > drm_file and the jobs using it keep a reference on them.  This is what
>>  > I've done in V3D as well.
>> 
>>  It's not the VM/MMU struct that causes this problem, it's each
>>  buffer object getting freed before its task is done (postclose runs
>>  after buffer free). If you mean I should keep references to all of a
>>  task's buffers, that's not as simple as just waiting for the tasks to
>>  finish before freeing the buffers.
>> >>>
>> >>> Why can't you do that waiting in the postclose hook? If it's the lack of
>> >>> reference-counting in your driver for gem bo, then I'd say you need to
>> >>> roll out some reference counting. Relying on the implicit reference
>> >>> provided by the core is kinda not so great (which was the reason I've
>> >>> thrown out the preclose hook). There's also per-bo open/close hooks.
>> >>
>> >> It's possible to not use preclose, but the implementation is not
>> >> as simple and straightforward as the preclose one, I think. There
>> >> are two methods I can think of:
>> >> 1. do the wait when the free-buffers callback unmaps the buffer
>> >> from this process's lima VM (wait on the buffer reservation
>> >> object); this is fine and simple, but there are cases where the
>> >> buffer is shared between two processes, so the best way would be to
>> >> wait only for fences from this process, so we'd better record
>> >> fences for a "perfect wait"
>> >> 2. keep a reference to the involved buffers for a task and
>> >> unreference them when the task is done; also keep a reference to
>> >> the buffer mapping in this process's lima VM (this is more
>> >> complicated to implement)
>> >>
>> >> But if there's a preclose, we just wait for all this process's
>> >> tasks to be done, then unmap/free the buffers; it's simple and
>> >> straightforward. I'd like to hear if there's a better way that uses
>> >> only postclose.
>> >
>> > Refcount your buffers. Borrowing references from other places tends to
>> > result in a maintenance headache with no end. So solution 2.
>>
>> In the current lima implementation, refcounting the buffers involved
>> in a task is done in user space, so the kernel's task object doesn't
>> keep references. The userspace driver is responsible for not
>> unmapping/freeing a buffer before its task is complete. This works
>> simply and fine except for the case where the user presses Ctrl+C to
>> terminate the application, which forces the drm fd to close. I really
>> don't think adding buffer refcounting for tasks in the kernel just for
>> this case is worthwhile, because it has no benefit for the normal case
>> but adds some extra load.
>
> If kernel correctness relies on refcounting you have a gigantic
> security problem. You need to fix that. The kernel _must_ assume that
> userspace is evil and trying to pull it over the table.

It is OK in my implementation if an evil user frees/unmaps a buffer
while its task is not done. It will generate an MMU fault in that case
and the kernel driver will do recovery.

So does the Ctrl+C case: if we don't deal with it, we just get some
noisy MMU fault warnings and an HW reset recovery.

Regards,
Qiang

>
> Yes, you need refcounting.
> -Daniel
>>
>> Regards,
>> Qiang
>>
>> > -Daniel
>> > --
>> > Daniel Vetter
>> > Software Engineer, Intel Corporation
>> > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch


Re: [PATCH RFC 05/24] Revert "drm: Nerf the preclose callback for modern drivers"

2018-05-24 Thread Christian König

On 24.05.2018 11:24, Qiang Yu wrote:

On Thu, May 24, 2018 at 2:46 PM, Christian König
 wrote:
[SNIP]

Because of this we have separate tracking in amdgpu, so that we not only
know who is using which BO but also who is using which VM.

amdgpu's VM implementation seems too complicated for this simple mali GPU,
but I may investigate it more to see if I can make it better.


Yeah, completely agree.

The VM handling in amdgpu is really complicated because we had to tune
it for multiple use cases, e.g. partially resident textures, delayed
updates, etc.


But you should at least be able to take the lessons learned we had with 
that VM code and not make the same mistakes again.



We intentionally removed the preclose callback to prevent certain use cases,
bringing it back to allow your use case looks rather fishy to me.

It seems other drivers use either the defer or the wait approach to
adapt to the removal of preclose. I can do the same as you suggested,
but I just don't understand why we make our life harder. Can I know what
case you want to prevent?


I think what matters most for your case is that drivers should handle
closing a BO because userspace said so in the same way they handle
closing a BO because of process termination, but see below.



BTW: What exactly is the issue with using the postclose callback?

The issue is that when Ctrl+C terminates an application, if there is no
wait or deferred unmap, the buffer just gets unmapped before the task is
done, so the kernel driver gets an MMU fault and an HW reset to recover
the GPU.


Yeah, that sounds like exactly one of the reasons we had the callback in
the first place and worked on removing it.


See the intention is to have reliable handling, e.g. use the same code 
path for closing a BO because of an IOCTL and closing a BO because of 
process termination.


In other words what happens when userspace closes a BO while the GPU is 
still using it? Would you then run into a GPU reset as well?


I mean it's your driver stack, so I'm not against it as long as you can 
live with it. But it's exactly the thing we wanted to avoid here.


Regards,
Christian.


Re: [PATCH RFC 05/24] Revert "drm: Nerf the preclose callback for modern drivers"

2018-05-24 Thread Qiang Yu
On Thu, May 24, 2018 at 2:46 PM, Christian König
 wrote:
> On 24.05.2018 03:38, Qiang Yu wrote:
>
> [SNIP]
>
> Adding the fence is done already, and I did wait on it before unmap.
> But then I saw that when the buffer is shared between processes, the
> "perfect wait" is to wait only on the fences from this process's tasks,
> so it's better to also distinguish fences. If so, I just wonder why we
> don't simply wait for this process's tasks in preclose, before
> unmapping/freeing the buffers in drm_release()?
>
> Well it depends on your VM management. When userspace expects that the
> VM space the BO used is reusable immediately, then the TTM callback
> won't work.
>
> On the other hand you can just grab the list of fences on a BO and filter
> out the ones from your current process and wait for those. See
> amdgpu_sync_resv() as an example how to do that.
>
> In the current lima implementation, the userspace driver is
> responsible for not unmapping/freeing a buffer before its task is
> complete. And VM map/unmap is not deferred.
>
>
> Well it's up to you how to design userspace, but in the past doing it like
> that turned out to be a rather bad design decision.
>
> Keep in mind that the kernel driver must guarantee that shaders can
> never access freed-up memory.
>
> Otherwise taking over the system from an unprivileged process becomes
> just a typing exercise once you manage to access freed memory which is
> now used for a page table.

Right, I know this has to be avoided.

>
> Because of this we have separate tracking in amdgpu, so that we not
> only know who is using which BO but also who is using which VM.

amdgpu's VM implementation seems too complicated for this simple mali GPU,
but I may investigate it more to see if I can make it better.

>
> This works simply and fine except for the case where the user presses
> Ctrl+C to terminate the application, which forces the drm fd to close.
>
>
> I'm not sure if that actually works as fine as you think.
>
> For an example of what we had to add to prevent security breaches, take a
> look at amdgpu_gem_object_close().
>
> I'd prefer waiting on buffer fences before VM unmap, with filtering
> like amdgpu_sync_resv(), over implementing refcounting in the kernel
> task. But neither of these ways is as simple as preclose.
>
>
> Well, I would rather say you should either delay VM unmap operations until
> all users of the VM are done with their work using the ttm_bo_destroy
> callback.
>
> Or you block in the gem_close_object callback until all tasks using the BO
> are done with it.
>
> So I still don't understand why you don't want to bring preclose back,
> even if we have to introduce another complicated mechanism to cover the
> case of freeing/unmapping a buffer before this process's tasks are
> done.
>
>
> We intentionally removed the preclose callback to prevent certain use cases,
> bringing it back to allow your use case looks rather fishy to me.

It seems other drivers use either the defer or the wait approach to
adapt to the removal of preclose. I can do the same as you suggested,
but I just don't understand why we make our life harder. Can I know what
case you want to prevent?

>
> BTW: What exactly is the issue with using the postclose callback?

The issue is that when Ctrl+C terminates an application, if there is no
wait or deferred unmap, the buffer just gets unmapped before the task is
done, so the kernel driver gets an MMU fault and an HW reset to recover
the GPU.

Regards,
Qiang

>
> Regards,
> Christian.
>
>
> Regards,
> Qiang
>
>


Re: [PATCH RFC 05/24] Revert "drm: Nerf the preclose callback for modern drivers"

2018-05-24 Thread Daniel Vetter
On Thu, May 24, 2018 at 09:18:04AM +0800, Qiang Yu wrote:
> On Thu, May 24, 2018 at 4:31 AM, Daniel Vetter  wrote:
> > On Wed, May 23, 2018 at 2:59 PM, Qiang Yu  wrote:
> >> On Wed, May 23, 2018 at 5:04 PM, Daniel Vetter  wrote:
> >>> On Tue, May 22, 2018 at 09:04:17AM +0800, Qiang Yu wrote:
>  On Tue, May 22, 2018 at 3:37 AM, Eric Anholt  wrote:
>  > Qiang Yu  writes:
>  >
>  >> This reverts commit 45c3d213a400c952ab7119f394c5293bb6877e6b.
>  >>
>  >> The lima driver needs preclose to wait for all tasks in the
>  >> context created within the closing file to finish before freeing
>  >> all the buffer objects. Otherwise pending tasks may fail and
>  >> produce noisy MMU fault messages.
>  >>
>  >> Moving this wait to each buffer object's free function could
>  >> achieve the same result, but some buffer objects are shared with
>  >> other file contexts and we only want to wait for the closing file
>  >> context's tasks. So the implementation is not as straightforward as
>  >> the preclose one.
>  >
>  > You should just separate your MMU structures from drm_file, and have
>  > drm_file and the jobs using it keep a reference on them.  This is what
>  > I've done in V3D as well.
> 
>  It's not the VM/MMU struct that causes this problem, it's each buffer
>  object getting freed before its task is done (postclose runs after
>  buffer free). If you mean I should keep references to all of a task's
>  buffers, that's not as simple as just waiting for the tasks to finish
>  before freeing the buffers.
> >>>
> >>> Why can't you do that waiting in the postclose hook? If it's the lack of
> >>> reference-counting in your driver for gem bo, then I'd say you need to
> >>> roll out some reference counting. Relying on the implicit reference
> >>> provided by the core is kinda not so great (which was the reason I've
> >>> thrown out the preclose hook). There's also per-bo open/close hooks.
> >>
> >> It's possible to not use preclose, but the implementation is not as
> >> simple and straightforward as the preclose one, I think. There are
> >> two methods I can think of:
> >> 1. do the wait when the free-buffers callback unmaps the buffer from
> >> this process's lima VM (wait on the buffer reservation object); this
> >> is fine and simple, but there are cases where the buffer is shared
> >> between two processes, so the best way would be to wait only for
> >> fences from this process, so we'd better record fences for a
> >> "perfect wait"
> >> 2. keep a reference to the involved buffers for a task and
> >> unreference them when the task is done; also keep a reference to the
> >> buffer mapping in this process's lima VM (this is more complicated
> >> to implement)
> >>
> >> But if there's a preclose, we just wait for all this process's tasks
> >> to be done, then unmap/free the buffers; it's simple and
> >> straightforward. I'd like to hear if there's a better way that uses
> >> only postclose.
> >
> > Refcount your buffers. Borrowing references from other places tends to
> > result in a maintenance headache with no end. So solution 2.
> 
> In the current lima implementation, refcounting the buffers involved
> in a task is done in user space, so the kernel's task object doesn't
> keep references. The userspace driver is responsible for not
> unmapping/freeing a buffer before its task is complete. This works
> simply and fine except for the case where the user presses Ctrl+C to
> terminate the application, which forces the drm fd to close. I really
> don't think adding buffer refcounting for tasks in the kernel just for
> this case is worthwhile, because it has no benefit for the normal case
> but adds some extra load.

If kernel correctness relies on refcounting you have a gigantic security
problem. You need to fix that. The kernel _must_ assume that userspace
is evil and trying to pull it over the table.

Yes, you need refcounting.
-Daniel
> 
> Regards,
> Qiang
> 
> > -Daniel
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH RFC 05/24] Revert "drm: Nerf the preclose callback for modern drivers"

2018-05-24 Thread Christian König

On 24.05.2018 03:38, Qiang Yu wrote:

[SNIP]

Adding the fence is done already, and I did wait on it before unmap. But
then I saw that when the buffer is shared between processes, the
"perfect wait" is to wait only on the fences from this process's tasks,
so it's better to also distinguish fences. If so, I just wonder why we
don't simply wait for this process's tasks in preclose, before
unmapping/freeing the buffers in drm_release()?


Well it depends on your VM management. When userspace expects that the
VM space the BO used is reusable immediately, then the TTM callback
won't work.

On the other hand you can just grab the list of fences on a BO and filter
out the ones from your current process and wait for those. See
amdgpu_sync_resv() as an example how to do that.
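
A sketch of such per-process filtering, under the simplifying assumption
that all tasks of one process share a single dma_fence context id (this
only illustrates the idea behind amdgpu_sync_resv(); it is not a copy of
it):

#include <linux/dma-fence.h>
#include <linux/reservation.h>
#include <linux/slab.h>

static int lima_wait_process_fences(struct reservation_object *resv,
                                    u64 context)
{
        struct dma_fence *excl, **shared;
        unsigned int i, count;
        int ret;

        /* Snapshot all fences currently attached to the BO. */
        ret = reservation_object_get_fences_rcu(resv, &excl, &count,
                                                &shared);
        if (ret)
                return ret;

        /* Wait only for the fences emitted by this process's tasks. */
        if (excl) {
                if (excl->context == context)
                        dma_fence_wait(excl, false);
                dma_fence_put(excl);
        }
        for (i = 0; i < count; i++) {
                if (shared[i]->context == context)
                        dma_fence_wait(shared[i], false);
                dma_fence_put(shared[i]);
        }
        kfree(shared);
        return 0;
}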

In the current lima implementation, the userspace driver is responsible
for not unmapping/freeing a buffer before its task is complete. And VM
map/unmap is not deferred.


Well it's up to you how to design userspace, but in the past doing it 
like that turned out to be a rather bad design decision.


Keep in mind that the kernel driver must guarantee that shaders can
never access freed-up memory.


Otherwise taking over the system from an unprivileged process becomes
just a typing exercise once you manage to access freed memory which is
now used for a page table.


Because of this we have separate tracking in amdgpu, so that we not only
know who is using which BO but also who is using which VM.



This works simply and fine except for the case where the user presses
Ctrl+C to terminate the application, which forces the drm fd to close.


I'm not sure if that actually works as fine as you think.

For an example of what we had to add to prevent security breaches, take 
a look at amdgpu_gem_object_close().



I'd prefer waiting on buffer fences before VM unmap, with filtering like
amdgpu_sync_resv(), over implementing refcounting in the kernel task.
But neither of these ways is as simple as preclose.


Well, I would rather say you should either delay VM unmap operations 
until all users of the VM are done with their work using the 
ttm_bo_destroy callback.


Or you block in the gem_close_object callback until all tasks using the 
BO are done with it.



So I still don't understand why you don't want to bring preclose back,
even if we have to introduce another complicated mechanism to cover the
case of freeing/unmapping a buffer before this process's tasks are done.


We intentionally removed the preclose callback to prevent certain use 
cases, bringing it back to allow your use case looks rather fishy to me.


BTW: What exactly is the issue with using the postclose callback?

Regards,
Christian.



Regards,
Qiang





Re: [PATCH RFC 05/24] Revert "drm: Nerf the preclose callback for modern drivers"

2018-05-23 Thread Qiang Yu
On Wed, May 23, 2018 at 9:41 PM, Christian König
 wrote:
> On 23.05.2018 15:13, Qiang Yu wrote:
>>
>> On Wed, May 23, 2018 at 5:35 PM, Christian König
>>  wrote:
>>>
>>> Well NAK, that brings back a callback we worked quite hard on
>>> getting rid of.
>>>
>>> It looks like the problem isn't that you need the preclose callback;
>>> you rather seem to have misunderstood how TTM works.
>>>
>>> All you need to do is to clean up your command submission path so
>>> that the caller of lima_sched_context_queue_task() adds the resulting
>>> scheduler fence to TTM's buffer objects.
>>
>> You mean adding the finished dma fence to the buffer's reservation
>> object, then waiting on it before unmapping the buffer from the GPU VM
>> in drm_release()'s buffer close callback?
>
>
> That is one possibility, but also not necessary.
>
> TTM has a destroy callback which is called from a workqueue when all
> fences on that BO have signaled.
>
> Depending on your VM management you can use it to delay unmapping the
> buffer until it is actually not used any more.
>
>> Adding the fence is done already, and I did wait on it before unmap.
>> But then I saw that when the buffer is shared between processes, the
>> "perfect wait" is to wait only on the fences from this process's
>> tasks, so it's better to also distinguish fences. If so, I just wonder
>> why we don't simply wait for this process's tasks in preclose, before
>> unmapping/freeing the buffers in drm_release()?
>
>
> Well it depends on your VM management. When userspace expects that the
> VM space the BO used is reusable immediately, then the TTM callback
> won't work.
>
> On the other hand you can just grab the list of fences on a BO and filter
> out the ones from your current process and wait for those. See
> amdgpu_sync_resv() as an example how to do that.

In the current lima implementation, the userspace driver is responsible
for not unmapping/freeing a buffer before its task is complete. And VM
map/unmap is not deferred.

This works simply and fine except for the case where the user presses
Ctrl+C to terminate the application, which forces the drm fd to close.

I'd prefer waiting on buffer fences before VM unmap, with filtering like
amdgpu_sync_resv(), over implementing refcounting in the kernel task.
But neither of these ways is as simple as preclose.

So I still don't understand why you don't want to bring preclose back,
even if we have to introduce another complicated mechanism to cover the
case of freeing/unmapping a buffer before this process's tasks are done.

Regards,
Qiang

>
> Christian.
>
>
>>
>> Regards,
>> Qiang
>>
>>>
 On 18.05.2018 11:27, Qiang Yu wrote:

 This reverts commit 45c3d213a400c952ab7119f394c5293bb6877e6b.

 The lima driver needs preclose to wait for all tasks in the context
 created within the closing file to finish before freeing all the
 buffer objects. Otherwise pending tasks may fail and produce noisy
 MMU fault messages.

 Moving this wait to each buffer object's free function could achieve
 the same result, but some buffer objects are shared with other file
 contexts and we only want to wait for the closing file context's
 tasks. So the implementation is not as straightforward as the
 preclose one.

 Signed-off-by: Qiang Yu 
 ---
 drivers/gpu/drm/drm_file.c |  8 ++++----
 include/drm/drm_drv.h      | 23 +++++++++++++++++++++--
 2 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index e394799979a6..0a43107396b9 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -361,8 +361,9 @@ void drm_lastclose(struct drm_device * dev)
  *
  * This function must be used by drivers as their &file_operations.release
  * method. It frees any resources associated with the open file, and calls the
- * &drm_driver.postclose driver callback. If this is the last open file for the
- * DRM device also proceeds to call the &drm_driver.lastclose driver callback.
+ * &drm_driver.preclose and &drm_driver.lastclose driver callbacks. If this is
+ * the last open file for the DRM device also proceeds to call the
+ * &drm_driver.lastclose driver callback.
  *
  * RETURNS:
  *
@@ -382,8 +383,7 @@ int drm_release(struct inode *inode, struct file *filp)
 	list_del(&file_priv->lhead);
 	mutex_unlock(&dev->filelist_mutex);

-	if (drm_core_check_feature(dev, DRIVER_LEGACY) &&
-	    dev->driver->preclose)
+	if (dev->driver->preclose)
 		dev->driver->preclose(dev, file_priv);

 	/*
diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index d23dcdd1bd95..8d6080f97ed4 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -107,6 +107,23 @@ struct

Re: [PATCH RFC 05/24] Revert "drm: Nerf the preclose callback for modern drivers"

2018-05-23 Thread Qiang Yu
On Thu, May 24, 2018 at 4:31 AM, Daniel Vetter  wrote:
> On Wed, May 23, 2018 at 2:59 PM, Qiang Yu  wrote:
>> On Wed, May 23, 2018 at 5:04 PM, Daniel Vetter  wrote:
>>> On Tue, May 22, 2018 at 09:04:17AM +0800, Qiang Yu wrote:
 On Tue, May 22, 2018 at 3:37 AM, Eric Anholt  wrote:
 > Qiang Yu  writes:
 >
 >> This reverts commit 45c3d213a400c952ab7119f394c5293bb6877e6b.
 >>
 >> The lima driver needs preclose to wait for all tasks in the context
 >> created within the closing file to finish before freeing all the
 >> buffer objects. Otherwise pending tasks may fail and produce noisy
 >> MMU fault messages.
 >>
 >> Moving this wait to each buffer object's free function could achieve
 >> the same result, but some buffer objects are shared with other file
 >> contexts and we only want to wait for the closing file context's
 >> tasks. So the implementation is not as straightforward as the
 >> preclose one.
 >
 > You should just separate your MMU structures from drm_file, and have
 > drm_file and the jobs using it keep a reference on them.  This is what
 > I've done in V3D as well.

 It's not the VM/MMU struct that causes this problem, it's each buffer
 object getting freed before its task is done (postclose runs after
 buffer free). If you mean I should keep references to all of a task's
 buffers, that's not as simple as just waiting for the tasks to finish
 before freeing the buffers.
>>>
>>> Why can't you do that waiting in the postclose hook? If it's the lack of
>>> reference-counting in your driver for gem bo, then I'd say you need to
>>> roll out some reference counting. Relying on the implicit reference
>>> provided by the core is kinda not so great (which was the reason I've
>>> thrown out the preclose hook). There's also per-bo open/close hooks.
>>
>> It's possible to not use preclose, but the implementation is not as
>> simple and straightforward as the preclose one, I think. There are two
>> methods I can think of:
>> 1. do the wait when the free-buffers callback unmaps the buffer from
>> this process's lima VM (wait on the buffer reservation object); this
>> is fine and simple, but there are cases where the buffer is shared
>> between two processes, so the best way would be to wait only for
>> fences from this process, so we'd better record fences for a "perfect
>> wait"
>> 2. keep a reference to the involved buffers for a task and unreference
>> them when the task is done; also keep a reference to the buffer
>> mapping in this process's lima VM (this is more complicated to
>> implement)
>>
>> But if there's a preclose, we just wait for all this process's tasks
>> to be done, then unmap/free the buffers; it's simple and
>> straightforward. I'd like to hear if there's a better way that uses
>> only postclose.
>
> Refcount your buffers. Borrowing references from other places tends to
> result in a maintenance headache with no end. So solution 2.

In the current lima implementation, refcounting the buffers involved in
a task is done in user space, so the kernel's task object doesn't keep
references. The userspace driver is responsible for not
unmapping/freeing a buffer before its task is complete. This works
simply and fine except for the case where the user presses Ctrl+C to
terminate the application, which forces the drm fd to close. I really
don't think adding buffer refcounting for tasks in the kernel just for
this case is worthwhile, because it has no benefit for the normal case
but adds some extra load.

Regards,
Qiang

> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: [PATCH RFC 05/24] Revert "drm: Nerf the preclose callback for modern drivers"

2018-05-23 Thread Daniel Vetter
On Wed, May 23, 2018 at 2:59 PM, Qiang Yu  wrote:
> On Wed, May 23, 2018 at 5:04 PM, Daniel Vetter  wrote:
>> On Tue, May 22, 2018 at 09:04:17AM +0800, Qiang Yu wrote:
>>> On Tue, May 22, 2018 at 3:37 AM, Eric Anholt  wrote:
>>> > Qiang Yu  writes:
>>> >
>>> >> This reverts commit 45c3d213a400c952ab7119f394c5293bb6877e6b.
>>> >>
>>> >> The lima driver needs preclose to wait for all tasks in the
>>> >> context created within the closing file to finish before freeing
>>> >> all the buffer objects. Otherwise pending tasks may fail and
>>> >> produce noisy MMU fault messages.
>>> >>
>>> >> Moving this wait to each buffer object's free function could
>>> >> achieve the same result, but some buffer objects are shared with
>>> >> other file contexts and we only want to wait for the closing file
>>> >> context's tasks. So the implementation is not as straightforward
>>> >> as the preclose one.
>>> >
>>> > You should just separate your MMU structures from drm_file, and have
>>> > drm_file and the jobs using it keep a reference on them.  This is what
>>> > I've done in V3D as well.
>>>
>>> It's not the VM/MMU struct that causes this problem, it's each buffer
>>> object getting freed before its task is done (postclose runs after
>>> buffer free). If you mean I should keep references to all of a task's
>>> buffers, that's not as simple as just waiting for the tasks to finish
>>> before freeing the buffers.
>>
>> Why can't you do that waiting in the postclose hook? If it's the lack of
>> reference-counting in your driver for gem bo, then I'd say you need to
>> roll out some reference counting. Relying on the implicit reference
>> provided by the core is kinda not so great (which was the reason I've
>> thrown out the preclose hook). There's also per-bo open/close hooks.
>
> It's possible to not use preclose, but the implementation is not as
> simple and straightforward as the preclose one, I think. There are two
> methods I can think of:
> 1. do the wait when the free-buffers callback unmaps the buffer from
> this process's lima VM (wait on the buffer reservation object); this is
> fine and simple, but there are cases where the buffer is shared between
> two processes, so the best way would be to wait only for fences from
> this process, so we'd better record fences for a "perfect wait"
> 2. keep a reference to the involved buffers for a task and unreference
> them when the task is done; also keep a reference to the buffer mapping
> in this process's lima VM (this is more complicated to implement)
>
> But if there's a preclose, we just wait for all this process's tasks to
> be done, then unmap/free the buffers; it's simple and straightforward.
> I'd like to hear if there's a better way that uses only postclose.

Refcount your buffers. Borrowing references from other places tends to
result in a maintenance headache with no end. So solution 2.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: [PATCH RFC 05/24] Revert "drm: Nerf the preclose callback for modern drivers"

2018-05-23 Thread Christian König

On 23.05.2018 15:13, Qiang Yu wrote:

On Wed, May 23, 2018 at 5:35 PM, Christian König
 wrote:

Well NAK, that brings back a callback we worked quite hard on getting rid
of.

It looks like the problem isn't that you need the preclose callback; you
rather seem to have misunderstood how TTM works.

All you need to do is to clean up your command submission path so that
the caller of lima_sched_context_queue_task() adds the resulting
scheduler fence to TTM's buffer objects.

You mean adding the finished dma fence to the buffer's reservation
object, then waiting on it before unmapping the buffer from the GPU VM
in drm_release()'s buffer close callback?


That is one possibility, but also not necessary.

TTM has a destroy callback which is called from a workqueue when all
fences on that BO have signaled.


Depending on your VM management you can use it to delay unmapping the 
buffer until it is actually not used any more.
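
For illustration, a destroy callback along those lines might look
roughly like this (lima_bo and lima_vm_unmap are assumed names; the
callback is the one a driver passes to ttm_bo_init()):

/* Runs from TTM's delayed-delete workqueue once all fences on the BO
 * have signaled, so the unmap here is naturally deferred until the GPU
 * is done with the buffer. */
static void lima_ttm_bo_destroy(struct ttm_buffer_object *tbo)
{
        struct lima_bo *bo = container_of(tbo, struct lima_bo, tbo);

        lima_vm_unmap(bo);
        kfree(bo);
}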



Adding the fence is done already, and I did wait on it before unmap. But
then I saw that when the buffer is shared between processes, the
"perfect wait" is to wait only on the fences from this process's tasks,
so it's better to also distinguish fences. If so, I just wonder why we
don't simply wait for this process's tasks in preclose, before
unmapping/freeing the buffers in drm_release()?


Well it depends on your VM management. When userspace expects that the
VM space the BO used is reusable immediately, then the TTM callback
won't work.


On the other hand you can just grab the list of fences on a BO and 
filter out the ones from your current process and wait for those. See 
amdgpu_sync_resv() as an example how to do that.


Christian.



Regards,
Qiang



On 18.05.2018 11:27, Qiang Yu wrote:

This reverts commit 45c3d213a400c952ab7119f394c5293bb6877e6b.

The lima driver needs preclose to wait for all tasks in the context
created within the closing file to finish before freeing all the buffer
objects. Otherwise pending tasks may fail and produce noisy MMU fault
messages.

Moving this wait to each buffer object's free function could achieve the
same result, but some buffer objects are shared with other file contexts
and we only want to wait for the closing file context's tasks. So the
implementation is not as straightforward as the preclose one.

Signed-off-by: Qiang Yu 
---
 drivers/gpu/drm/drm_file.c |  8 ++++----
 include/drm/drm_drv.h      | 23 +++++++++++++++++++++--
 2 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index e394799979a6..0a43107396b9 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -361,8 +361,9 @@ void drm_lastclose(struct drm_device * dev)
  *
  * This function must be used by drivers as their &file_operations.release
  * method. It frees any resources associated with the open file, and calls the
- * &drm_driver.postclose driver callback. If this is the last open file for the
- * DRM device also proceeds to call the &drm_driver.lastclose driver callback.
+ * &drm_driver.preclose and &drm_driver.lastclose driver callbacks. If this is
+ * the last open file for the DRM device also proceeds to call the
+ * &drm_driver.lastclose driver callback.
  *
  * RETURNS:
  *
@@ -382,8 +383,7 @@ int drm_release(struct inode *inode, struct file *filp)
 	list_del(&file_priv->lhead);
 	mutex_unlock(&dev->filelist_mutex);

-	if (drm_core_check_feature(dev, DRIVER_LEGACY) &&
-	    dev->driver->preclose)
+	if (dev->driver->preclose)
 		dev->driver->preclose(dev, file_priv);

 	/*
diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index d23dcdd1bd95..8d6080f97ed4 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -107,6 +107,23 @@ struct drm_driver {
 	 */
 	int (*open) (struct drm_device *, struct drm_file *);

+	/**
+	 * @preclose:
+	 *
+	 * One of the driver callbacks when a new &struct drm_file is closed.
+	 * Useful for tearing down driver-private data structures allocated in
+	 * @open like buffer allocators, execution contexts or similar things.
+	 *
+	 * Since the display/modeset side of DRM can only be owned by exactly
+	 * one &struct drm_file (see &drm_file.is_master and &drm_device.master)
+	 * there should never be a need to tear down any modeset related
+	 * resources in this callback. Doing so would be a driver design bug.
+	 *
+	 * FIXME: It is not really clear why there's both @preclose and
+	 * @postclose. Without a really good reason, use @postclose only.
+	 */
+	void (*preclose) (struct drm_device *, struct drm_file *file_priv);
+
 	/**
 	 * @postclose:
 	 *
@@ -118,6 +135,9 @@ struct drm_driver {
 	 * one &struct drm_file (see &drm_file.is_master and &drm_device.master)
 	 * there should never be a need to tear down any modeset related

Re: [PATCH RFC 05/24] Revert "drm: Nerf the preclose callback for modern drivers"

2018-05-23 Thread Qiang Yu
On Wed, May 23, 2018 at 5:35 PM, Christian König
 wrote:
> Well NAK, that brings back a callback we worked quite hard on getting
> rid of.
>
> It looks like the problem isn't that you need the preclose callback;
> you rather seem to have misunderstood how TTM works.
>
> All you need to do is to clean up your command submission path so that
> the caller of lima_sched_context_queue_task() adds the resulting
> scheduler fence to TTM's buffer objects.

You mean adding the finished dma fence to the buffer's reservation
object, then waiting on it before unmapping the buffer from the GPU VM
in drm_release()'s buffer close callback?

Adding the fence is done already, and I did wait on it before unmap. But
then I saw that when the buffer is shared between processes, the
"perfect wait" is to wait only on the fences from this process's tasks,
so it's better to also distinguish fences. If so, I just wonder why we
don't simply wait for this process's tasks in preclose, before
unmapping/freeing the buffers in drm_release()?

Regards,
Qiang

>
>
> On 18.05.2018 11:27, Qiang Yu wrote:
>>
>> This reverts commit 45c3d213a400c952ab7119f394c5293bb6877e6b.
>>
>> The lima driver needs preclose to wait for all tasks in the context
>> created within the closing file to finish before freeing all the
>> buffer objects. Otherwise pending tasks may fail and produce noisy MMU
>> fault messages.
>>
>> Moving this wait to each buffer object's free function could achieve
>> the same result, but some buffer objects are shared with other file
>> contexts and we only want to wait for the closing file context's
>> tasks. So the implementation is not as straightforward as the preclose
>> one.
>>
>> Signed-off-by: Qiang Yu 
>> ---
>>  drivers/gpu/drm/drm_file.c |  8 ++++----
>>  include/drm/drm_drv.h      | 23 +++++++++++++++++++++--
>>  2 files changed, 25 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
>> index e394799979a6..0a43107396b9 100644
>> --- a/drivers/gpu/drm/drm_file.c
>> +++ b/drivers/gpu/drm/drm_file.c
>> @@ -361,8 +361,9 @@ void drm_lastclose(struct drm_device * dev)
>>   *
>>   * This function must be used by drivers as their &file_operations.release
>>   * method. It frees any resources associated with the open file, and calls the
>> - * &drm_driver.postclose driver callback. If this is the last open file for the
>> - * DRM device also proceeds to call the &drm_driver.lastclose driver callback.
>> + * &drm_driver.preclose and &drm_driver.lastclose driver callbacks. If this is
>> + * the last open file for the DRM device also proceeds to call the
>> + * &drm_driver.lastclose driver callback.
>>   *
>>   * RETURNS:
>>   *
>> @@ -382,8 +383,7 @@ int drm_release(struct inode *inode, struct file *filp)
>>  	list_del(&file_priv->lhead);
>>  	mutex_unlock(&dev->filelist_mutex);
>>
>> -	if (drm_core_check_feature(dev, DRIVER_LEGACY) &&
>> -	    dev->driver->preclose)
>> +	if (dev->driver->preclose)
>>  		dev->driver->preclose(dev, file_priv);
>>
>>  	/*
>> diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
>> index d23dcdd1bd95..8d6080f97ed4 100644
>> --- a/include/drm/drm_drv.h
>> +++ b/include/drm/drm_drv.h
>> @@ -107,6 +107,23 @@ struct drm_driver {
>>  	 */
>>  	int (*open) (struct drm_device *, struct drm_file *);
>>
>> +	/**
>> +	 * @preclose:
>> +	 *
>> +	 * One of the driver callbacks when a new &struct drm_file is closed.
>> +	 * Useful for tearing down driver-private data structures allocated in
>> +	 * @open like buffer allocators, execution contexts or similar things.
>> +	 *
>> +	 * Since the display/modeset side of DRM can only be owned by exactly
>> +	 * one &struct drm_file (see &drm_file.is_master and &drm_device.master)
>> +	 * there should never be a need to tear down any modeset related
>> +	 * resources in this callback. Doing so would be a driver design bug.
>> +	 *
>> +	 * FIXME: It is not really clear why there's both @preclose and
>> +	 * @postclose. Without a really good reason, use @postclose only.
>> +	 */
>> +	void (*preclose) (struct drm_device *, struct drm_file *file_priv);
>> +
>>  	/**
>>  	 * @postclose:
>>  	 *
>> @@ -118,6 +135,9 @@ struct drm_driver {
>>  	 * one &struct drm_file (see &drm_file.is_master and &drm_device.master)
>>  	 * there should never be a need to tear down any modeset related
>>  	 * resources in this callback. Doing so would be a driver design bug.
>> +	 *
>> +	 * FIXME: It is not really clear why there's both @preclose and
>> +	 * @postclose. Without a really good reason, use @postclose only.
>>  	 */
>>  	void (*postclose) (struct drm_device *, struct drm_file *);
>>
>> @@ -134,7 +154,7 @@ struct drm_driver {
>>  	 * state changes, e.g. in conjunction with the

Re: [PATCH RFC 05/24] Revert "drm: Nerf the preclose callback for modern drivers"

2018-05-23 Thread Qiang Yu
On Wed, May 23, 2018 at 5:04 PM, Daniel Vetter  wrote:
> On Tue, May 22, 2018 at 09:04:17AM +0800, Qiang Yu wrote:
>> On Tue, May 22, 2018 at 3:37 AM, Eric Anholt  wrote:
>> > Qiang Yu  writes:
>> >
>> >> This reverts commit 45c3d213a400c952ab7119f394c5293bb6877e6b.
>> >>
>> >> The lima driver needs preclose to wait for all tasks in the context
>> >> created within the closing file to finish before freeing all the
>> >> buffer objects. Otherwise pending tasks may fail and produce noisy
>> >> MMU fault messages.
>> >>
>> >> Moving this wait to each buffer object's free function could
>> >> achieve the same result, but some buffer objects are shared with
>> >> other file contexts and we only want to wait for the closing file
>> >> context's tasks. So the implementation is not as straightforward as
>> >> the preclose one.
>> >
>> > You should just separate your MMU structures from drm_file, and have
>> > drm_file and the jobs using it keep a reference on them.  This is what
>> > I've done in V3D as well.
>>
>> It's not the VM/MMU struct that causes this problem, it's each buffer
>> object getting freed before its task is done (postclose runs after
>> buffer free). If you mean I should keep references to all of a task's
>> buffers, that's not as simple as just waiting for the tasks to finish
>> before freeing the buffers.
>
> Why can't you do that waiting in the postclose hook? If it's the lack of
> reference-counting in your driver for gem bo, then I'd say you need to
> roll out some reference counting. Relying on the implicit reference
> provided by the core is kinda not so great (which was the reason I've
> thrown out the preclose hook). There's also per-bo open/close hooks.

It's possible to not use preclose, but the implementation is not as
simple and straightforward as the preclose one, I think. There are two
methods I can think of:
1. do the wait when the free-buffers callback unmaps the buffer from
this process's lima VM (wait on the buffer reservation object); this is
fine and simple, but there are cases where the buffer is shared between
two processes, so the best way would be to wait only for fences from
this process, so we'd better record fences for a "perfect wait"
2. keep a reference to the involved buffers for a task and unreference
them when the task is done; also keep a reference to the buffer mapping
in this process's lima VM (this is more complicated to implement)

But if there's a preclose, we just wait for all this process's tasks to
be done, then unmap/free the buffers; it's simple and straightforward.
I'd like to hear if there's a better way that uses only postclose.

Regards,
Qiang


Re: [PATCH RFC 05/24] Revert "drm: Nerf the preclose callback for modern drivers"

2018-05-23 Thread Christian König
Well NAK, that brings back a callback we worked quite hard on getting
rid of.

It looks like the problem isn't that you need the preclose callback; you
rather seem to have misunderstood how TTM works.

All you need to do is to clean up your command submission path so that
the caller of lima_sched_context_queue_task() adds the resulting
scheduler fence to TTM's buffer objects.
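
Sketched out, that submission flow would be roughly the following
(assuming lima_sched_context_queue_task() returns the task's finished
fence and that the BOs' reservation objects are locked at this point;
the surrounding names are assumptions, not lima's actual code):

static void lima_submit_task(struct lima_sched_context *context,
                             struct lima_sched_task *task)
{
        struct dma_fence *fence;
        int i;

        /* Queue the task and get back the scheduler's finished fence. */
        fence = lima_sched_context_queue_task(context, task);

        /* Attach the fence to every BO the task uses, so whoever frees
         * or evicts a BO later has to wait for the task first. */
        for (i = 0; i < task->num_bos; i++)
                reservation_object_add_excl_fence(task->bos[i]->resv,
                                                  fence);

        dma_fence_put(fence);
}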


Regards,
Christian.

On 18.05.2018 11:27, Qiang Yu wrote:

This reverts commit 45c3d213a400c952ab7119f394c5293bb6877e6b.

The lima driver needs preclose to wait for all tasks in the context
created within the closing file to finish before freeing all the buffer
objects. Otherwise pending tasks may fail and produce noisy MMU fault
messages.

Moving this wait to each buffer object's free function could achieve the
same result, but some buffer objects are shared with other file contexts
and we only want to wait for the closing file context's tasks. So the
implementation is not as straightforward as the preclose one.

Signed-off-by: Qiang Yu 
---
 drivers/gpu/drm/drm_file.c |  8 ++++----
 include/drm/drm_drv.h      | 23 +++++++++++++++++++++--
 2 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index e394799979a6..0a43107396b9 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -361,8 +361,9 @@ void drm_lastclose(struct drm_device * dev)
  *
  * This function must be used by drivers as their &file_operations.release
  * method. It frees any resources associated with the open file, and calls the
- * &drm_driver.postclose driver callback. If this is the last open file for the
- * DRM device also proceeds to call the &drm_driver.lastclose driver callback.
+ * &drm_driver.preclose and &drm_driver.lastclose driver callbacks. If this is
+ * the last open file for the DRM device also proceeds to call the
+ * &drm_driver.lastclose driver callback.
  *
  * RETURNS:
  *
@@ -382,8 +383,7 @@ int drm_release(struct inode *inode, struct file *filp)
 	list_del(&file_priv->lhead);
 	mutex_unlock(&dev->filelist_mutex);
 
-	if (drm_core_check_feature(dev, DRIVER_LEGACY) &&
-	    dev->driver->preclose)
+	if (dev->driver->preclose)
 		dev->driver->preclose(dev, file_priv);
 
 	/* 
diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index d23dcdd1bd95..8d6080f97ed4 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -107,6 +107,23 @@ struct drm_driver {
 	 */
 	int (*open) (struct drm_device *, struct drm_file *);
 
+	/**
+	 * @preclose:
+	 *
+	 * One of the driver callbacks when a new &drm_file is closed.
+	 * Useful for tearing down driver-private data structures allocated in
+	 * @open like buffer allocators, execution contexts or similar things.
+	 *
+	 * Since the display/modeset side of DRM can only be owned by exactly
+	 * one &drm_file (see &drm_file.is_master and &drm_device.master)
+	 * there should never be a need to tear down any modeset related
+	 * resources in this callback. Doing so would be a driver design bug.
+	 *
+	 * FIXME: It is not really clear why there's both @preclose and
+	 * @postclose. Without a really good reason, use @postclose only.
+	 */
+	void (*preclose) (struct drm_device *, struct drm_file *file_priv);
+
 	/**
 	 * @postclose:
 	 *
@@ -118,6 +135,9 @@ struct drm_driver {
 	 * one &drm_file (see &drm_file.is_master and &drm_device.master)
 	 * there should never be a need to tear down any modeset related
 	 * resources in this callback. Doing so would be a driver design bug.
+	 *
+	 * FIXME: It is not really clear why there's both @preclose and
+	 * @postclose. Without a really good reason, use @postclose only.
 	 */
 	void (*postclose) (struct drm_device *, struct drm_file *);
 
@@ -134,7 +154,7 @@ struct drm_driver {
 	 * state changes, e.g. in conjunction with the :ref:`vga_switcheroo`
 	 * infrastructure.
 	 *
-	 * This is called after @postclose hook has been called.
+	 * This is called after @preclose and @postclose have been called.
 	 *
 	 * NOTE:
 	 *
@@ -601,7 +621,6 @@ struct drm_driver {
 	/* List of devices hanging off this driver with stealth attach. */
 	struct list_head legacy_dev_list;
 	int (*firstopen) (struct drm_device *);
-	void (*preclose) (struct drm_device *, struct drm_file *file_priv);
 	int (*dma_ioctl) (struct drm_device *dev, void *data, struct drm_file *file_priv);
 	int (*dma_quiescent) (struct drm_device *);
 	int (*context_dtor) (struct drm_device *dev, int context);


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH RFC 05/24] Revert "drm: Nerf the preclose callback for modern drivers"

2018-05-23 Thread Daniel Vetter
On Tue, May 22, 2018 at 09:04:17AM +0800, Qiang Yu wrote:
> On Tue, May 22, 2018 at 3:37 AM, Eric Anholt  wrote:
> > Qiang Yu  writes:
> >
> >> This reverts commit 45c3d213a400c952ab7119f394c5293bb6877e6b.
> >>
> >> The lima driver needs preclose to wait for all tasks in the
> >> contexts created within the closing file to finish before
> >> freeing all the buffer objects. Otherwise pending tasks may
> >> fail and emit noisy MMU fault messages.
> >>
> >> Moving this wait into each buffer object's free function can
> >> achieve the same result, but some buffer objects are shared
> >> with other file contexts, and we only want to wait for the
> >> closing file context's tasks. So that implementation is not
> >> as straightforward as the preclose one.
> >
> > You should just separate your MMU structures from drm_file, and have the
> > drm_file and the jobs using them each keep a reference.  This is what
> > I've done in V3D as well.
> 
> It's not the VM/MMU struct that causes this problem, it's each buffer
> object getting freed before its task is done (postclose runs after the
> buffers are freed). If you mean I should keep references to all of a
> task's buffers, that's not as simple as just waiting for tasks to
> finish before freeing buffers.

Why can't you do that waiting in the postclose hook? If it's the lack of
reference counting in your driver for gem bos, then I'd say you need to
roll out some reference counting. Relying on the implicit reference
provided by the core is kinda not so great (which was the reason I threw
out the preclose hook). There are also per-bo open/close hooks.
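
A sketch of that kind of refcounting, with invented names (the real
structures would be driver-specific):

#include <drm/drm_gem.h>

struct lima_sched_task {		/* invented for illustration */
	struct drm_gem_object **bos;
	int num_bos;
};

/* Submit path: hold a GEM reference per bo for the job's lifetime. */
static void lima_task_get_bos(struct lima_sched_task *task)
{
	int i;

	for (i = 0; i < task->num_bos; i++)
		drm_gem_object_get(task->bos[i]);
}

/* Job-done path: drop the references. Only now can a file close (or
 * anything else) actually free a bo the hardware was still using.
 */
static void lima_task_put_bos(struct lima_sched_task *task)
{
	int i;

	for (i = 0; i < task->num_bos; i++)
		drm_gem_object_put_unlocked(task->bos[i]);
}
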
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH RFC 05/24] Revert "drm: Nerf the preclose callback for modern drivers"

2018-05-21 Thread Qiang Yu
On Tue, May 22, 2018 at 3:37 AM, Eric Anholt  wrote:
> Qiang Yu  writes:
>
>> This reverts commit 45c3d213a400c952ab7119f394c5293bb6877e6b.
>>
>> The lima driver needs preclose to wait for all tasks in the
>> contexts created within the closing file to finish before
>> freeing all the buffer objects. Otherwise pending tasks may
>> fail and emit noisy MMU fault messages.
>>
>> Moving this wait into each buffer object's free function can
>> achieve the same result, but some buffer objects are shared
>> with other file contexts, and we only want to wait for the
>> closing file context's tasks. So that implementation is not
>> as straightforward as the preclose one.
>
> You should just separate your MMU structures from drm_file, and have the
> drm_file and the jobs using them each keep a reference.  This is what
> I've done in V3D as well.

It's not the VM/MMU struct that causes this problem, it's each buffer
object getting freed before its task is done (postclose runs after the
buffers are freed). If you mean I should keep references to all of a
task's buffers, that's not as simple as just waiting for tasks to
finish before freeing buffers.

Regards,
Qiang
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH RFC 05/24] Revert "drm: Nerf the preclose callback for modern drivers"

2018-05-21 Thread Eric Anholt
Qiang Yu  writes:

> This reverts commit 45c3d213a400c952ab7119f394c5293bb6877e6b.
>
> The lima driver needs preclose to wait for all tasks in the
> contexts created within the closing file to finish before
> freeing all the buffer objects. Otherwise pending tasks may
> fail and emit noisy MMU fault messages.
>
> Moving this wait into each buffer object's free function can
> achieve the same result, but some buffer objects are shared
> with other file contexts, and we only want to wait for the
> closing file context's tasks. So that implementation is not
> as straightforward as the preclose one.

You should just separate your MMU structures from drm_file, and have the
drm_file and the jobs using them each keep a reference.  This is what
I've done in V3D as well.
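
A sketch of that shape, with invented names: the per-client VM becomes a
kref'd object referenced by both the drm_file and every in-flight job,
so the page tables outlive the file while jobs are still pending.

#include <linux/kref.h>
#include <linux/slab.h>

struct lima_vm {			/* invented per-client VM object */
	struct kref refcount;
	/* page tables, bo mappings, ... */
};

static void lima_vm_release(struct kref *kref)
{
	struct lima_vm *vm = container_of(kref, struct lima_vm, refcount);

	/* Tear down page tables here; runs only when the last job and
	 * the drm_file have all dropped their references.
	 */
	kfree(vm);
}

static struct lima_vm *lima_vm_get(struct lima_vm *vm)
{
	kref_get(&vm->refcount);
	return vm;
}

static void lima_vm_put(struct lima_vm *vm)
{
	kref_put(&vm->refcount, lima_vm_release);
}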


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH RFC 05/24] Revert "drm: Nerf the preclose callback for modern drivers"

2018-05-21 Thread Qiang Yu
This reverts commit 45c3d213a400c952ab7119f394c5293bb6877e6b.

The lima driver needs preclose to wait for all tasks in the
contexts created within the closing file to finish before
freeing all the buffer objects. Otherwise pending tasks may
fail and emit noisy MMU fault messages.

Moving this wait into each buffer object's free function can
achieve the same result, but some buffer objects are shared
with other file contexts, and we only want to wait for the
closing file context's tasks. So that implementation is not
as straightforward as the preclose one.

Signed-off-by: Qiang Yu 
---
 drivers/gpu/drm/drm_file.c |  8 ++++----
 include/drm/drm_drv.h      | 23 +++++++++++++++++++++--
 2 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index e394799979a6..0a43107396b9 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -361,8 +361,9 @@ void drm_lastclose(struct drm_device * dev)
  *
  * This function must be used by drivers as their &file_operations.release
  * method. It frees any resources associated with the open file, and calls the
- * &drm_driver.postclose driver callback. If this is the last open file for the
- * DRM device also proceeds to call the &drm_driver.lastclose driver callback.
+ * &drm_driver.preclose and &drm_driver.lastclose driver callbacks. If this is
+ * the last open file for the DRM device also proceeds to call the
+ * &drm_driver.lastclose driver callback.
  *
  * RETURNS:
  *
@@ -382,8 +383,7 @@ int drm_release(struct inode *inode, struct file *filp)
 	list_del(&file_priv->lhead);
 	mutex_unlock(&dev->filelist_mutex);
 
-	if (drm_core_check_feature(dev, DRIVER_LEGACY) &&
-	    dev->driver->preclose)
+	if (dev->driver->preclose)
 		dev->driver->preclose(dev, file_priv);
 
 	/* 
diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index d23dcdd1bd95..8d6080f97ed4 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -107,6 +107,23 @@ struct drm_driver {
 	 */
 	int (*open) (struct drm_device *, struct drm_file *);
 
+	/**
+	 * @preclose:
+	 *
+	 * One of the driver callbacks when a new &drm_file is closed.
+	 * Useful for tearing down driver-private data structures allocated in
+	 * @open like buffer allocators, execution contexts or similar things.
+	 *
+	 * Since the display/modeset side of DRM can only be owned by exactly
+	 * one &drm_file (see &drm_file.is_master and &drm_device.master)
+	 * there should never be a need to tear down any modeset related
+	 * resources in this callback. Doing so would be a driver design bug.
+	 *
+	 * FIXME: It is not really clear why there's both @preclose and
+	 * @postclose. Without a really good reason, use @postclose only.
+	 */
+	void (*preclose) (struct drm_device *, struct drm_file *file_priv);
+
 	/**
 	 * @postclose:
 	 *
@@ -118,6 +135,9 @@ struct drm_driver {
 	 * one &drm_file (see &drm_file.is_master and &drm_device.master)
 	 * there should never be a need to tear down any modeset related
 	 * resources in this callback. Doing so would be a driver design bug.
+	 *
+	 * FIXME: It is not really clear why there's both @preclose and
+	 * @postclose. Without a really good reason, use @postclose only.
 	 */
 	void (*postclose) (struct drm_device *, struct drm_file *);
 
@@ -134,7 +154,7 @@ struct drm_driver {
 	 * state changes, e.g. in conjunction with the :ref:`vga_switcheroo`
 	 * infrastructure.
 	 *
-	 * This is called after @postclose hook has been called.
+	 * This is called after @preclose and @postclose have been called.
 	 *
 	 * NOTE:
 	 *
@@ -601,7 +621,6 @@ struct drm_driver {
 	/* List of devices hanging off this driver with stealth attach. */
 	struct list_head legacy_dev_list;
 	int (*firstopen) (struct drm_device *);
-	void (*preclose) (struct drm_device *, struct drm_file *file_priv);
 	int (*dma_ioctl) (struct drm_device *dev, void *data, struct drm_file *file_priv);
 	int (*dma_quiescent) (struct drm_device *);
 	int (*context_dtor) (struct drm_device *dev, int context);
-- 
2.17.0

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH RFC 05/24] Revert "drm: Nerf the preclose callback for modern drivers"

2018-05-19 Thread Qiang Yu
This reverts commit 45c3d213a400c952ab7119f394c5293bb6877e6b.

The lima driver needs preclose to wait for all tasks in the
contexts created within the closing file to finish before
freeing all the buffer objects. Otherwise pending tasks may
fail and emit noisy MMU fault messages.

Moving this wait into each buffer object's free function can
achieve the same result, but some buffer objects are shared
with other file contexts, and we only want to wait for the
closing file context's tasks. So that implementation is not
as straightforward as the preclose one.

Signed-off-by: Qiang Yu 
---
 drivers/gpu/drm/drm_file.c |  8 ++++----
 include/drm/drm_drv.h      | 23 +++++++++++++++++++++--
 2 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index e394799979a6..0a43107396b9 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -361,8 +361,9 @@ void drm_lastclose(struct drm_device * dev)
  *
  * This function must be used by drivers as their &file_operations.release
  * method. It frees any resources associated with the open file, and calls the
- * &drm_driver.postclose driver callback. If this is the last open file for the
- * DRM device also proceeds to call the &drm_driver.lastclose driver callback.
+ * &drm_driver.preclose and &drm_driver.lastclose driver callbacks. If this is
+ * the last open file for the DRM device also proceeds to call the
+ * &drm_driver.lastclose driver callback.
  *
  * RETURNS:
  *
@@ -382,8 +383,7 @@ int drm_release(struct inode *inode, struct file *filp)
 	list_del(&file_priv->lhead);
 	mutex_unlock(&dev->filelist_mutex);
 
-	if (drm_core_check_feature(dev, DRIVER_LEGACY) &&
-	    dev->driver->preclose)
+	if (dev->driver->preclose)
 		dev->driver->preclose(dev, file_priv);
 
 	/* 
diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index d23dcdd1bd95..8d6080f97ed4 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -107,6 +107,23 @@ struct drm_driver {
 	 */
 	int (*open) (struct drm_device *, struct drm_file *);
 
+	/**
+	 * @preclose:
+	 *
+	 * One of the driver callbacks when a new &drm_file is closed.
+	 * Useful for tearing down driver-private data structures allocated in
+	 * @open like buffer allocators, execution contexts or similar things.
+	 *
+	 * Since the display/modeset side of DRM can only be owned by exactly
+	 * one &drm_file (see &drm_file.is_master and &drm_device.master)
+	 * there should never be a need to tear down any modeset related
+	 * resources in this callback. Doing so would be a driver design bug.
+	 *
+	 * FIXME: It is not really clear why there's both @preclose and
+	 * @postclose. Without a really good reason, use @postclose only.
+	 */
+	void (*preclose) (struct drm_device *, struct drm_file *file_priv);
+
 	/**
 	 * @postclose:
 	 *
@@ -118,6 +135,9 @@ struct drm_driver {
 	 * one &drm_file (see &drm_file.is_master and &drm_device.master)
 	 * there should never be a need to tear down any modeset related
 	 * resources in this callback. Doing so would be a driver design bug.
+	 *
+	 * FIXME: It is not really clear why there's both @preclose and
+	 * @postclose. Without a really good reason, use @postclose only.
 	 */
 	void (*postclose) (struct drm_device *, struct drm_file *);
 
@@ -134,7 +154,7 @@ struct drm_driver {
 	 * state changes, e.g. in conjunction with the :ref:`vga_switcheroo`
 	 * infrastructure.
 	 *
-	 * This is called after @postclose hook has been called.
+	 * This is called after @preclose and @postclose have been called.
 	 *
 	 * NOTE:
 	 *
@@ -601,7 +621,6 @@ struct drm_driver {
 	/* List of devices hanging off this driver with stealth attach. */
 	struct list_head legacy_dev_list;
 	int (*firstopen) (struct drm_device *);
-	void (*preclose) (struct drm_device *, struct drm_file *file_priv);
 	int (*dma_ioctl) (struct drm_device *dev, void *data, struct drm_file *file_priv);
 	int (*dma_quiescent) (struct drm_device *);
 	int (*context_dtor) (struct drm_device *dev, int context);
-- 
2.17.0

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel