TTM Locking order of bo::reserve -> vm::mmap_sem

2013-10-21 Thread Thomas Hellstrom
On 10/21/2013 03:21 PM, Maarten Lankhorst wrote:
> op 21-10-13 13:10, Thomas Hellstrom schreef:
>> On 10/21/2013 12:24 PM, Maarten Lankhorst wrote:
>>> op 21-10-13 12:10, Thomas Hellstrom schreef:
 On 10/21/2013 11:48 AM, Maarten Lankhorst wrote:
> op 21-10-13 11:37, Thomas Hellstrom schreef:
>> On 10/21/2013 11:01 AM, Maarten Lankhorst wrote:
>>> op 21-10-13 10:48, Thomas Hellstrom schreef:
 Hi!

 As discussed previously, the current locking order in TTM of these
 locks is bo::reserve -> vm::mmap_sem. This leads to a hack in the TTM
 fault() handler to try to invert the locking order: if a tryreserve
 failed, we tried to have the vm code release the mmap_sem() and then
 schedule, to give the holder of bo::reserve a chance to release the
 lock. This solution is no longer legal, since we've been more or less
 kindly asked to remove the set_need_resched() call.

 Maarten has proposed to invert the locking order. I've previously
 said I had no strong preference. The current locking order dates back
 to the time when TTM wasn't using unmap_mapping_range() but walked
 the page tables itself, updating PTEs as needed. Furthermore, it was
 needed for user bos that used get_user_pages() in the TTM populate
 and swap-in methods. User bos were removed some time ago, but I'm
 looking at re-adding them. They would suit the VMware model of
 cached-only pages very well. I see uses in the Gallium API, XA's DMA
 functionality and OpenCL.

 We would then need a somewhat nicer way to invert the locking order.
 I've attached a solution that ups the mmap_sem and then reserves, but
 due to how the fault API is done, we then need to release the reserve
 and retry the fault. This of course opens the door to starvation, but
 I don't think starvation at this point is very likely: one thread
 being refused read or write access to a buffer object because the GPU
 is continuously busy with it. If this *would* become a problem, it's
 probably possible to modify the fault code to allow us to hold locks
 until the retried fault, but that would be a bit invasive, since it
 touches the arch code.

 Basically I'm proposing to keep the current locking order.
>>> I'm not sure why we have to worry about mmap_sem lock being taken 
>>> before bo::reserve. If we already hold mmap_sem,
>>> no extra locking is needed for get_user_pages.
>> Typically, they are populated outside of fault, as part of execbuf, 
>> where we don't hold and don't want to hold mmap_sem(). In fact,
>> user bo's should not be remappable through the TTM VM system. Anyway, we 
>> need to grab the mmap_sem inside ttm_populate for user buffers.
> If we don't allow mmapping user bo's through TTM, we can use special 
> lockdep annotation when user-bo's are used. Normal bo's would have
> mmap_sem outer lock, bo::reserve inner lock, while those bo's would have 
> the other way around.
>
> This might complicate validation a little, since you would have to 
> reserve and validate all user-bo's before any normal bo's are reserved. 
> But since this
> is meant to be a vmwgfx specific optimization I think it might be worth 
> it.
 Would that work (lockdep-wise) with user BO swapout as part of a normal BO 
 validation? During user BO swapout, we don't need mmap_sem, but the BO 
 needs to be reserved. But I guess it would work, since we use tryreserve 
 when walking the LRU lists?
>>> Correct.
>>>
>>> Releasing it is a bit silly. I think we should keep mmap_sem as the
>>> outer lock, and have bo::reserve as the inner, even if it might
>>> complicate support for user-bo's. I'm not sure what you can do with
>>> user-bo's that can't be done by allocating the same bo from the
>>> kernel first and mapping + populating it.
>>>
>>> ~Maarten
>> Using a DMA API analogy, user BOs correspond to using streaming DMA,
>> whereas normal BOs correspond to allocated DMA memory buffers.
>> We boost performance and save resources.
> Yeah but it's vmwgfx specific. Nouveau and radeon have dedicated copy 
> engines that can be used. Flushing the vm and stalling is probably more
> expensive than performing a memcpy.
 In the end, I'm not sure it will be vmwgfx specific once there is a
 working example, and there are user-space APIs that will benefit from
 it. There are other examples out there today that use streaming DMA
 to feed the DMA engines, although not through TTM; see for example
 via_dmablit.c.

 Also if we need a separate locking order for User BOs, what would be the 
 big benefit of having the locking order mmap_sem()->bo_reserve() ?
>>> Deterministic 

TTM Locking order of bo::reserve -> vm::mmap_sem

2013-10-21 Thread Maarten Lankhorst
op 21-10-13 13:10, Thomas Hellstrom schreef:
> On 10/21/2013 12:24 PM, Maarten Lankhorst wrote:
>> op 21-10-13 12:10, Thomas Hellstrom schreef:
>>> On 10/21/2013 11:48 AM, Maarten Lankhorst wrote:
 op 21-10-13 11:37, Thomas Hellstrom schreef:
> On 10/21/2013 11:01 AM, Maarten Lankhorst wrote:
>> op 21-10-13 10:48, Thomas Hellstrom schreef:
>>> Hi!
>>>
>>> As discussed previously, the current locking order in TTM of these
>>> locks is bo::reserve -> vm::mmap_sem. This leads to a hack in the TTM
>>> fault() handler to try to invert the locking order: if a tryreserve
>>> failed, we tried to have the vm code release the mmap_sem() and then
>>> schedule, to give the holder of bo::reserve a chance to release the
>>> lock. This solution is no longer legal, since we've been more or less
>>> kindly asked to remove the set_need_resched() call.
>>>
>>> Maarten has proposed to invert the locking order. I've previously
>>> said I had no strong preference. The current locking order dates back
>>> to the time when TTM wasn't using unmap_mapping_range() but walked
>>> the page tables itself, updating PTEs as needed. Furthermore, it was
>>> needed for user bos that used get_user_pages() in the TTM populate
>>> and swap-in methods. User bos were removed some time ago, but I'm
>>> looking at re-adding them. They would suit the VMware model of
>>> cached-only pages very well. I see uses in the Gallium API, XA's DMA
>>> functionality and OpenCL.
>>>
>>> We would then need a somewhat nicer way to invert the locking order.
>>> I've attached a solution that ups the mmap_sem and then reserves, but
>>> due to how the fault API is done, we then need to release the reserve
>>> and retry the fault. This of course opens the door to starvation, but
>>> I don't think starvation at this point is very likely: one thread
>>> being refused read or write access to a buffer object because the GPU
>>> is continuously busy with it. If this *would* become a problem, it's
>>> probably possible to modify the fault code to allow us to hold locks
>>> until the retried fault, but that would be a bit invasive, since it
>>> touches the arch code.
>>>
>>> Basically I'm proposing to keep the current locking order.
>> I'm not sure why we have to worry about mmap_sem lock being taken before 
>> bo::reserve. If we already hold mmap_sem,
>> no extra locking is needed for get_user_pages.
> Typically, they are populated outside of fault, as part of execbuf, where 
> we don't hold and don't want to hold mmap_sem(). In fact,
> user bo's should not be remappable through the TTM VM system. Anyway, we 
> need to grab the mmap_sem inside ttm_populate for user buffers.
 If we don't allow mmapping user bo's through TTM, we can use special 
 lockdep annotation when user-bo's are used. Normal bo's would have
 mmap_sem outer lock, bo::reserve inner lock, while those bo's would have 
 the other way around.

 This might complicate validation a little, since you would have to reserve 
 and validate all user-bo's before any normal bo's are reserved. But since 
 this
 is meant to be a vmwgfx specific optimization I think it might be worth it.
>>> Would that work (lockdep-wise) with user BO swapout as part of a normal BO 
>>> validation? During user BO swapout, we don't need mmap_sem, but the BO 
>>> needs to be reserved. But I guess it would work, since we use tryreserve 
>>> when walking the LRU lists?
>> Correct.
>>
>> Releasing it is a bit silly. I think we should keep mmap_sem as the
>> outer lock, and have bo::reserve as the inner, even if it might
>> complicate support for user-bo's. I'm not sure what you can do with
>> user-bo's that can't be done by allocating the same bo from the
>> kernel first and mapping + populating it.
>>
>> ~Maarten
> Using a DMA API analogy, user BOs correspond to using streaming DMA,
> whereas normal BOs correspond to allocated DMA memory buffers.
> We boost performance and save resources.
 Yeah but it's vmwgfx specific. Nouveau and radeon have dedicated copy 
 engines that can be used. Flushing the vm and stalling is probably more
 expensive than performing a memcpy.
>>> In the end, I'm not sure it will be vmwgfx specific once there is a
>>> working example, and there are user-space APIs that will benefit from
>>> it. There are other examples out there today that use streaming DMA
>>> to feed the DMA engines, although not through TTM; see for example
>>> via_dmablit.c.
>>>
>>> Also if we need a separate locking order for User BOs, what would be the 
>>> big benefit of having the locking order mmap_sem()->bo_reserve() ?
>> Deterministic locking. I fear livelocks could happen otherwise with
>> the right timing, especially but not exclusively on -rt kernels.
> But livelocks 

TTM Locking order of bo::reserve -> vm::mmap_sem

2013-10-21 Thread Thomas Hellstrom
On 10/21/2013 12:24 PM, Maarten Lankhorst wrote:
> op 21-10-13 12:10, Thomas Hellstrom schreef:
>> On 10/21/2013 11:48 AM, Maarten Lankhorst wrote:
>>> op 21-10-13 11:37, Thomas Hellstrom schreef:
 On 10/21/2013 11:01 AM, Maarten Lankhorst wrote:
> op 21-10-13 10:48, Thomas Hellstrom schreef:
>> Hi!
>>
>> As discussed previously, the current locking order in TTM of these
>> locks is bo::reserve -> vm::mmap_sem. This leads to a hack in the TTM
>> fault() handler to try to invert the locking order: if a tryreserve
>> failed, we tried to have the vm code release the mmap_sem() and then
>> schedule, to give the holder of bo::reserve a chance to release the
>> lock. This solution is no longer legal, since we've been more or less
>> kindly asked to remove the set_need_resched() call.
>>
>> Maarten has proposed to invert the locking order. I've previously
>> said I had no strong preference. The current locking order dates back
>> to the time when TTM wasn't using unmap_mapping_range() but walked
>> the page tables itself, updating PTEs as needed. Furthermore, it was
>> needed for user bos that used get_user_pages() in the TTM populate
>> and swap-in methods. User bos were removed some time ago, but I'm
>> looking at re-adding them. They would suit the VMware model of
>> cached-only pages very well. I see uses in the Gallium API, XA's DMA
>> functionality and OpenCL.
>>
>> We would then need a somewhat nicer way to invert the locking order.
>> I've attached a solution that ups the mmap_sem and then reserves, but
>> due to how the fault API is done, we then need to release the reserve
>> and retry the fault. This of course opens the door to starvation, but
>> I don't think starvation at this point is very likely: one thread
>> being refused read or write access to a buffer object because the GPU
>> is continuously busy with it. If this *would* become a problem, it's
>> probably possible to modify the fault code to allow us to hold locks
>> until the retried fault, but that would be a bit invasive, since it
>> touches the arch code.
>>
>> Basically I'm proposing to keep the current locking order.
> I'm not sure why we have to worry about mmap_sem lock being taken before 
> bo::reserve. If we already hold mmap_sem,
> no extra locking is needed for get_user_pages.
 Typically, they are populated outside of fault, as part of execbuf, where 
 we don't hold and don't want to hold mmap_sem(). In fact,
 user bo's should not be remappable through the TTM VM system. Anyway, we 
 need to grab the mmap_sem inside ttm_populate for user buffers.
>>> If we don't allow mmapping user bo's through TTM, we can use special 
>>> lockdep annotation when user-bo's are used. Normal bo's would have
>>> mmap_sem outer lock, bo::reserve inner lock, while those bo's would have 
>>> the other way around.
>>>
>>> This might complicate validation a little, since you would have to reserve 
>>> and validate all user-bo's before any normal bo's are reserved. But since 
>>> this
>>> is meant to be a vmwgfx specific optimization I think it might be worth it.
>> Would that work (lockdep-wise) with user BO swapout as part of a normal BO 
>> validation? During user BO swapout, we don't need mmap_sem, but the BO needs 
>> to be reserved. But I guess it would work, since we use tryreserve when 
>> walking the LRU lists?
> Correct.
>
> Releasing it is a bit silly. I think we should keep mmap_sem as the
> outer lock, and have bo::reserve as the inner, even if it might
> complicate support for user-bo's. I'm not sure what you can do with
> user-bo's that can't be done by allocating the same bo from the
> kernel first and mapping + populating it.
>
> ~Maarten
 Using a DMA API analogy, user BOs correspond to using streaming DMA,
 whereas normal BOs correspond to allocated DMA memory buffers.
 We boost performance and save resources.
>>> Yeah but it's vmwgfx specific. Nouveau and radeon have dedicated copy 
>>> engines that can be used. Flushing the vm and stalling is probably more
>>> expensive than performing a memcpy.
>> In the end, I'm not sure it will be vmwgfx specific once there is a
>> working example, and there are user-space APIs that will benefit from
>> it. There are other examples out there today that use streaming DMA
>> to feed the DMA engines, although not through TTM; see for example
>> via_dmablit.c.
>>
>> Also if we need a separate locking order for User BOs, what would be the big 
>> benefit of having the locking order mmap_sem()->bo_reserve() ?
> Deterministic locking. I fear livelocks could happen otherwise with
> the right timing, especially but not exclusively on -rt kernels.
But livelocks wouldn't be an issue anymore since we use a waiting 
reserve in the retry path, right? The only issue we might theoretically 
be 

TTM Locking order of bo::reserve -> vm::mmap_sem

2013-10-21 Thread Maarten Lankhorst
op 21-10-13 12:10, Thomas Hellstrom schreef:
> On 10/21/2013 11:48 AM, Maarten Lankhorst wrote:
>> op 21-10-13 11:37, Thomas Hellstrom schreef:
>>> On 10/21/2013 11:01 AM, Maarten Lankhorst wrote:
 op 21-10-13 10:48, Thomas Hellstrom schreef:
> Hi!
>
> As discussed previously, the current locking order in TTM of these
> locks is bo::reserve -> vm::mmap_sem. This leads to a hack in the TTM
> fault() handler to try to invert the locking order: if a tryreserve
> failed, we tried to have the vm code release the mmap_sem() and then
> schedule, to give the holder of bo::reserve a chance to release the
> lock. This solution is no longer legal, since we've been more or less
> kindly asked to remove the set_need_resched() call.
>
> Maarten has proposed to invert the locking order. I've previously
> said I had no strong preference. The current locking order dates back
> to the time when TTM wasn't using unmap_mapping_range() but walked
> the page tables itself, updating PTEs as needed. Furthermore, it was
> needed for user bos that used get_user_pages() in the TTM populate
> and swap-in methods. User bos were removed some time ago, but I'm
> looking at re-adding them. They would suit the VMware model of
> cached-only pages very well. I see uses in the Gallium API, XA's DMA
> functionality and OpenCL.
>
> We would then need a somewhat nicer way to invert the locking order.
> I've attached a solution that ups the mmap_sem and then reserves, but
> due to how the fault API is done, we then need to release the reserve
> and retry the fault. This of course opens the door to starvation, but
> I don't think starvation at this point is very likely: one thread
> being refused read or write access to a buffer object because the GPU
> is continuously busy with it. If this *would* become a problem, it's
> probably possible to modify the fault code to allow us to hold locks
> until the retried fault, but that would be a bit invasive, since it
> touches the arch code.
>
> Basically I'm proposing to keep the current locking order.
 I'm not sure why we have to worry about mmap_sem lock being taken before 
 bo::reserve. If we already hold mmap_sem,
 no extra locking is needed for get_user_pages.
>>> Typically, they are populated outside of fault, as part of execbuf, where 
>>> we don't hold and don't want to hold mmap_sem(). In fact,
>>> user bo's should not be remappable through the TTM VM system. Anyway, we 
>>> need to grab the mmap_sem inside ttm_populate for user buffers.
>> If we don't allow mmapping user bo's through TTM, we can use special lockdep 
>> annotation when user-bo's are used. Normal bo's would have
>> mmap_sem outer lock, bo::reserve inner lock, while those bo's would have the 
>> other way around.
>>
>> This might complicate validation a little, since you would have to reserve 
>> and validate all user-bo's before any normal bo's are reserved. But since 
>> this
>> is meant to be a vmwgfx specific optimization I think it might be worth it.
>
> Would that work (lockdep-wise) with user BO swapout as part of a normal BO 
> validation? During user BO swapout, we don't need mmap_sem, but the BO needs 
> to be reserved. But I guess it would work, since we use tryreserve when 
> walking the LRU lists?
Correct.

>>
Releasing it is a bit silly. I think we should keep mmap_sem as the
outer lock, and have bo::reserve as the inner, even if it might
complicate support for user-bo's. I'm not sure what you can do with
user-bo's that can't be done by allocating the same bo from the
kernel first and mapping + populating it.

 ~Maarten
>>> Using a DMA API analogy, user BOs correspond to using streaming DMA,
>>> whereas normal BOs correspond to allocated DMA memory buffers.
>>> We boost performance and save resources.
>> Yeah but it's vmwgfx specific. Nouveau and radeon have dedicated copy 
>> engines that can be used. Flushing the vm and stalling is probably more
>> expensive than performing a memcpy.
> In the end, I'm not sure it will be vmwgfx specific once there is a
> working example, and there are user-space APIs that will benefit from
> it. There are other examples out there today that use streaming DMA
> to feed the DMA engines, although not through TTM; see for example
> via_dmablit.c.
>
> Also if we need a separate locking order for User BOs, what would be the big 
> benefit of having the locking order mmap_sem()->bo_reserve() ?
Deterministic locking. I fear livelocks could happen otherwise with
the right timing, especially but not exclusively on -rt kernels.

~Maarten


TTM Locking order of bo::reserve -> vm::mmap_sem

2013-10-21 Thread Thomas Hellstrom
On 10/21/2013 11:48 AM, Maarten Lankhorst wrote:
> op 21-10-13 11:37, Thomas Hellstrom schreef:
>> On 10/21/2013 11:01 AM, Maarten Lankhorst wrote:
>>> op 21-10-13 10:48, Thomas Hellstrom schreef:
 Hi!

 As discussed previously, the current locking order in TTM of these
 locks is bo::reserve -> vm::mmap_sem. This leads to a hack in the TTM
 fault() handler to try to invert the locking order: if a tryreserve
 failed, we tried to have the vm code release the mmap_sem() and then
 schedule, to give the holder of bo::reserve a chance to release the
 lock. This solution is no longer legal, since we've been more or less
 kindly asked to remove the set_need_resched() call.

 Maarten has proposed to invert the locking order. I've previously
 said I had no strong preference. The current locking order dates back
 to the time when TTM wasn't using unmap_mapping_range() but walked
 the page tables itself, updating PTEs as needed. Furthermore, it was
 needed for user bos that used get_user_pages() in the TTM populate
 and swap-in methods. User bos were removed some time ago, but I'm
 looking at re-adding them. They would suit the VMware model of
 cached-only pages very well. I see uses in the Gallium API, XA's DMA
 functionality and OpenCL.

 We would then need a somewhat nicer way to invert the locking order.
 I've attached a solution that ups the mmap_sem and then reserves, but
 due to how the fault API is done, we then need to release the reserve
 and retry the fault. This of course opens the door to starvation, but
 I don't think starvation at this point is very likely: one thread
 being refused read or write access to a buffer object because the GPU
 is continuously busy with it. If this *would* become a problem, it's
 probably possible to modify the fault code to allow us to hold locks
 until the retried fault, but that would be a bit invasive, since it
 touches the arch code.

 Basically I'm proposing to keep the current locking order.
>>> I'm not sure why we have to worry about mmap_sem lock being taken before 
>>> bo::reserve. If we already hold mmap_sem,
>>> no extra locking is needed for get_user_pages.
>> Typically, they are populated outside of fault, as part of execbuf, where we 
>> don't hold and don't want to hold mmap_sem(). In fact,
>> user bo's should not be remappable through the TTM VM system. Anyway, we 
>> need to grab the mmap_sem inside ttm_populate for user buffers.
> If we don't allow mmapping user bo's through TTM, we can use special lockdep 
> annotation when user-bo's are used. Normal bo's would have
> mmap_sem outer lock, bo::reserve inner lock, while those bo's would have the 
> other way around.
>
> This might complicate validation a little, since you would have to reserve 
> and validate all user-bo's before any normal bo's are reserved. But since this
> is meant to be a vmwgfx specific optimization I think it might be worth it.

Would that work (lockdep-wise) with user BO swapout as part of a normal 
BO validation? During user BO swapout, we don't need mmap_sem, but the 
BO needs to be reserved. But I guess it would work, since we use 
tryreserve when walking the LRU lists?
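
A hypothetical sketch of that tryreserve point, using plain mutexes and
made-up types rather than the real TTM structures (the global LRU lock
that normally protects the list walk is omitted): a trylock cannot
block, so it imposes no ordering against locks the caller already
holds, and contended BOs are simply skipped.

#include <linux/list.h>
#include <linux/mutex.h>

struct fake_bo {                        /* made-up stand-in for a BO */
        struct list_head lru;
        struct mutex reserve;           /* stands in for bo::reserve */
};

static struct fake_bo *grab_first_idle_bo(struct list_head *lru_list)
{
        struct fake_bo *bo;

        list_for_each_entry(bo, lru_list, lru) {
                /* Never block here: skip BOs that are currently reserved. */
                if (mutex_trylock(&bo->reserve))
                        return bo;      /* caller swaps it out, then unlocks */
        }
        return NULL;
}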

>
>>> Releasing it is a bit silly. I think we should keep mmap_sem as the
>>> outer lock, and have bo::reserve as the inner, even if it might
>>> complicate support for user-bo's. I'm not sure what you can do with
>>> user-bo's that can't be done by allocating the same bo from the
>>> kernel first and mapping + populating it.
>>>
>>> ~Maarten
>> Using a DMA API analogy, user BOs correspond to using streaming DMA,
>> whereas normal BOs correspond to allocated DMA memory buffers.
>> We boost performance and save resources.
> Yeah but it's vmwgfx specific. Nouveau and radeon have dedicated copy engines 
> that can be used. Flushing the vm and stalling is probably more
> expensive than performing a memcpy.
In the end, I'm not sure it will be vmwgfx specific once there is a
working example, and there are user-space APIs that will benefit from
it. There are other examples out there today that use streaming DMA
to feed the DMA engines, although not through TTM; see for example
via_dmablit.c.

Also if we need a separate locking order for User BOs, what would be the 
big benefit of having the locking order mmap_sem()->bo_reserve() ?

/Thomas


TTM Locking order of bo::reserve -> vm::mmap_sem

2013-10-21 Thread Maarten Lankhorst
op 21-10-13 11:37, Thomas Hellstrom schreef:
> On 10/21/2013 11:01 AM, Maarten Lankhorst wrote:
>> op 21-10-13 10:48, Thomas Hellstrom schreef:
>>> Hi!
>>>
>>> As discussed previously, the current locking order in TTM of these
>>> locks is bo::reserve -> vm::mmap_sem. This leads to a hack in the TTM
>>> fault() handler to try to invert the locking order: if a tryreserve
>>> failed, we tried to have the vm code release the mmap_sem() and then
>>> schedule, to give the holder of bo::reserve a chance to release the
>>> lock. This solution is no longer legal, since we've been more or less
>>> kindly asked to remove the set_need_resched() call.
>>>
>>> Maarten has proposed to invert the locking order. I've previously
>>> said I had no strong preference. The current locking order dates back
>>> to the time when TTM wasn't using unmap_mapping_range() but walked
>>> the page tables itself, updating PTEs as needed. Furthermore, it was
>>> needed for user bos that used get_user_pages() in the TTM populate
>>> and swap-in methods. User bos were removed some time ago, but I'm
>>> looking at re-adding them. They would suit the VMware model of
>>> cached-only pages very well. I see uses in the Gallium API, XA's DMA
>>> functionality and OpenCL.
>>>
>>> We would then need a somewhat nicer way to invert the locking order.
>>> I've attached a solution that ups the mmap_sem and then reserves, but
>>> due to how the fault API is done, we then need to release the reserve
>>> and retry the fault. This of course opens the door to starvation, but
>>> I don't think starvation at this point is very likely: one thread
>>> being refused read or write access to a buffer object because the GPU
>>> is continuously busy with it. If this *would* become a problem, it's
>>> probably possible to modify the fault code to allow us to hold locks
>>> until the retried fault, but that would be a bit invasive, since it
>>> touches the arch code.
>>>
>>> Basically I'm proposing to keep the current locking order.
>> I'm not sure why we have to worry about mmap_sem lock being taken before 
>> bo::reserve. If we already hold mmap_sem,
>> no extra locking is needed for get_user_pages.
>
> Typically, they are populated outside of fault, as part of execbuf, where we 
> don't hold and don't want to hold mmap_sem(). In fact,
> user bo's should not be remappable through the TTM VM system. Anyway, we need 
> to grab the mmap_sem inside ttm_populate for user buffers.
If we don't allow mmapping user bo's through TTM, we can use special lockdep 
annotation when user-bo's are used. Normal bo's would have
mmap_sem outer lock, bo::reserve inner lock, while those bo's would have the 
other way around.

This might complicate validation a little, since you would have to reserve and 
validate all user-bo's before any normal bo's are reserved. But since this
is meant to be a vmwgfx specific optimization I think it might be worth it.
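
A rough sketch of what such a lockdep annotation could look like
(hypothetical code, with a made-up my_bo type and init helper; this is
only an illustration of the idea, not existing TTM code): giving
user-backed BOs their own ww_class gives their reservation locks a
separate lockdep key, so that mmap_sem -> reserve for normal BOs and
reserve -> mmap_sem for user BOs are tracked as two independent
orderings rather than one inversion. The validation-ordering
complication described above is unaffected by this.

#include <linux/ww_mutex.h>

DEFINE_WW_CLASS(bo_reserve_class);              /* normal BOs */
DEFINE_WW_CLASS(user_bo_reserve_class);         /* user-backed BOs */

struct my_bo {                                  /* made-up BO wrapper */
        struct ww_mutex reserve;
        bool user_backed;
};

static void my_bo_init_reserve(struct my_bo *bo, bool user_backed)
{
        bo->user_backed = user_backed;
        /* A separate ww_class gives these locks their own lockdep class. */
        ww_mutex_init(&bo->reserve, user_backed ?
                      &user_bo_reserve_class : &bo_reserve_class);
}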

>> Releasing it is a bit silly. I think we should keep mmap_sem as the
>> outer lock, and have bo::reserve as the inner, even if it might
>> complicate support for user-bo's. I'm not sure what you can do with
>> user-bo's that can't be done by allocating the same bo from the
>> kernel first and mapping + populating it.
>>
>> ~Maarten
> Using a DMA API analogy, user BOs correspond to using streaming DMA,
> whereas normal BOs correspond to allocated DMA memory buffers.
> We boost performance and save resources.
Yeah but it's vmwgfx specific. Nouveau and radeon have dedicated copy engines 
that can be used. Flushing the vm and stalling is probably more
expensive than performing a memcpy.


TTM Locking order of bo::reserve -> vm::mmap_sem

2013-10-21 Thread Thomas Hellstrom
On 10/21/2013 11:01 AM, Maarten Lankhorst wrote:
> op 21-10-13 10:48, Thomas Hellstrom schreef:
>> Hi!
>>
>> As discussed previously, the current locking order in TTM of these
>> locks is bo::reserve -> vm::mmap_sem. This leads to a hack in the TTM
>> fault() handler to try to invert the locking order: if a tryreserve
>> failed, we tried to have the vm code release the mmap_sem() and then
>> schedule, to give the holder of bo::reserve a chance to release the
>> lock. This solution is no longer legal, since we've been more or less
>> kindly asked to remove the set_need_resched() call.
>>
>> Maarten has proposed to invert the locking order. I've previously
>> said I had no strong preference. The current locking order dates back
>> to the time when TTM wasn't using unmap_mapping_range() but walked
>> the page tables itself, updating PTEs as needed. Furthermore, it was
>> needed for user bos that used get_user_pages() in the TTM populate
>> and swap-in methods. User bos were removed some time ago, but I'm
>> looking at re-adding them. They would suit the VMware model of
>> cached-only pages very well. I see uses in the Gallium API, XA's DMA
>> functionality and OpenCL.
>>
>> We would then need a somewhat nicer way to invert the locking order.
>> I've attached a solution that ups the mmap_sem and then reserves, but
>> due to how the fault API is done, we then need to release the reserve
>> and retry the fault. This of course opens the door to starvation, but
>> I don't think starvation at this point is very likely: one thread
>> being refused read or write access to a buffer object because the GPU
>> is continuously busy with it. If this *would* become a problem, it's
>> probably possible to modify the fault code to allow us to hold locks
>> until the retried fault, but that would be a bit invasive, since it
>> touches the arch code.
>>
>> Basically I'm proposing to keep the current locking order.
> I'm not sure why we have to worry about mmap_sem lock being taken before 
> bo::reserve. If we already hold mmap_sem,
> no extra locking is needed for get_user_pages.

Typically, they are populated outside of fault, as part of execbuf, 
where we don't hold and don't want to hold mmap_sem(). In fact,
user bo's should not be remappable through the TTM VM system. Anyway, we 
need to grab the mmap_sem inside ttm_populate for user buffers.
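
To make that last point concrete, a hypothetical sketch of a
user-buffer populate path taking mmap_sem as the inner lock under the
current bo::reserve -> mmap_sem order (the 2013-era eight-argument
get_user_pages() is assumed; unwinding of partially pinned pages is
omitted):

#include <linux/mm.h>
#include <linux/sched.h>

static int user_tt_populate_sketch(unsigned long start,
                                   unsigned long num_pages,
                                   struct page **pages)
{
        struct mm_struct *mm = current->mm;
        long got;

        /* The caller already holds bo::reserve; mmap_sem is taken here
         * as the inner lock, matching the current locking order. */
        down_read(&mm->mmap_sem);
        got = get_user_pages(current, mm, start, num_pages,
                             1 /* write */, 0 /* force */, pages, NULL);
        up_read(&mm->mmap_sem);

        return got == (long)num_pages ? 0 : -EFAULT;
}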

> Releasing it is a bit silly. I think we should keep mmap_sem as the
> outer lock, and have bo::reserve as the inner, even if it might
> complicate support for user-bo's. I'm not sure what you can do with
> user-bo's that can't be done by allocating the same bo from the
> kernel first and mapping + populating it.
>
> ~Maarten
Using a DMA API analogy, user BOs correspond to using streaming DMA,
whereas normal BOs correspond to allocated DMA memory buffers.
We boost performance and save resources.
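
Spelling the analogy out with the plain DMA API (illustration only,
not TTM code): a user BO behaves like a streaming mapping of a buffer
the caller already has, while a normal BO behaves like a dedicated
allocation that the data must first be copied into.

#include <linux/dma-mapping.h>
#include <linux/gfp.h>
#include <linux/string.h>

/* "User BO"-like: stream the existing buffer straight to the device. */
static int send_streaming(struct device *dev, void *buf, size_t size)
{
        dma_addr_t addr = dma_map_single(dev, buf, size, DMA_TO_DEVICE);

        if (dma_mapping_error(dev, addr))
                return -ENOMEM;
        /* ... hand addr to the device and wait for completion ... */
        dma_unmap_single(dev, addr, size, DMA_TO_DEVICE);
        return 0;
}

/* "Normal BO"-like: allocate a dedicated buffer and copy into it first. */
static int send_via_copy(struct device *dev, const void *buf, size_t size)
{
        dma_addr_t addr;
        void *bounce = dma_alloc_coherent(dev, size, &addr, GFP_KERNEL);

        if (!bounce)
                return -ENOMEM;
        memcpy(bounce, buf, size);
        /* ... hand addr to the device and wait for completion ... */
        dma_free_coherent(dev, size, bounce, addr);
        return 0;
}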

Thanks,
/Thomas




TTM Locking order of bo::reserve -> vm::mmap_sem

2013-10-21 Thread Jakob Bornecrantz
On Mon, Oct 21, 2013 at 10:48 AM, Thomas Hellstrom wrote:
> Hi!
>
> As discussed previously, the current locking order in TTM of these
> locks is bo::reserve -> vm::mmap_sem. This leads to a hack in the TTM
> fault() handler to try to invert the locking order: if a tryreserve
> failed, we tried to have the vm code release the mmap_sem() and then
> schedule, to give the holder of bo::reserve a chance to release the
> lock. This solution is no longer legal, since we've been more or less
> kindly asked to remove the set_need_resched() call.
>
> Maarten has proposed to invert the locking order. I've previously
> said I had no strong preference. The current locking order dates back
> to the time when TTM wasn't using unmap_mapping_range() but walked
> the page tables itself, updating PTEs as needed. Furthermore, it was
> needed for user bos that used get_user_pages() in the TTM populate
> and swap-in methods. User bos were removed some time ago, but I'm
> looking at re-adding them. They would suit the VMware model of
> cached-only pages very well. I see uses in the Gallium API, XA's DMA
> functionality and OpenCL.

In particular OpenCL's clEnqueueWriteBuffer[1] function and friends,
which can move quite a bit of data in and out of VRAM.
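
For reference, that call boils down to something like the following
(host-side OpenCL C; queue and dev_buf are assumed to have been
created elsewhere). With user-backed BOs the driver stack could stream
directly out of host_data instead of staging it through an extra copy:

#include <CL/cl.h>

static cl_int upload_to_vram(cl_command_queue queue, cl_mem dev_buf,
                             const void *host_data, size_t size)
{
        /* Blocking host -> device copy from an ordinary host pointer. */
        return clEnqueueWriteBuffer(queue, dev_buf, CL_TRUE /* blocking */,
                                    0 /* offset */, size, host_data,
                                    0, NULL, NULL);
}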

>
> We would then need a somewhat nicer way to invert the locking order.
> I've attached a solution that ups the mmap_sem and then reserves, but
> due to how the fault API is done, we then need to release the reserve
> and retry the fault. This of course opens the door to starvation, but
> I don't think starvation at this point is very likely: one thread
> being refused read or write access to a buffer object because the GPU
> is continuously busy with it. If this *would* become a problem, it's
> probably possible to modify the fault code to allow us to hold locks
> until the retried fault, but that would be a bit invasive, since it
> touches the arch code.
>
> Basically I'm proposing to keep the current locking order.

Cheers, Jakob.

[1] 
http://www.khronos.org/registry/cl/sdk/1.0/docs/man/xhtml/clEnqueueWriteBuffer.html


TTM Locking order of bo::reserve -> vm::mmap_sem

2013-10-21 Thread Maarten Lankhorst
op 21-10-13 10:48, Thomas Hellstrom schreef:
> Hi!
>
> As discussed previously, the current locking order in TTM of these
> locks is bo::reserve -> vm::mmap_sem. This leads to a hack in the TTM
> fault() handler to try to invert the locking order: if a tryreserve
> failed, we tried to have the vm code release the mmap_sem() and then
> schedule, to give the holder of bo::reserve a chance to release the
> lock. This solution is no longer legal, since we've been more or less
> kindly asked to remove the set_need_resched() call.
>
> Maarten has proposed to invert the locking order. I've previously
> said I had no strong preference. The current locking order dates back
> to the time when TTM wasn't using unmap_mapping_range() but walked
> the page tables itself, updating PTEs as needed. Furthermore, it was
> needed for user bos that used get_user_pages() in the TTM populate
> and swap-in methods. User bos were removed some time ago, but I'm
> looking at re-adding them. They would suit the VMware model of
> cached-only pages very well. I see uses in the Gallium API, XA's DMA
> functionality and OpenCL.
>
> We would then need a somewhat nicer way to invert the locking order.
> I've attached a solution that ups the mmap_sem and then reserves, but
> due to how the fault API is done, we then need to release the reserve
> and retry the fault. This of course opens the door to starvation, but
> I don't think starvation at this point is very likely: one thread
> being refused read or write access to a buffer object because the GPU
> is continuously busy with it. If this *would* become a problem, it's
> probably possible to modify the fault code to allow us to hold locks
> until the retried fault, but that would be a bit invasive, since it
> touches the arch code.
>
> Basically I'm proposing to keep the current locking order.

I'm not sure why we have to worry about mmap_sem lock being taken before 
bo::reserve. If we already hold mmap_sem,
no extra locking is needed for get_user_pages. Releasing it is a bit
silly. I think we should keep mmap_sem as the outer lock, and have
bo::reserve as the inner, even if it might complicate support for
user-bo's. I'm not sure what you can do with user-bo's that can't be
done by allocating the same bo from the kernel first and mapping +
populating it.

~Maarten



TTM Locking order of bo::reserve -> vm::mmap_sem

2013-10-21 Thread Thomas Hellstrom
Hi!

As discussed previously, the current locking order in TTM of these
locks is bo::reserve -> vm::mmap_sem. This leads to a hack in the TTM
fault() handler to try to invert the locking order: if a tryreserve
failed, we tried to have the vm code release the mmap_sem() and then
schedule, to give the holder of bo::reserve a chance to release the
lock. This solution is no longer legal, since we've been more or less
kindly asked to remove the set_need_resched() call.

Maarten has proposed to invert the locking order. I've previously
said I had no strong preference. The current locking order dates back
to the time when TTM wasn't using unmap_mapping_range() but walked
the page tables itself, updating PTEs as needed. Furthermore, it was
needed for user bos that used get_user_pages() in the TTM populate
and swap-in methods. User bos were removed some time ago, but I'm
looking at re-adding them. They would suit the VMware model of
cached-only pages very well. I see uses in the Gallium API, XA's DMA
functionality and OpenCL.

We would then need a somewhat nicer way to invert the locking order.
I've attached a solution that ups the mmap_sem and then reserves, but
due to how the fault API is done, we then need to release the reserve
and retry the fault. This of course opens the door to starvation, but
I don't think starvation at this point is very likely: one thread
being refused read or write access to a buffer object because the GPU
is continuously busy with it. If this *would* become a problem, it's
probably possible to modify the fault code to allow us to hold locks
until the retried fault, but that would be a bit invasive, since it
touches the arch code.

Basically I'm proposing to keep the current locking order.
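
A minimal sketch of the retry approach described above (this is not
the attached vm_lock.diff). The reserve helpers my_bo_tryreserve(),
my_bo_reserve() and my_bo_unreserve() are hypothetical stand-ins for
the TTM reservation calls, and the fault handler uses the (vma, vmf)
prototype of this era:

#include <linux/mm.h>

struct my_bo;                                   /* hypothetical BO type */
bool my_bo_tryreserve(struct my_bo *bo);        /* hypothetical helpers */
void my_bo_reserve(struct my_bo *bo);           /* blocking reserve */
void my_bo_unreserve(struct my_bo *bo);

static int my_bo_vm_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
{
        struct my_bo *bo = vma->vm_private_data;

        /* Fast path: mmap_sem is already held, so only a trylock keeps
         * the established bo::reserve -> mmap_sem order intact. */
        if (likely(my_bo_tryreserve(bo)))
                goto reserved;

        /* Contended: drop mmap_sem, wait for the reservation, release it
         * again and ask the VM to retry the fault. The retried fault
         * re-takes mmap_sem and will normally win the trylock above. */
        if ((vmf->flags & FAULT_FLAG_ALLOW_RETRY) &&
            !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT)) {
                up_read(&vma->vm_mm->mmap_sem);
                my_bo_reserve(bo);
                my_bo_unreserve(bo);
                return VM_FAULT_RETRY;
        }

        /* Simplified: callers that can't retry just take the fault again. */
        return VM_FAULT_NOPAGE;

reserved:
        /* ... insert PFNs for the faulted range here ... */
        my_bo_unreserve(bo);
        return VM_FAULT_NOPAGE;
}

The starvation window mentioned above is the gap between
my_bo_unreserve() and the retried fault's trylock.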


/Thomas


-- next part --
A non-text attachment was scrubbed...
Name: vm_lock.diff
Type: text/x-patch
Size: 1050 bytes
Desc: not available
URL: