Re: [PATCH v3 3/4] drm/ttm: convert to unified vma offset manager
Sorry, I forgot to CC correctly. On Mon, Jul 22, 2013 at 12:53 PM, David Herrmann dh.herrm...@gmail.com wrote: Hi On Fri, Jul 19, 2013 at 11:13 AM, Thomas Hellstrom thellst...@vmware.com wrote: On 07/18/2013 10:54 PM, David Herrmann wrote: Hi On Thu, Jul 18, 2013 at 1:24 PM, Thomas Hellstrom thellst...@vmware.com wrote: ... I think that if there are good reasons to keep locking internal, I'm fine with that, (And also, of course, with Daniel's proposal). Currently the add / remove / lookup paths are mostly used by TTM during object creation and destruction. However, if the lookup path is ever used by pread / pwrite, that situation might change and we would then like to minimize the locking. I tried to keep the change as minimal as I could. Follow-up patches are welcome. I just thought pushing the lock into drm_vma_* would simplify things. If there are benchmarks that prove me wrong, I'll gladly spend some time optimizing that. In the general case, one reason for designing the locking outside of a utilities like this, is that different callers may have different requirements. For example, the call path is known not to be multithreaded at all, or the caller may prefer a mutex over a spinlock for various reasons. It might also be that some callers will want to use RCU locking in the future if the lookup path becomes busy, and that would require *all* users to adapt to RCU object destruction... I haven't looked at the code closely enough to say that any of this applies in this particular case, though. Some notes why I went with local locking: - mmap offset creation is done once and is independent of any other operations you might do on the BO in parallel - mmap lookup is also only done once in most cases. User-space rarely maps a buffer twice (setup/teardown of mmaps is expensive, but keeping them around is very cheap). And for shared buffers only the writer maps it as the reader normally passes it to the kernel without mapping/touching it. Only for software-rendering we have two or more mappings of the same object. - creating/mapping/destroying buffer objects is expensive in its current form and buffers tend to stay around for a long time I couldn't find a situation were the vma-manager was part of a performance critical path. But on the other hand, the paths were it is used are quite heavy. That's why I don't want to lock the whole buffer or even device. Instead, we need some management lock which is used for mmap-setup or similar things that don't affect other operations on the buffer or device. We don't have such a lock, so I introduced local locking. If we end up with more use-cases, I volunteer to refactor this. But I am no big fan of overgeneralizing it before more uses are known. Anyway, I will resend with vm_lock removed (+_unlocked() helpers) so we keep the status-quo regarding locks. Thanks for the comments on TTM buffer transfer. Thanks David ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v3 3/4] drm/ttm: convert to unified vma offset manager
On 07/22/2013 12:55 PM, David Herrmann wrote: Sorry, I forgot to CC correctly. On Mon, Jul 22, 2013 at 12:53 PM, David Herrmann dh.herrm...@gmail.com wrote: Hi On Fri, Jul 19, 2013 at 11:13 AM, Thomas Hellstrom thellst...@vmware.com wrote: On 07/18/2013 10:54 PM, David Herrmann wrote: Hi On Thu, Jul 18, 2013 at 1:24 PM, Thomas Hellstrom thellst...@vmware.com wrote: ... I think that if there are good reasons to keep locking internal, I'm fine with that, (And also, of course, with Daniel's proposal). Currently the add / remove / lookup paths are mostly used by TTM during object creation and destruction. However, if the lookup path is ever used by pread / pwrite, that situation might change and we would then like to minimize the locking. I tried to keep the change as minimal as I could. Follow-up patches are welcome. I just thought pushing the lock into drm_vma_* would simplify things. If there are benchmarks that prove me wrong, I'll gladly spend some time optimizing that. In the general case, one reason for designing the locking outside of a utilities like this, is that different callers may have different requirements. For example, the call path is known not to be multithreaded at all, or the caller may prefer a mutex over a spinlock for various reasons. It might also be that some callers will want to use RCU locking in the future if the lookup path becomes busy, and that would require *all* users to adapt to RCU object destruction... I haven't looked at the code closely enough to say that any of this applies in this particular case, though. Some notes why I went with local locking: - mmap offset creation is done once and is independent of any other operations you might do on the BO in parallel - mmap lookup is also only done once in most cases. User-space rarely maps a buffer twice (setup/teardown of mmaps is expensive, but keeping them around is very cheap). And for shared buffers only the writer maps it as the reader normally passes it to the kernel without mapping/touching it. Only for software-rendering we have two or more mappings of the same object. - creating/mapping/destroying buffer objects is expensive in its current form and buffers tend to stay around for a long time I couldn't find a situation were the vma-manager was part of a performance critical path. But on the other hand, the paths were it is used are quite heavy. That's why I don't want to lock the whole buffer or even device. Instead, we need some management lock which is used for mmap-setup or similar things that don't affect other operations on the buffer or device. We don't have such a lock, so I introduced local locking. If we end up with more use-cases, I volunteer to refactor this. But I am no big fan of overgeneralizing it before more uses are known. I think we are discussing two different things here: 1) Having a separate lock for vma management vs 2) building that lock into the vma manager utility. You're arguing for 1) and I completely agree with you, and although I still think one generally should avoid building locks into utilities like this (avoid 2), I'm fine with the current approach. Thanks, Thomas ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v3 3/4] drm/ttm: convert to unified vma offset manager
Op 18-07-13 22:54, David Herrmann schreef: Hi On Thu, Jul 18, 2013 at 1:24 PM, Thomas Hellstrom thellst...@vmware.com wrote: On 07/18/2013 01:07 PM, David Herrmann wrote: Hi On Thu, Jul 18, 2013 at 10:53 AM, Thomas Hellstrom thellst...@vmware.com wrote: A quick look, but not a full review: Looks mostly good, but it looks like the TTM vm lock isn't needed at all anymore (provided the vma offset manager is properly protected), since kref_get_unless_zero() is used when a reference after lookup is taken. (please see the kref_get_unless_zero documentation examples). When drm_vma_offset_remove() is called, the kref value is unconditionally zero, and that should block any successful lookup. If we drop vm_lock without modifying TTM, this will not work. Even kref_get_unless_zero() needs some guarantee that the object is still valid. Assume this scenario: drm_vma_offset_lookup() returns a valid node, however, before we call kref_get_*(), another thread destroys the last reference to the bo so it gets kfree()d. kref_get_unless_zero() will thus reference memory which can be _anything_ and is not guarantee to stay 0. (Documentation/kref.txt mentions this, too, and I guess it was you who wrote that). I cannot see any locking around the mmap call that could guarantee this except vm_lock. You're right. My apologies. Parental leave has taken its toll. Otherwise, if the vma offset manager always needs external locking to make lookup + referencing atomic, then perhaps locking should be completely left to the caller? I would actually prefer to keep it in the VMA manager. It allows some fast-paths which otherwise would need special semantics for the caller (for instance see the access-management patches which are based on this patchset or the reduced locking during setup/release). We'd also require the caller to use rwsems for good performance, which is not the case for gem. So how about Daniel's proposal to add an _unlocked() version and provide _lock_lookup() and _unlock_lookup() helpers which just wrap the read_lock() and read_unlock? Thanks! David I think that if there are good reasons to keep locking internal, I'm fine with that, (And also, of course, with Daniel's proposal). Currently the add / remove / lookup paths are mostly used by TTM during object creation and destruction. However, if the lookup path is ever used by pread / pwrite, that situation might change and we would then like to minimize the locking. I tried to keep the change as minimal as I could. Follow-up patches are welcome. I just thought pushing the lock into drm_vma_* would simplify things. If there are benchmarks that prove me wrong, I'll gladly spend some time optimizing that. Apart from this, I'd like to see someone ack the ttm_buffer_object_transfer() changes. I am not very confident with that. Everything else should be trivial. The transfer object only exists to kill off the memory backing during asynchronous eviction, by using the delayed destroyed mechanism. The re-initialization there looks correct to me. ~Maarten ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v3 3/4] drm/ttm: convert to unified vma offset manager
On 07/18/2013 10:54 PM, David Herrmann wrote: Hi On Thu, Jul 18, 2013 at 1:24 PM, Thomas Hellstrom thellst...@vmware.com wrote: ... I think that if there are good reasons to keep locking internal, I'm fine with that, (And also, of course, with Daniel's proposal). Currently the add / remove / lookup paths are mostly used by TTM during object creation and destruction. However, if the lookup path is ever used by pread / pwrite, that situation might change and we would then like to minimize the locking. I tried to keep the change as minimal as I could. Follow-up patches are welcome. I just thought pushing the lock into drm_vma_* would simplify things. If there are benchmarks that prove me wrong, I'll gladly spend some time optimizing that. In the general case, one reason for designing the locking outside of a utilities like this, is that different callers may have different requirements. For example, the call path is known not to be multithreaded at all, or the caller may prefer a mutex over a spinlock for various reasons. It might also be that some callers will want to use RCU locking in the future if the lookup path becomes busy, and that would require *all* users to adapt to RCU object destruction... I haven't looked at the code closely enough to say that any of this applies in this particular case, though. Thanks, Thomas Thanks David ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v3 3/4] drm/ttm: convert to unified vma offset manager
A quick look, but not a full review: Looks mostly good, but it looks like the TTM vm lock isn't needed at all anymore (provided the vma offset manager is properly protected), since kref_get_unless_zero() is used when a reference after lookup is taken. (please see the kref_get_unless_zero documentation examples). When drm_vma_offset_remove() is called, the kref value is unconditionally zero, and that should block any successful lookup. Otherwise, if the vma offset manager always needs external locking to make lookup + referencing atomic, then perhaps locking should be completely left to the caller? Thanks, Thomas On 07/17/2013 08:14 PM, David Herrmann wrote: Use the new vma-manager infrastructure. This doesn't change any implementation details as the vma-offset-manager is nearly copied 1-to-1 from TTM. Even though the vma-manager uses its own locks, we still need bo-vm_lock to prevent bos from being destroyed before we can get a reference during lookup. However, this lock is not needed during vm-setup as we still hold a reference there. This also drops the addr_space_offset member as it is a copy of vm_start in vma_node objects. Use the accessor functions instead. Cc: Dave Airlie airl...@redhat.com Cc: Ben Skeggs bske...@redhat.com Cc: Maarten Lankhorst maarten.lankho...@canonical.com Cc: Martin Peres martin.pe...@labri.fr Cc: Alex Deucher alexander.deuc...@amd.com Cc: Thomas Hellstrom thellst...@vmware.com Signed-off-by: David Herrmann dh.herrm...@gmail.com --- drivers/gpu/drm/ast/ast_main.c| 2 +- drivers/gpu/drm/cirrus/cirrus_main.c | 2 +- drivers/gpu/drm/mgag200/mgag200_main.c| 2 +- drivers/gpu/drm/nouveau/nouveau_display.c | 2 +- drivers/gpu/drm/nouveau/nouveau_gem.c | 2 +- drivers/gpu/drm/qxl/qxl_object.h | 2 +- drivers/gpu/drm/qxl/qxl_release.c | 2 +- drivers/gpu/drm/radeon/radeon_object.h| 5 +- drivers/gpu/drm/ttm/ttm_bo.c | 84 ++- drivers/gpu/drm/ttm/ttm_bo_util.c | 3 +- drivers/gpu/drm/ttm/ttm_bo_vm.c | 81 - drivers/gpu/drm/vmwgfx/vmwgfx_resource.c | 4 +- include/drm/ttm/ttm_bo_api.h | 15 ++ include/drm/ttm/ttm_bo_driver.h | 7 +-- 14 files changed, 65 insertions(+), 148 deletions(-) diff --git a/drivers/gpu/drm/ast/ast_main.c b/drivers/gpu/drm/ast/ast_main.c index f60fd7b..c195dc2 100644 --- a/drivers/gpu/drm/ast/ast_main.c +++ b/drivers/gpu/drm/ast/ast_main.c @@ -487,7 +487,7 @@ void ast_gem_free_object(struct drm_gem_object *obj) static inline u64 ast_bo_mmap_offset(struct ast_bo *bo) { - return bo-bo.addr_space_offset; + return drm_vma_node_offset_addr(bo-bo.vma_node); } int ast_dumb_mmap_offset(struct drm_file *file, diff --git a/drivers/gpu/drm/cirrus/cirrus_main.c b/drivers/gpu/drm/cirrus/cirrus_main.c index 35cbae8..3a7a0ef 100644 --- a/drivers/gpu/drm/cirrus/cirrus_main.c +++ b/drivers/gpu/drm/cirrus/cirrus_main.c @@ -294,7 +294,7 @@ void cirrus_gem_free_object(struct drm_gem_object *obj) static inline u64 cirrus_bo_mmap_offset(struct cirrus_bo *bo) { - return bo-bo.addr_space_offset; + return drm_vma_node_offset_addr(bo-bo.vma_node); } int diff --git a/drivers/gpu/drm/mgag200/mgag200_main.c b/drivers/gpu/drm/mgag200/mgag200_main.c index 9fa5685..1a75ea3 100644 --- a/drivers/gpu/drm/mgag200/mgag200_main.c +++ b/drivers/gpu/drm/mgag200/mgag200_main.c @@ -349,7 +349,7 @@ void mgag200_gem_free_object(struct drm_gem_object *obj) static inline u64 mgag200_bo_mmap_offset(struct mgag200_bo *bo) { - return bo-bo.addr_space_offset; + return drm_vma_node_offset_addr(bo-bo.vma_node); } int diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c b/drivers/gpu/drm/nouveau/nouveau_display.c index 708b2d1..7a8caa1 100644 --- a/drivers/gpu/drm/nouveau/nouveau_display.c +++ b/drivers/gpu/drm/nouveau/nouveau_display.c @@ -697,7 +697,7 @@ nouveau_display_dumb_map_offset(struct drm_file *file_priv, gem = drm_gem_object_lookup(dev, file_priv, handle); if (gem) { struct nouveau_bo *bo = gem-driver_private; - *poffset = bo-bo.addr_space_offset; + *poffset = drm_vma_node_offset_addr(bo-bo.vma_node); drm_gem_object_unreference_unlocked(gem); return 0; } diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c index e72d09c..86597eb 100644 --- a/drivers/gpu/drm/nouveau/nouveau_gem.c +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c @@ -226,7 +226,7 @@ nouveau_gem_info(struct drm_file *file_priv, struct drm_gem_object *gem, } rep-size = nvbo-bo.mem.num_pages PAGE_SHIFT; - rep-map_handle = nvbo-bo.addr_space_offset; + rep-map_handle = drm_vma_node_offset_addr(nvbo-bo.vma_node); rep-tile_mode = nvbo-tile_mode; rep-tile_flags = nvbo-tile_flags;
Re: [PATCH v3 3/4] drm/ttm: convert to unified vma offset manager
On Thu, Jul 18, 2013 at 10:53 AM, Thomas Hellstrom thellst...@vmware.com wrote: A quick look, but not a full review: Looks mostly good, but it looks like the TTM vm lock isn't needed at all anymore (provided the vma offset manager is properly protected), since kref_get_unless_zero() is used when a reference after lookup is taken. (please see the kref_get_unless_zero documentation examples). When drm_vma_offset_remove() is called, the kref value is unconditionally zero, and that should block any successful lookup. Otherwise, if the vma offset manager always needs external locking to make lookup + referencing atomic, then perhaps locking should be completely left to the caller? Somehow I've thought plain gem drivers had this fixed, but looks like I've never gotten around to it. So we need to rework things anyway. What about a drm_vma_offset_lookup_unlocked which just checks that the caller is holding the offset manager's lock? That way everyone can follow up with the get_unless_zero dance correctly. And ttm could drop it's own vm lock. I've considered whether we should just move the vma node into struct drm_gem_object and let the offset manager do the dance, but that's much more invasive. And I'm not sure it's the right thing to do yet. So I think we should consider this only as a follow-up series. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v3 3/4] drm/ttm: convert to unified vma offset manager
Hi On Thu, Jul 18, 2013 at 10:53 AM, Thomas Hellstrom thellst...@vmware.com wrote: A quick look, but not a full review: Looks mostly good, but it looks like the TTM vm lock isn't needed at all anymore (provided the vma offset manager is properly protected), since kref_get_unless_zero() is used when a reference after lookup is taken. (please see the kref_get_unless_zero documentation examples). When drm_vma_offset_remove() is called, the kref value is unconditionally zero, and that should block any successful lookup. If we drop vm_lock without modifying TTM, this will not work. Even kref_get_unless_zero() needs some guarantee that the object is still valid. Assume this scenario: drm_vma_offset_lookup() returns a valid node, however, before we call kref_get_*(), another thread destroys the last reference to the bo so it gets kfree()d. kref_get_unless_zero() will thus reference memory which can be _anything_ and is not guarantee to stay 0. (Documentation/kref.txt mentions this, too, and I guess it was you who wrote that). I cannot see any locking around the mmap call that could guarantee this except vm_lock. Otherwise, if the vma offset manager always needs external locking to make lookup + referencing atomic, then perhaps locking should be completely left to the caller? I would actually prefer to keep it in the VMA manager. It allows some fast-paths which otherwise would need special semantics for the caller (for instance see the access-management patches which are based on this patchset or the reduced locking during setup/release). We'd also require the caller to use rwsems for good performance, which is not the case for gem. So how about Daniel's proposal to add an _unlocked() version and provide _lock_lookup() and _unlock_lookup() helpers which just wrap the read_lock() and read_unlock? Thanks! David Thanks, Thomas On 07/17/2013 08:14 PM, David Herrmann wrote: Use the new vma-manager infrastructure. This doesn't change any implementation details as the vma-offset-manager is nearly copied 1-to-1 from TTM. Even though the vma-manager uses its own locks, we still need bo-vm_lock to prevent bos from being destroyed before we can get a reference during lookup. However, this lock is not needed during vm-setup as we still hold a reference there. This also drops the addr_space_offset member as it is a copy of vm_start in vma_node objects. Use the accessor functions instead. Cc: Dave Airlie airl...@redhat.com Cc: Ben Skeggs bske...@redhat.com Cc: Maarten Lankhorst maarten.lankho...@canonical.com Cc: Martin Peres martin.pe...@labri.fr Cc: Alex Deucher alexander.deuc...@amd.com Cc: Thomas Hellstrom thellst...@vmware.com Signed-off-by: David Herrmann dh.herrm...@gmail.com --- drivers/gpu/drm/ast/ast_main.c| 2 +- drivers/gpu/drm/cirrus/cirrus_main.c | 2 +- drivers/gpu/drm/mgag200/mgag200_main.c| 2 +- drivers/gpu/drm/nouveau/nouveau_display.c | 2 +- drivers/gpu/drm/nouveau/nouveau_gem.c | 2 +- drivers/gpu/drm/qxl/qxl_object.h | 2 +- drivers/gpu/drm/qxl/qxl_release.c | 2 +- drivers/gpu/drm/radeon/radeon_object.h| 5 +- drivers/gpu/drm/ttm/ttm_bo.c | 84 ++- drivers/gpu/drm/ttm/ttm_bo_util.c | 3 +- drivers/gpu/drm/ttm/ttm_bo_vm.c | 81 - drivers/gpu/drm/vmwgfx/vmwgfx_resource.c | 4 +- include/drm/ttm/ttm_bo_api.h | 15 ++ include/drm/ttm/ttm_bo_driver.h | 7 +-- 14 files changed, 65 insertions(+), 148 deletions(-) diff --git a/drivers/gpu/drm/ast/ast_main.c b/drivers/gpu/drm/ast/ast_main.c index f60fd7b..c195dc2 100644 --- a/drivers/gpu/drm/ast/ast_main.c +++ b/drivers/gpu/drm/ast/ast_main.c @@ -487,7 +487,7 @@ void ast_gem_free_object(struct drm_gem_object *obj) static inline u64 ast_bo_mmap_offset(struct ast_bo *bo) { - return bo-bo.addr_space_offset; + return drm_vma_node_offset_addr(bo-bo.vma_node); } int ast_dumb_mmap_offset(struct drm_file *file, diff --git a/drivers/gpu/drm/cirrus/cirrus_main.c b/drivers/gpu/drm/cirrus/cirrus_main.c index 35cbae8..3a7a0ef 100644 --- a/drivers/gpu/drm/cirrus/cirrus_main.c +++ b/drivers/gpu/drm/cirrus/cirrus_main.c @@ -294,7 +294,7 @@ void cirrus_gem_free_object(struct drm_gem_object *obj) static inline u64 cirrus_bo_mmap_offset(struct cirrus_bo *bo) { - return bo-bo.addr_space_offset; + return drm_vma_node_offset_addr(bo-bo.vma_node); } int diff --git a/drivers/gpu/drm/mgag200/mgag200_main.c b/drivers/gpu/drm/mgag200/mgag200_main.c index 9fa5685..1a75ea3 100644 --- a/drivers/gpu/drm/mgag200/mgag200_main.c +++ b/drivers/gpu/drm/mgag200/mgag200_main.c @@ -349,7 +349,7 @@ void mgag200_gem_free_object(struct drm_gem_object *obj) static inline u64 mgag200_bo_mmap_offset(struct mgag200_bo *bo) { - return bo-bo.addr_space_offset; +
Re: [PATCH v3 3/4] drm/ttm: convert to unified vma offset manager
On 07/18/2013 01:07 PM, David Herrmann wrote: Hi On Thu, Jul 18, 2013 at 10:53 AM, Thomas Hellstrom thellst...@vmware.com wrote: A quick look, but not a full review: Looks mostly good, but it looks like the TTM vm lock isn't needed at all anymore (provided the vma offset manager is properly protected), since kref_get_unless_zero() is used when a reference after lookup is taken. (please see the kref_get_unless_zero documentation examples). When drm_vma_offset_remove() is called, the kref value is unconditionally zero, and that should block any successful lookup. If we drop vm_lock without modifying TTM, this will not work. Even kref_get_unless_zero() needs some guarantee that the object is still valid. Assume this scenario: drm_vma_offset_lookup() returns a valid node, however, before we call kref_get_*(), another thread destroys the last reference to the bo so it gets kfree()d. kref_get_unless_zero() will thus reference memory which can be _anything_ and is not guarantee to stay 0. (Documentation/kref.txt mentions this, too, and I guess it was you who wrote that). I cannot see any locking around the mmap call that could guarantee this except vm_lock. You're right. My apologies. Parental leave has taken its toll. Otherwise, if the vma offset manager always needs external locking to make lookup + referencing atomic, then perhaps locking should be completely left to the caller? I would actually prefer to keep it in the VMA manager. It allows some fast-paths which otherwise would need special semantics for the caller (for instance see the access-management patches which are based on this patchset or the reduced locking during setup/release). We'd also require the caller to use rwsems for good performance, which is not the case for gem. So how about Daniel's proposal to add an _unlocked() version and provide _lock_lookup() and _unlock_lookup() helpers which just wrap the read_lock() and read_unlock? Thanks! David I think that if there are good reasons to keep locking internal, I'm fine with that, (And also, of course, with Daniel's proposal). Currently the add / remove / lookup paths are mostly used by TTM during object creation and destruction. However, if the lookup path is ever used by pread / pwrite, that situation might change and we would then like to minimize the locking. Thanks, Thomas Thanks, Thomas On 07/17/2013 08:14 PM, David Herrmann wrote: Use the new vma-manager infrastructure. This doesn't change any implementation details as the vma-offset-manager is nearly copied 1-to-1 from TTM. Even though the vma-manager uses its own locks, we still need bo-vm_lock to prevent bos from being destroyed before we can get a reference during lookup. However, this lock is not needed during vm-setup as we still hold a reference there. This also drops the addr_space_offset member as it is a copy of vm_start in vma_node objects. Use the accessor functions instead. Cc: Dave Airlie airl...@redhat.com Cc: Ben Skeggs bske...@redhat.com Cc: Maarten Lankhorst maarten.lankho...@canonical.com Cc: Martin Peres martin.pe...@labri.fr Cc: Alex Deucher alexander.deuc...@amd.com Cc: Thomas Hellstrom thellst...@vmware.com Signed-off-by: David Herrmann dh.herrm...@gmail.com --- drivers/gpu/drm/ast/ast_main.c| 2 +- drivers/gpu/drm/cirrus/cirrus_main.c | 2 +- drivers/gpu/drm/mgag200/mgag200_main.c| 2 +- drivers/gpu/drm/nouveau/nouveau_display.c | 2 +- drivers/gpu/drm/nouveau/nouveau_gem.c | 2 +- drivers/gpu/drm/qxl/qxl_object.h | 2 +- drivers/gpu/drm/qxl/qxl_release.c | 2 +- drivers/gpu/drm/radeon/radeon_object.h| 5 +- drivers/gpu/drm/ttm/ttm_bo.c | 84 ++- drivers/gpu/drm/ttm/ttm_bo_util.c | 3 +- drivers/gpu/drm/ttm/ttm_bo_vm.c | 81 - drivers/gpu/drm/vmwgfx/vmwgfx_resource.c | 4 +- include/drm/ttm/ttm_bo_api.h | 15 ++ include/drm/ttm/ttm_bo_driver.h | 7 +-- 14 files changed, 65 insertions(+), 148 deletions(-) diff --git a/drivers/gpu/drm/ast/ast_main.c b/drivers/gpu/drm/ast/ast_main.c index f60fd7b..c195dc2 100644 --- a/drivers/gpu/drm/ast/ast_main.c +++ b/drivers/gpu/drm/ast/ast_main.c @@ -487,7 +487,7 @@ void ast_gem_free_object(struct drm_gem_object *obj) static inline u64 ast_bo_mmap_offset(struct ast_bo *bo) { - return bo-bo.addr_space_offset; + return drm_vma_node_offset_addr(bo-bo.vma_node); } int ast_dumb_mmap_offset(struct drm_file *file, diff --git a/drivers/gpu/drm/cirrus/cirrus_main.c b/drivers/gpu/drm/cirrus/cirrus_main.c index 35cbae8..3a7a0ef 100644 --- a/drivers/gpu/drm/cirrus/cirrus_main.c +++ b/drivers/gpu/drm/cirrus/cirrus_main.c @@ -294,7 +294,7 @@ void cirrus_gem_free_object(struct drm_gem_object *obj) static inline u64 cirrus_bo_mmap_offset(struct cirrus_bo *bo) { - return bo-bo.addr_space_offset; + return
Re: [PATCH v3 3/4] drm/ttm: convert to unified vma offset manager
Hi On Thu, Jul 18, 2013 at 1:24 PM, Thomas Hellstrom thellst...@vmware.com wrote: On 07/18/2013 01:07 PM, David Herrmann wrote: Hi On Thu, Jul 18, 2013 at 10:53 AM, Thomas Hellstrom thellst...@vmware.com wrote: A quick look, but not a full review: Looks mostly good, but it looks like the TTM vm lock isn't needed at all anymore (provided the vma offset manager is properly protected), since kref_get_unless_zero() is used when a reference after lookup is taken. (please see the kref_get_unless_zero documentation examples). When drm_vma_offset_remove() is called, the kref value is unconditionally zero, and that should block any successful lookup. If we drop vm_lock without modifying TTM, this will not work. Even kref_get_unless_zero() needs some guarantee that the object is still valid. Assume this scenario: drm_vma_offset_lookup() returns a valid node, however, before we call kref_get_*(), another thread destroys the last reference to the bo so it gets kfree()d. kref_get_unless_zero() will thus reference memory which can be _anything_ and is not guarantee to stay 0. (Documentation/kref.txt mentions this, too, and I guess it was you who wrote that). I cannot see any locking around the mmap call that could guarantee this except vm_lock. You're right. My apologies. Parental leave has taken its toll. Otherwise, if the vma offset manager always needs external locking to make lookup + referencing atomic, then perhaps locking should be completely left to the caller? I would actually prefer to keep it in the VMA manager. It allows some fast-paths which otherwise would need special semantics for the caller (for instance see the access-management patches which are based on this patchset or the reduced locking during setup/release). We'd also require the caller to use rwsems for good performance, which is not the case for gem. So how about Daniel's proposal to add an _unlocked() version and provide _lock_lookup() and _unlock_lookup() helpers which just wrap the read_lock() and read_unlock? Thanks! David I think that if there are good reasons to keep locking internal, I'm fine with that, (And also, of course, with Daniel's proposal). Currently the add / remove / lookup paths are mostly used by TTM during object creation and destruction. However, if the lookup path is ever used by pread / pwrite, that situation might change and we would then like to minimize the locking. I tried to keep the change as minimal as I could. Follow-up patches are welcome. I just thought pushing the lock into drm_vma_* would simplify things. If there are benchmarks that prove me wrong, I'll gladly spend some time optimizing that. Apart from this, I'd like to see someone ack the ttm_buffer_object_transfer() changes. I am not very confident with that. Everything else should be trivial. Thanks David ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v3 3/4] drm/ttm: convert to unified vma offset manager
On Thu, Jul 18, 2013 at 4:54 PM, David Herrmann dh.herrm...@gmail.com wrote: Hi On Thu, Jul 18, 2013 at 1:24 PM, Thomas Hellstrom thellst...@vmware.com wrote: On 07/18/2013 01:07 PM, David Herrmann wrote: Hi On Thu, Jul 18, 2013 at 10:53 AM, Thomas Hellstrom thellst...@vmware.com wrote: A quick look, but not a full review: Looks mostly good, but it looks like the TTM vm lock isn't needed at all anymore (provided the vma offset manager is properly protected), since kref_get_unless_zero() is used when a reference after lookup is taken. (please see the kref_get_unless_zero documentation examples). When drm_vma_offset_remove() is called, the kref value is unconditionally zero, and that should block any successful lookup. If we drop vm_lock without modifying TTM, this will not work. Even kref_get_unless_zero() needs some guarantee that the object is still valid. Assume this scenario: drm_vma_offset_lookup() returns a valid node, however, before we call kref_get_*(), another thread destroys the last reference to the bo so it gets kfree()d. kref_get_unless_zero() will thus reference memory which can be _anything_ and is not guarantee to stay 0. (Documentation/kref.txt mentions this, too, and I guess it was you who wrote that). I cannot see any locking around the mmap call that could guarantee this except vm_lock. You're right. My apologies. Parental leave has taken its toll. Otherwise, if the vma offset manager always needs external locking to make lookup + referencing atomic, then perhaps locking should be completely left to the caller? I would actually prefer to keep it in the VMA manager. It allows some fast-paths which otherwise would need special semantics for the caller (for instance see the access-management patches which are based on this patchset or the reduced locking during setup/release). We'd also require the caller to use rwsems for good performance, which is not the case for gem. So how about Daniel's proposal to add an _unlocked() version and provide _lock_lookup() and _unlock_lookup() helpers which just wrap the read_lock() and read_unlock? Thanks! David I think that if there are good reasons to keep locking internal, I'm fine with that, (And also, of course, with Daniel's proposal). Currently the add / remove / lookup paths are mostly used by TTM during object creation and destruction. However, if the lookup path is ever used by pread / pwrite, that situation might change and we would then like to minimize the locking. I tried to keep the change as minimal as I could. Follow-up patches are welcome. I just thought pushing the lock into drm_vma_* would simplify things. If there are benchmarks that prove me wrong, I'll gladly spend some time optimizing that. Apart from this, I'd like to see someone ack the ttm_buffer_object_transfer() changes. I am not very confident with that. Everything else should be trivial. Thanks David Looks good to me, the transfer object must have an empty vm_node/vma_node as we only interested in tying the system ram or vram to the ghost object so that delayed delete the vram or system ram but the original buffer is still valid and thus its original vm_node/vma_node must not be free. Cheers, Jerome ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel