Re: [PATCH v2 1/1] drm/mm: optimize rb_hole_addr rbtree search
On 5/2/2020 10:00 AM, Chris Wilson wrote: Quoting Nirmoy (2020-04-30 11:30:43) On 4/30/20 12:15 PM, Chris Wilson wrote: Quoting Nirmoy Das (2020-04-30 10:58:39) +void insert_hole_addr(struct rb_root *root, struct drm_mm_node *node) ^ static Ah I forgot! (sparse [make C=1] or make W=1 will spot this) Thanks for the tip :) Nirmoy Looks good, I'll check more carefully later. Reviewed-by: Chris Wilson If you do find some time to add some more tests, that would be great! Even converting your example into a test-drm_mm benchmark [spending a few seconds runtime max!] will be very useful. Thanks Chris for reviewing this. I already started looking into adding some fragmentation tests :) Regards, Nirmoy -Chris ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-develdata=02%7C01%7Cnirmoy.das%40amd.com%7Ce73eb7b87f8a4c4c180108d7ee77389c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637240068162932922sdata=%2FthV3slEerwVUYVwGzfiOb%2BtpboXQwYca%2BlAJ3L4Ad0%3Dreserved=0 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v2 1/1] drm/mm: optimize rb_hole_addr rbtree search
Quoting Nirmoy (2020-04-30 11:30:43) > > On 4/30/20 12:15 PM, Chris Wilson wrote: > > Quoting Nirmoy Das (2020-04-30 10:58:39) > >> +void insert_hole_addr(struct rb_root *root, struct drm_mm_node *node) > > ^ static > > > Ah I forgot! > > > > > (sparse [make C=1] or make W=1 will spot this) > > > Thanks for the tip :) > > Nirmoy > > > > > Looks good, I'll check more carefully later. Reviewed-by: Chris Wilson If you do find some time to add some more tests, that would be great! Even converting your example into a test-drm_mm benchmark [spending a few seconds runtime max!] will be very useful. -Chris ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v2 1/1] drm/mm: optimize rb_hole_addr rbtree search
Hi Nirmoy, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on drm-intel/for-linux-next] [also build test WARNING on drm-exynos/exynos-drm-next tegra-drm/drm/tegra/for-next drm-tip/drm-tip linus/master v5.7-rc3 next-20200501] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system. BTW, we also suggest to use '--base' option to specify the base tree in git format-patch, please see https://stackoverflow.com/a/37406982] url: https://github.com/0day-ci/linux/commits/Nirmoy-Das/drm-mm-optimize-rb_hole_addr-rbtree-search/20200501-165835 base: git://anongit.freedesktop.org/drm-intel for-linux-next reproduce: # apt-get install sparse # sparse version: v0.6.1-191-gc51a0382-dirty make ARCH=x86_64 allmodconfig make C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' If you fix the issue, kindly add following tag as appropriate Reported-by: kbuild test robot sparse warnings: (new ones prefixed by >>) >> drivers/gpu/drm/drm_mm.c:248:6: sparse: sparse: symbol 'insert_hole_addr' >> was not declared. Should it be static? drivers/gpu/drm/drm_mm.c:401:24: sparse: sparse: Using plain integer as NULL pointer drivers/gpu/drm/drm_mm.c:441:24: sparse: sparse: Using plain integer as NULL pointer Please review and possibly fold the followup patch. --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH v2 1/1] drm/mm: optimize rb_hole_addr rbtree search
Userspace can severely fragment rb_hole_addr rbtree by manipulating alignment while allocating buffers. Fragmented rb_hole_addr rbtree would result in large delays while allocating buffer object for a userspace application. It takes long time to find suitable hole because if we fail to find a suitable hole in the first attempt then we look for neighbouring nodes using rb_prev()/rb_next(). Traversing rbtree using rb_prev()/rb_next() can take really long time if the tree is fragmented. This patch improves searches in fragmented rb_hole_addr rbtree by modifying it to an augmented rbtree which will store an extra field in drm_mm_node, subtree_max_hole. Each drm_mm_node now stores maximum hole size for its subtree in drm_mm_node->subtree_max_hole. Using drm_mm_node->subtree_max_hole, it is possible to eliminate complete subtree if that subtree is unable to serve a request hence reducing number of rb_prev()/rb_next() used. With this patch applied, 1 million bo allocations on amdgpu took ~8 sec and without the patch the test code took 28 sec for only 50k bo allocs. partial test code: int test_fragmentation(void) { int i = 0; uint32_t minor_version; uint32_t major_version; struct amdgpu_bo_alloc_request request = {}; amdgpu_bo_handle vram_handle[MAX_ALLOC] = {}; amdgpu_device_handle device_handle; request.alloc_size = 4096; request.phys_alignment = 8192; request.preferred_heap = AMDGPU_GEM_DOMAIN_VRAM; int fd = open("/dev/dri/card0", O_RDWR | O_CLOEXEC); amdgpu_device_initialize(fd, _version, _version, _handle); for (i = 0; i < MAX_ALLOC; i++) { amdgpu_bo_alloc(device_handle, , _handle[i]); } for (i = 0; i < MAX_ALLOC; i++) amdgpu_bo_free(vram_handle[i]); return 0; } v2: Use RB_DECLARE_CALLBACKS_MAX to maintain subtree_max_hole Signed-off-by: Nirmoy Das --- drivers/gpu/drm/drm_mm.c | 133 +-- include/drm/drm_mm.h | 1 + 2 files changed, 115 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c index 8981abe8b7c9..effc6e5cac45 100644 --- a/drivers/gpu/drm/drm_mm.c +++ b/drivers/gpu/drm/drm_mm.c @@ -212,20 +212,6 @@ static void drm_mm_interval_tree_add_node(struct drm_mm_node *hole_node, _mm_interval_tree_augment); } -#define RB_INSERT(root, member, expr) do { \ - struct rb_node **link = _node, *rb = NULL; \ - u64 x = expr(node); \ - while (*link) { \ - rb = *link; \ - if (x < expr(rb_entry(rb, struct drm_mm_node, member))) \ - link = >rb_left; \ - else \ - link = >rb_right; \ - } \ - rb_link_node(>member, rb, link); \ - rb_insert_color(>member, ); \ -} while (0) - #define HOLE_SIZE(NODE) ((NODE)->hole_size) #define HOLE_ADDR(NODE) (__drm_mm_hole_node_start(NODE)) @@ -255,16 +241,42 @@ static void insert_hole_size(struct rb_root_cached *root, rb_insert_color_cached(>rb_hole_size, root, first); } +RB_DECLARE_CALLBACKS_MAX(static, augment_callbacks, +struct drm_mm_node, rb_hole_addr, +u64, subtree_max_hole, HOLE_SIZE) + +void insert_hole_addr(struct rb_root *root, struct drm_mm_node *node) +{ + struct rb_node **link = >rb_node, *rb_parent = NULL; + u64 start = HOLE_ADDR(node), subtree_max_hole = node->subtree_max_hole; + struct drm_mm_node *parent; + + while (*link) { + rb_parent = *link; + parent = rb_entry(rb_parent, struct drm_mm_node, rb_hole_addr); + if (parent->subtree_max_hole < subtree_max_hole) + parent->subtree_max_hole = subtree_max_hole; + if (start < HOLE_ADDR(parent)) + link = >rb_hole_addr.rb_left; + else + link = >rb_hole_addr.rb_right; + } + + rb_link_node(>rb_hole_addr, rb_parent, link); + rb_insert_augmented(>rb_hole_addr, root, _callbacks); +} + static void add_hole(struct drm_mm_node *node) { struct drm_mm *mm = node->mm; node->hole_size = __drm_mm_hole_node_end(node) - __drm_mm_hole_node_start(node); + node->subtree_max_hole = node->hole_size; DRM_MM_BUG_ON(!drm_mm_hole_follows(node)); insert_hole_size(>holes_size, node); - RB_INSERT(mm->holes_addr, rb_hole_addr, HOLE_ADDR); + insert_hole_addr(>holes_addr, node); list_add(>hole_stack, >hole_stack); } @@ -275,8 +287,10 @@ static void rm_hole(struct drm_mm_node *node) list_del(>hole_stack); rb_erase_cached(>rb_hole_size, >mm->holes_size); - rb_erase(>rb_hole_addr, >mm->holes_addr); + rb_erase_augmented(>rb_hole_addr, >mm->holes_addr, +
Re: [PATCH v2 1/1] drm/mm: optimize rb_hole_addr rbtree search
On 4/30/20 12:15 PM, Chris Wilson wrote: Quoting Nirmoy Das (2020-04-30 10:58:39) +void insert_hole_addr(struct rb_root *root, struct drm_mm_node *node) ^ static Ah I forgot! (sparse [make C=1] or make W=1 will spot this) Thanks for the tip :) Nirmoy Looks good, I'll check more carefully later. -Chris ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-develdata=02%7C01%7Cnirmoy.das%40amd.com%7Cfa3f0888c45546cdd9c308d7ecef60eb%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637238385206441661sdata=%2Fnn7QOOukYEcawU0bZ5WWjy99TVOpWouNPxlC8%2BW2FU%3Dreserved=0 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v2 1/1] drm/mm: optimize rb_hole_addr rbtree search
Quoting Nirmoy Das (2020-04-30 10:58:39) > +void insert_hole_addr(struct rb_root *root, struct drm_mm_node *node) ^ static (sparse [make C=1] or make W=1 will spot this) Looks good, I'll check more carefully later. -Chris ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel