Roland,

Yishai Hadas from Mellanox noted that commit db5a7a65c05 ("mlx4_core: Scale size of MTT table with system RAM") seems to introduce a bug: if request->num_mtt becomes 2^25 or higher as a result of this hunk,

+       /*
+        * We want to scale the number of MTTs with the size of the
+        * system memory, since it makes sense to register a lot of
+        * memory on a system with a lot of memory.  As a heuristic,
+        * make sure we have enough MTTs to cover twice the system
+        * memory (with PAGE_SIZE entries).
+        *
+        * This number has to be a power of two and fit into 32 bits
+        * due to device limitations, so cap this at 2^31 as well.
+        * That limits us to 8TB of memory registration per HCA with
+        * 4KB pages, which is probably OK for the next few months.
+        */
+       si_meminfo(&si);
+       request->num_mtt =
+               roundup_pow_of_two(max_t(unsigned, request->num_mtt,
+                                        min(1UL << 31,
+                                            si.totalram >> (log_mtts_per_seg - 1))));
+


we somehow end up in a situation where mlx4_buddy_init needs to allocate more than 128KB using kmalloc, which is impossible. He suggested replacing kmalloc/kfree with vmalloc/vfree (see below). What's your thinking here? Should we go to get_free_pages? Limit by 2^25? Something else?
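
To put rough numbers on this, here is a minimal userspace sketch (not driver code) of the two computations involved: the scaling heuristic from the commit and the per-level bitmap size that mlx4_buddy_init requests. The helper names scaled_num_mtt, roundup_pow2 and buddy_bitmap_bytes are made up for illustration. How num_mtt maps to the max_order passed to mlx4_buddy_init is not shown in this mail, so max_order is simply taken as an input here.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel macros of the same name. */
#define BITS_PER_LONG    (sizeof(long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Hypothetical helper: smallest power of two >= v (v >= 1). */
static uint64_t roundup_pow2(uint64_t v)
{
	uint64_t p = 1;

	while (p < v)
		p <<= 1;
	return p;
}

/*
 * Hypothetical helper mirroring the heuristic in the commit:
 * enough MTTs to cover twice the system memory, capped at 2^31,
 * never below the requested value, rounded up to a power of two.
 * totalram_pages plays the role of si.totalram.
 */
static uint64_t scaled_num_mtt(uint32_t num_mtt, uint64_t totalram_pages,
			       int log_mtts_per_seg)
{
	uint64_t cap = (uint64_t)1 << 31;
	uint64_t scaled = totalram_pages >> (log_mtts_per_seg - 1);
	uint64_t want = scaled < cap ? scaled : cap;

	if (want < num_mtt)
		want = num_mtt;
	return roundup_pow2(want);
}

/*
 * Hypothetical helper: bytes requested for the level-i bitmap,
 * mirroring "s * sizeof (long)" with
 * s = BITS_TO_LONGS(1 << (max_order - i)) from mlx4_buddy_init.
 */
static size_t buddy_bitmap_bytes(int max_order, int i)
{
	size_t bits = (size_t)1 << (max_order - i);

	return BITS_TO_LONGS(bits) * sizeof(long);
}
```

With these definitions, buddy_bitmap_bytes(20, 0) is exactly 128KB and buddy_bitmap_bytes(25, 0) is 4MB, so any max_order above 20 makes the level-0 allocation exceed 128KB. On the heuristic side, scaled_num_mtt(1 << 20, 1ULL << 27, 3) (2^27 pages, i.e. 512GB with 4KB pages, and log_mtts_per_seg assumed to be 3) gives 2^25.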


Or.

--- a/drivers/net/ethernet/mellanox/mlx4/mr.c
+++ b/drivers/net/ethernet/mellanox/mlx4/mr.c
@@ -129,7 +129,7 @@ static int mlx4_buddy_init(struct mlx4_buddy *buddy, int max_order)

        for (i = 0; i <= buddy->max_order; ++i) {
                s = BITS_TO_LONGS(1 << (buddy->max_order - i));
-               buddy->bits[i] = kmalloc(s * sizeof (long), GFP_KERNEL);
+               buddy->bits[i] = vmalloc(s * sizeof(long));
                if (!buddy->bits[i])
                        goto err_out_free;
                bitmap_zero(buddy->bits[i], 1 << (buddy->max_order - i));
@@ -142,7 +142,7 @@ static int mlx4_buddy_init(struct mlx4_buddy *buddy, int max_order)

 err_out_free:
        for (i = 0; i <= buddy->max_order; ++i)
-               kfree(buddy->bits[i]);
+               vfree(buddy->bits[i]);

 err_out:
        kfree(buddy->bits);
@@ -156,7 +156,7 @@ static void mlx4_buddy_cleanup(struct mlx4_buddy *buddy)
        int i;

        for (i = 0; i <= buddy->max_order; ++i)
-               kfree(buddy->bits[i]);
+               vfree(buddy->bits[i]);

        kfree(buddy->bits);
        kfree(buddy->num_free);
